
KR20150037659A - A method and an apparatus for encoding/decoding a multi-layer video signal - Google Patents

Info

Publication number
KR20150037659A
Authority
KR
South Korea
Prior art keywords
picture
layer
current
poc
prediction
Prior art date
Application number
KR20140130926A
Other languages
Korean (ko)
Inventor
이배근
김주영
Original Assignee
KT Corporation (주식회사 케이티)
Application filed by KT Corporation (주식회사 케이티)
Publication of KR20150037659A publication Critical patent/KR20150037659A/en

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/103: Selection of coding mode or of prediction mode
    • H04N19/105: Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/30: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
    • H04N19/34: Scalability techniques involving progressive bit-plane based encoding of the enhancement layer, e.g. fine granular scalability [FGS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A decoding method of a multi-layer video signal according to the present invention derives the POC of a current picture included in a current layer in consideration of the picture type of the current picture, generates a reference picture list for the current picture based on the POC of the current picture, and performs inter prediction of the current picture based on the reference picture list.

Description

METHOD AND APPARATUS FOR ENCODING/DECODING A MULTI-LAYER VIDEO SIGNAL

The present invention relates to a multi-layer video signal encoding / decoding method and apparatus.

Recently, demand for high-resolution, high-quality images such as high definition (HD) and ultra high definition (UHD) images is increasing in various applications. As image data becomes high-resolution and high-quality, the amount of data increases relative to existing image data, so transmitting the image data over a medium such as a wired/wireless broadband line or storing it on an existing storage medium raises transmission and storage costs. High-efficiency image compression techniques can be used to solve these problems.

Image compression techniques include inter-picture prediction, which predicts pixel values in the current picture from a previous or subsequent picture; intra-picture prediction, which predicts pixel values in the current picture using pixel information within the current picture; and entropy encoding, which assigns short codes to values with a high frequency of appearance and long codes to values with a low frequency of appearance. Image data can be effectively compressed, transmitted, and stored using such compression techniques.
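
To make the entropy-coding idea concrete, the following is a minimal, illustrative sketch (plain Huffman coding in Python, not the CAVLC/CABAC schemes an actual codec uses) showing how frequently occurring values receive shorter codewords:

```python
import heapq
from collections import Counter

def build_prefix_codes(symbols):
    """Build a Huffman prefix code: frequent symbols get shorter codewords."""
    freq = Counter(symbols)
    # Each heap entry: (frequency, tie_breaker, {symbol: code_so_far})
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        # Prepend one bit to every codeword in each merged subtree
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]

# The frequent residual value 0 ends up with the shortest code
codes = build_prefix_codes([0, 0, 0, 0, 1, 1, 2, 3])
assert len(codes[0]) <= len(codes[3])
```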

On the other hand, along with the growing demand for high-resolution images, demand for stereoscopic image content as a new image service is also increasing. Video compression techniques for effectively providing high-resolution and ultra-high-resolution stereoscopic content are under discussion.

It is an object of the present invention to provide a method and apparatus for deriving a picture output order in encoding / decoding a multi-layer video signal.

It is an object of the present invention to provide a method and apparatus for up-sampling a picture of a reference layer in encoding/decoding a multi-layer video signal.

An object of the present invention is to provide a method and apparatus for constructing a reference picture list using an interlayer reference picture in encoding / decoding a scalable video signal.

An object of the present invention is to provide a method and apparatus for effectively deriving texture information of a current layer through inter-layer prediction in encoding / decoding a scalable video signal.

A scalable video signal decoding method and apparatus according to the present invention derives the output order (POC) of a current picture belonging to a current layer, generates a reference picture list for the current picture based on the POC of the current picture, and performs inter prediction of the current picture based on the generated reference picture list.

In the scalable video signal decoding method and apparatus according to the present invention, in the step of deriving the POC of the current picture, the POC of the current picture is reset in consideration of the picture type of at least one of the current picture and the corresponding picture of the reference layer.

In the scalable video signal decoding method and apparatus according to the present invention, if the picture type of the corresponding picture of the reference layer is a random access picture and the picture type of the current picture of the current layer is not a random access picture, the POC of the current picture is reset to zero.

In the scalable video signal decoding method and apparatus according to the present invention, the resetting is performed based on a POC reset indicator.

In the scalable video signal decoding method and apparatus according to the present invention, the POC reset indicator indicates whether the POC of the current picture is reset.

In a scalable video signal decoding method and apparatus according to the present invention, the POC reset indicator is obtained based on a random access alignment flag.

In the scalable video signal decoding method and apparatus according to the present invention, the random access alignment flag indicates that if a picture of the reference layer is a random access picture, the picture of the current layer is also a random access picture, and that if a picture of the reference layer is not a random access picture, the picture of the current layer is not a random access picture either.

A scalable video signal encoding method and apparatus according to the present invention derives the output order (POC) of a current picture belonging to a current layer, generates a reference picture list for the current picture based on the POC of the current picture, and performs inter prediction of the current picture based on the generated reference picture list.

In the scalable video signal encoding method and apparatus according to the present invention, in the step of deriving the POC of the current picture, the POC of the current picture is reset in consideration of the picture type of at least one of the current picture and the corresponding picture of the reference layer.

In the scalable video signal encoding method and apparatus according to the present invention, if the picture type of the corresponding picture of the reference layer is a random access picture and the picture type of the current picture of the current layer is not a random access picture, the POC of the current picture is reset to zero.

In the scalable video signal encoding method and apparatus according to the present invention, the resetting is performed based on the POC reset indicator.

In the scalable video signal encoding method and apparatus according to the present invention, the POC reset indicator indicates whether the POC of the current picture is reset.

In a scalable video signal encoding method and apparatus according to the present invention, the POC reset indicator is obtained based on a random access alignment flag.

In the scalable video signal encoding method and apparatus according to the present invention, the random access alignment flag indicates that if a picture of the reference layer is a random access picture, the picture of the current layer is also a random access picture, and that if a picture of the reference layer is not a random access picture, the picture of the current layer is not a random access picture either.

According to the present invention, it is possible to effectively derive the output order of the picture of the current layer.

According to the present invention, a picture of a reference layer can be effectively upsampled.

According to the present invention, it is possible to effectively construct a reference picture list including an interlayer reference picture.

According to the present invention, texture information of a current layer can be effectively derived through inter-layer prediction.

1 is a block diagram schematically illustrating an encoding apparatus according to an embodiment of the present invention.
2 is a block diagram schematically illustrating a decoding apparatus according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating a process of inter-prediction of a current layer using a corresponding picture of a reference layer, to which the present invention is applied.
FIG. 4 illustrates a case where the POC of a picture is reset according to a picture type, to which the present invention is applied.
FIG. 5 shows a syntax of the POC reset indicator (poc_reset_flag) according to an embodiment to which the present invention is applied.
FIG. 6 shows a syntax for obtaining a POC reset indicator based on a random access sort flag (cross_layer_irap_aligned_flag), according to an embodiment to which the present invention is applied.
FIG. 7 illustrates a method of specifying a short-term reference picture stored in a decoded picture buffer, according to an embodiment to which the present invention is applied.
FIG. 8 illustrates a method of specifying a long-term reference picture, according to an embodiment to which the present invention is applied.
FIG. 9 illustrates a method of constructing a reference picture list using short-term reference pictures and long-term reference pictures, according to an embodiment to which the present invention is applied.
10 to 13 illustrate a method of constructing a reference picture list in a multi-layer structure according to an embodiment to which the present invention is applied.
FIG. 14 is a flowchart illustrating a method of upsampling a corresponding picture of a reference layer to which the present invention is applied.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. The terms and words used in this specification and claims should not be construed as limited to their ordinary or dictionary meanings; rather, based on the principle that an inventor may appropriately define terms to best describe his or her invention, they should be interpreted with meanings and concepts consistent with the technical idea of the present invention. Therefore, the embodiments described in this specification and the configurations shown in the drawings are merely the most preferred embodiments of the present invention and do not represent all of the technical ideas of the present invention, and it should be understood that various equivalents and modifications capable of replacing them may exist.

When an element is referred to herein as being "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intervening elements may be present. In addition, the description of "including" a specific configuration in this specification does not exclude other configurations, and means that additional configurations may be included in the practice of the present invention or within the scope of its technical idea.

The terms first, second, etc. may be used to describe various configurations, but the configurations are not limited by these terms. The terms are used only to distinguish one configuration from another. For example, without departing from the scope of the present invention, a first configuration may be referred to as a second configuration, and similarly, a second configuration may be referred to as a first configuration.

In addition, the components shown in the embodiments of the present invention are shown independently to represent different characteristic functions; this does not mean that each component consists of separate hardware or a single software unit. That is, each component is listed as a separate component for convenience of explanation, and at least two components may be combined into one component, or one component may be divided into a plurality of components each performing a function. Embodiments in which components are integrated and embodiments in which components are separated are also included in the scope of the present invention as long as they do not depart from the essence of the present invention.

In addition, some components may not be essential components that perform the essential functions of the present invention, but optional components that merely improve performance. The present invention can be implemented with only the components essential to realizing its essence, excluding components used merely to improve performance, and a structure including only these essential components, excluding the optional performance-improving components, is also included in the scope of the present invention.

Coding and decoding of video that supports multiple layers (multi-layers) in a bitstream is referred to as scalable video coding. Since there is a strong correlation between the layers, redundant elements of the data can be removed and the coding performance of the image can be improved by performing prediction that exploits this correlation. Hereinafter, predicting the current layer using information of another layer is referred to as inter-layer prediction.

The plurality of layers may have different resolutions, where the resolution may refer to at least one of spatial resolution, temporal resolution, and image quality. Resampling such as up-sampling or down-sampling of a layer may be performed to adjust the resolution in the inter-layer prediction.

1 is a block diagram schematically illustrating an encoding apparatus according to an embodiment of the present invention.

The encoding apparatus 100 according to the present invention includes an encoding unit 100a for an upper layer and an encoding unit 100b for a lower layer.

The upper layer may be referred to as a current layer or an enhancement layer, and the lower layer may be referred to as an enhancement layer, a base layer, or a reference layer having a lower resolution than the upper layer. The upper layer and the lower layer may differ in at least one of spatial resolution, temporal resolution according to frame rate, and image quality according to color format or quantization size. Up-sampling or down-sampling of a layer may be performed when a resolution change is required to perform inter-layer prediction.

The encoding unit 100a of the upper layer includes a partitioning unit 110, a prediction unit 120, a transform unit 130, a quantization unit 140, a rearrangement unit 150, an entropy encoding unit 160, an inverse quantization unit 170, an inverse transform unit 180, a filter unit 190, and a memory 195.

The encoding unit 100b of the lower layer includes a partitioning unit 111, a prediction unit 125, a transform unit 131, a quantization unit 141, a rearrangement unit 151, an entropy encoding unit 161, an inverse quantization unit 171, an inverse transform unit 181, a filter unit 191, and a memory 196.

The encoding unit may be implemented by the image encoding methods described in the embodiments of the present invention, but operations in some components may be omitted to lower the complexity of the encoding apparatus or to enable fast real-time encoding. For example, rather than selecting the optimal intra coding method by exhaustively trying all intra prediction modes, the prediction unit may, for real-time encoding, restrict itself to a limited number of intra prediction modes and choose one of them as the final intra prediction mode. As another example, the types of prediction blocks used for intra or inter prediction may be used restrictively.

The unit of a block processed by the encoding apparatus may be a coding unit for performing encoding, a prediction unit for performing prediction, or a transform unit for performing transformation. These may be expressed as CU (Coding Unit), PU (Prediction Unit), and TU (Transform Unit), respectively.

In the partitioning units 110 and 111, a layer picture is partitioned into multiple combinations of coding blocks, prediction blocks, and transform blocks, and one combination of a coding block, a prediction block, and a transform block can be selected according to a predetermined criterion (for example, a cost function) to partition the layer picture. For example, a recursive tree structure such as a quad-tree structure can be used to partition the layer picture into coding units. Hereinafter, in the embodiments of the present invention, a coding block may mean not only a block to be encoded but also a block to be decoded.

A prediction block may be a unit on which prediction such as intra prediction or inter prediction is performed. A block for intra prediction may be a square block such as 2Nx2N or NxN. A block for inter prediction may be square such as 2Nx2N or NxN, rectangular such as 2NxN or Nx2N, or an asymmetric shape obtained by the AMP (Asymmetric Motion Partitioning) prediction block partitioning method. The transform method applied by the transform units 130 and 131 may vary depending on the type of prediction block.

The prediction units 120 and 125 of the encoding units 100a and 100b include intra prediction units 121 and 126 for performing intra prediction and inter prediction units 122 and 127 for performing inter prediction. The prediction unit 120 of the upper-layer encoding unit 100a may further include an inter-layer prediction unit 123 that performs prediction on the upper layer using information of the lower layer.

The prediction units 120 and 125 can determine whether to use inter prediction or intra prediction for a prediction block. When intra prediction is performed, the intra prediction mode is determined in units of prediction blocks, and the intra prediction itself may be performed in units of transform blocks based on the determined mode. The residual value (residual block) between the generated prediction block and the original block can be input to the transform units 130 and 131. In addition, the prediction mode information, motion information, and the like used for prediction can be encoded by the entropy encoding units 160 and 161 and transmitted to the decoding apparatus together with the residual value.

When the PCM (Pulse Coded Modulation) coding mode is used, the original block may be encoded as-is and transmitted to the decoding apparatus without performing prediction through the prediction units 120 and 125.

The intra prediction units 121 and 126 can generate a prediction block based on reference pixels existing around the current block (the block to be predicted). In intra prediction, the intra prediction mode may be a directional prediction mode, which uses reference pixels according to a prediction direction, or a non-directional mode, which does not consider a prediction direction. The mode for predicting luma information and the mode for predicting chroma information may be of different types, and intra prediction mode information obtained by predicting luma information, or the predicted luma information itself, can be used to predict chroma information. If a reference pixel is unavailable, the unavailable reference pixel can be replaced with another pixel, and a prediction block can be generated using it.

A prediction block may include a plurality of transform blocks. If the size of the prediction block and the size of the transform block are the same when intra prediction is performed, intra prediction of the prediction block can be performed based on a pixel on the left side, a pixel on the upper-left side, and a pixel on the upper side of the prediction block. However, if the size of the prediction block differs from the size of the transform block, so that a plurality of transform blocks are included in the prediction block, intra prediction can be performed using neighboring pixels adjacent to each transform block as reference pixels. Here, the neighboring pixels adjacent to a transform block may include at least one of the neighboring pixels adjacent to the prediction block and pixels already decoded within the prediction block.

The intra prediction method can generate a prediction block after applying a mode-dependent intra smoothing (MDIS) filter to the reference pixels according to the intra prediction mode. The type of MDIS filter applied to the reference pixels may vary. The MDIS filter is an additional filter applied to an intra-predicted block after performing intra prediction, and can be used to reduce the residual between the reference pixels and the intra-predicted block. In performing MDIS filtering, the filtering of the reference pixels and of some columns included in the intra-predicted block can be performed according to the direction of the intra prediction mode.

The inter prediction units 122 and 127 can perform prediction by referring to information of a block included in at least one of the pictures preceding or following the current picture. The inter prediction units 122 and 127 may include a reference picture interpolation unit, a motion prediction unit, and a motion compensation unit.

The reference picture interpolation unit receives reference picture information from the memories 195 and 196 and can generate pixel information at less-than-integer precision from the reference picture. For luma pixels, a DCT-based 8-tap interpolation filter with varying filter coefficients can be used to generate pixel information at less-than-integer precision in units of 1/4 pixel. For chroma signals, a DCT-based 4-tap interpolation filter with varying filter coefficients can be used to generate pixel information at less-than-integer precision in units of 1/8 pixel.
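
As a rough illustration of such sub-pel interpolation, the sketch below computes a half-pel sample from a 1-D row of luma samples. The tap values are an assumption borrowed from HEVC's half-pel luma filter, since the text specifies only that the filters are DCT-based:

```python
def half_pel_sample(row, x):
    """Interpolate the half-pel sample between row[x] and row[x + 1] with an
    8-tap filter. The taps are an assumption (HEVC's half-pel luma filter);
    the patent text specifies only a DCT-based 8-tap filter for luma."""
    taps = (-1, 4, -11, 40, 40, -11, 4, -1)
    n = len(row)
    # Clamp indices at the picture border, as interpolators commonly do
    acc = sum(t * row[min(max(x - 3 + i, 0), n - 1)] for i, t in enumerate(taps))
    return (acc + 32) >> 6  # tap sum is 64; round and normalize
```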

The inter prediction units 122 and 127 can perform motion prediction based on the reference picture interpolated by the reference picture interpolation unit. Various methods such as FBMA (Full search-based Block Matching Algorithm), TSS (Three Step Search), and NTS (New Three-Step Search Algorithm) can be used to calculate the motion vector. The motion vector may have a value in units of 1/2 or 1/4 pixel based on the interpolated pixels. The inter prediction units 122 and 127 can perform prediction on the current block by applying one of various inter prediction methods.

As the inter-picture prediction method, various methods such as a skip method, a merge method, and a method using a motion vector predictor (MVP) can be used.

In inter prediction, motion information such as the reference index, motion vector, and residual signal is entropy-encoded and passed to the decoding apparatus. When the skip mode is applied, no residual signal is generated, so the transform and quantization processes for the residual signal may be omitted.

The inter-layer predicting unit 123 performs inter-layer prediction for predicting an upper layer using information of the lower layer. The inter-layer predicting unit 123 may perform inter-layer prediction using texture information and motion information of a lower layer.

Inter-layer prediction can predict a current block of an upper layer using motion information on a picture of a lower layer (reference layer) using a picture of a lower layer as a reference picture. A picture of a reference layer used as a reference picture in inter-layer prediction may be a picture sampled according to the resolution of the current layer. In addition, the motion information may include a motion vector and a reference index. At this time, the value of the motion vector for the picture of the reference layer can be set to (0, 0).

As an example of inter-layer prediction, a prediction method of using a picture of a lower layer as a reference picture has been described, but the present invention is not limited to this. The inter-layer predicting unit 123 may perform inter-layer texture prediction, inter-layer motion prediction, inter-layer syntax prediction, inter-layer difference prediction, and the like.

Inter-layer texture prediction can derive the texture of the current layer based on the texture of the reference layer. The texture of the reference layer can be sampled according to the resolution of the current layer, and the inter-layer predicting unit 123 can predict the texture of the current layer based on the texture of the sampled reference layer.

The inter-layer motion prediction can derive the motion vector of the current layer based on the motion vector of the reference layer. At this time, the motion vector of the reference layer can be scaled according to the resolution of the current layer. In the inter-layer syntax prediction, the syntax of the current layer can be predicted based on the syntax of the reference layer. For example, the inter-layer predicting unit 123 may use the syntax of the reference layer as the syntax of the current layer. In the inter-layer difference prediction, the picture of the current layer can be restored by using the difference between the restored image of the reference layer and the restored image of the current layer.
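
The scaling of a reference-layer motion vector to the current layer's resolution can be pictured with a minimal sketch. The proportional formula below is an assumption (real codecs use fixed-point arithmetic) and the function name is hypothetical:

```python
def scale_inter_layer_mv(mv_ref, ref_size, cur_size):
    """Scale a reference-layer motion vector to the current layer's resolution.

    mv_ref: (mvx, mvy) in the reference layer
    ref_size / cur_size: (width, height) of the reference / current layer
    """
    mvx, mvy = mv_ref
    return (mvx * cur_size[0] // ref_size[0],
            mvy * cur_size[1] // ref_size[1])

# 2x spatial scalability: a (3, -2) vector becomes (6, -4)
assert scale_inter_layer_mv((3, -2), (960, 540), (1920, 1080)) == (6, -4)
```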

A residual block containing residual information, which is the difference between the prediction block generated by the prediction units 120 and 125 and the original block, is generated, and the residual block is input to the transform units 130 and 131.

The transform units 130 and 131 can transform the residual block using a transform method such as DCT (Discrete Cosine Transform) or DST (Discrete Sine Transform). Whether to apply DCT or DST to transform the residual block can be determined based on the intra prediction mode information and the size information of the prediction block used to generate the residual block. That is, the transform units 130 and 131 can apply different transform methods depending on the size of the prediction block and the prediction method.

The quantization units 140 and 141 can quantize the values transformed into the frequency domain by the transform units 130 and 131. The quantization coefficient may vary depending on the block or the importance of the image. The values calculated by the quantization units 140 and 141 can be provided to the inverse quantization units 170 and 171 and the rearrangement units 150 and 151.

The rearrangement units 150 and 151 can rearrange the coefficient values of the quantized residual values. The rearrangement units 150 and 151 can change two-dimensional block-form coefficients into one-dimensional vector form through a coefficient scanning method. For example, the rearrangement units 150 and 151 can scan from the DC coefficient to coefficients in the high-frequency region using a zig-zag scan and change them into one-dimensional vector form. Depending on the size of the transform block and the intra prediction mode, a vertical scan that scans two-dimensional block-form coefficients in the column direction or a horizontal scan that scans them in the row direction may be used instead of the zig-zag scan. That is, which of the zig-zag scan, the vertical scan, and the horizontal scan is used can be determined according to the size of the transform block and the intra prediction mode.
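
The three scan patterns can be illustrated as follows. This is a simplified sketch in which the scan choice is passed in directly, whereas the text derives it from the transform-block size and the intra prediction mode:

```python
def scan_coefficients(block, scan_type):
    """Flatten a square 2-D coefficient block into a 1-D list by scan order."""
    n = len(block)
    if scan_type == "horizontal":          # row by row
        order = [(r, c) for r in range(n) for c in range(n)]
    elif scan_type == "vertical":          # column by column
        order = [(r, c) for c in range(n) for r in range(n)]
    else:                                  # zig-zag over anti-diagonals
        order = []
        for d in range(2 * n - 1):
            diag = [(r, d - r) for r in range(n) if 0 <= d - r < n]
            order.extend(diag if d % 2 else diag[::-1])
    return [block[r][c] for r, c in order]

block = [[9, 3], [2, 0]]
print(scan_coefficients(block, "zigzag"))  # DC coefficient first: [9, 3, 2, 0]
```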

The entropy encoding units 160 and 161 can perform entropy encoding based on the values calculated by the rearrangement units 150 and 151. Various encoding methods such as Exponential Golomb, CAVLC (Context-Adaptive Variable Length Coding), and CABAC (Context-Adaptive Binary Arithmetic Coding) can be used for entropy encoding.

The entropy encoding units 160 and 161 receive various information from the rearrangement units 150 and 151 and the prediction units 120 and 125, such as residual coefficient information and block type information of the coding block, prediction mode information, partitioning unit information, prediction block information and transmission unit information, motion information, reference frame information, block interpolation information, and filtering information, and can perform entropy encoding based on a predetermined encoding method. They can also entropy-encode the coefficient values of the coding unit input from the rearrangement units 150 and 151.

The entropy encoding units 160 and 161 can encode the intra prediction mode information of the current block by binarizing the intra prediction mode information. The entropy encoding units 160 and 161 may include a codeword mapping unit for performing this binarization operation, and the binarization may be performed differently depending on the size of the prediction block on which intra prediction is performed. In the codeword mapping unit, a codeword mapping table may be generated adaptively through the binarization operation or stored in advance. In another embodiment, the entropy encoding units 160 and 161 may express the current intra prediction mode information using a codeNum mapping unit that performs codeNum mapping and a codeword mapping unit that performs codeword mapping, and the codeNum mapping unit and the codeword mapping unit may respectively generate or store a codeNum mapping table and a codeword mapping table.

The inverse quantization units 170 and 171 and the inverse transform units 180 and 181 inversely quantize the values quantized by the quantization units 140 and 141 and inversely transform the values transformed by the transform units 130 and 131. The residual values generated by the inverse quantization units 170 and 171 and the inverse transform units 180 and 181 can be combined with the prediction block predicted through the motion estimation unit, motion compensation unit, and intra prediction unit included in the prediction units 120 and 125 to generate a reconstructed block.

The filter units 190 and 191 may include at least one of a deblocking filter and an offset correcting unit.

The deblocking filter can remove block distortion caused by boundaries between blocks in the reconstructed picture. Whether to apply the deblocking filter to the current block can be determined based on the pixels included in a few columns or rows of the block. When a deblocking filter is applied to a block, a strong filter or a weak filter can be applied according to the required deblocking filtering strength. In applying the deblocking filter, horizontal filtering and vertical filtering can be processed in parallel.

The offset correction unit can correct the offset between the deblocked image and the original image in units of pixels. To perform offset correction for a specific picture, a method of dividing the pixels included in the image into a certain number of regions, determining the region to which an offset is to be applied, and applying the offset to that region, or a method of applying an offset in consideration of the edge information of each pixel, can be used.

The filter units 190 and 191 may apply only the deblocking filter, rather than applying both the deblocking filter and the offset correction, or may apply both the deblocking filter and the offset correction.

The memories 195 and 196 can store the reconstructed block or picture calculated through the filter units 190 and 191, and the stored reconstructed block or picture can be provided to the prediction units 120 and 125.

The information output from the encoding unit 100b of the lower layer and the information output from the encoding unit 100a of the upper layer can be multiplexed by the MUX 197 and output as a bitstream.

The MUX 197 may be included in the encoding unit 100a of the upper layer or the encoding unit 100b of the lower layer, or may be implemented as an independent device or module separate from the encoding apparatus 100.

2 is a block diagram schematically illustrating a decoding apparatus according to an embodiment of the present invention.

As shown in FIG. 2, the decoding apparatus 200 includes a decoding unit 200a of an upper layer and a decoding unit 200b of a lower layer.

The decoding unit 200a of the upper layer includes an entropy decoding unit 210, a rearrangement unit 220, an inverse quantization unit 230, an inverse transform unit 240, a prediction unit 250, a filter unit 260, and a memory 270.

The decoding unit 200b of the lower layer includes an entropy decoding unit 211, a rearrangement unit 221, an inverse quantization unit 231, an inverse transform unit 241, a prediction unit 251, a filter unit 261, and a memory 271.

When a bitstream including a plurality of layers is transmitted from the encoding apparatus, the DEMUX 280 demultiplexes information for each layer and transmits the demultiplexed information to the decoding units 200a and 200b for the respective layers. The input bitstream can be decoded in a procedure opposite to that of the encoding apparatus.

The entropy decoding units 210 and 211 can perform entropy decoding in the reverse of the procedure in which entropy encoding was performed in the entropy encoding unit of the encoding apparatus. Among the information decoded by the entropy decoding units 210 and 211, information for generating a prediction block is provided to the prediction units 250 and 251, and the residual values obtained through entropy decoding can be input to the rearrangement units 220 and 221.

As with the entropy encoding units 160 and 161, the entropy decoding units 210 and 211 may use at least one of CABAC and CAVLC.

The entropy decoding units 210 and 211 can decode information related to the intra prediction and inter prediction performed by the encoding apparatus. The entropy decoding units 210 and 211 may include a codeword mapping unit having a codeword mapping table for generating an intra prediction mode number from a received codeword. The codeword mapping table may be stored in advance or generated adaptively. When a codeNum mapping table is used, a codeNum mapping unit for performing codeNum mapping may additionally be provided.

The rearrangement units 220 and 221 can rearrange the bitstream entropy-decoded by the entropy decoding units 210 and 211 based on the rearrangement method used by the encoding unit. The coefficients expressed in one-dimensional vector form can be reconstructed and rearranged into coefficients in two-dimensional block form. The rearrangement units 220 and 221 receive information related to the coefficient scanning performed by the encoding unit and can perform rearrangement through a reverse scanning method based on the scanning order performed by the encoding unit.

The inverse quantization units 230 and 231 may perform inverse quantization based on the quantization parameters provided by the encoding apparatus and the coefficient values of the re-arranged blocks.

The inverse transform units 240 and 241 can perform inverse DCT or inverse DST, corresponding to the DCT or DST performed by the transform units 130 and 131, on the quantization result produced by the encoding apparatus. The inverse transform can be performed based on the transmission unit determined by the encoding apparatus. In the transform unit of the encoding apparatus, DCT and DST can be selectively performed according to a plurality of pieces of information such as the prediction method, the size of the current block, and the prediction direction, and the inverse transform units 240 and 241 of the decoding apparatus can perform the inverse transform based on the transform information used by the transform unit of the encoding apparatus. The transform may be performed on a coding block basis rather than a transform block basis.

The prediction units 250 and 251 can generate a prediction block based on the prediction-block generation information provided by the entropy decoding units 210 and 211 and the information of previously decoded blocks or pictures provided from the memories 270 and 271.

The prediction units 250 and 251 may include a prediction unit determination unit, an inter-frame prediction unit, and an intra-frame prediction unit.

The prediction unit determination unit receives various information input from the entropy decoding unit, such as prediction unit information, prediction mode information of the intra prediction method, and motion-prediction-related information of the inter prediction method, distinguishes the prediction block within the current coding block, and determines whether the prediction block performs inter prediction or intra prediction.

The inter prediction unit can perform inter prediction of the current prediction block based on information included in at least one of the pictures preceding or following the current picture, using the information necessary for inter prediction of the current prediction block provided by the encoding apparatus. To perform inter prediction, it can be determined, based on the coding block, whether the motion prediction method of the prediction block included in the coding block is the skip mode, the merge mode, or the AMVP mode using a motion vector predictor (MVP).

The intra prediction unit can generate a prediction block based on the reconstructed pixel information in the current picture. If the prediction block is one on which intra prediction is performed, intra prediction can be performed based on the intra prediction mode information of the prediction block provided by the encoding apparatus. The intra prediction unit may include an MDIS filter that performs filtering on the reference pixels of the current block, a reference pixel interpolation unit that interpolates the reference pixels to generate reference pixels in units of less than an integer pixel, and a DC filter that generates a prediction block through filtering when the prediction mode of the current block is the DC mode.

The predicting unit 250 of the upper layer decoding unit 200a may further include an inter-layer predicting unit for performing inter-layer prediction for predicting an upper layer using information of a lower layer.

The inter-layer prediction unit may perform inter-layer prediction using intra-picture prediction mode information, motion information, and the like.

Inter-layer prediction can predict a current block of an upper layer using motion information on a lower layer (reference layer) picture using a picture of a lower layer as a reference picture.

A picture of a reference layer used as a reference picture in inter-layer prediction may be a picture sampled according to the resolution of the current layer. In addition, the motion information may include a motion vector and a reference index. At this time, the value of the motion vector for the picture of the reference layer can be set to (0, 0).

As an example of inter-layer prediction, a prediction method that uses a picture of a lower layer as a reference picture has been described, but the present invention is not limited to this. The inter-layer prediction unit may also perform inter-layer texture prediction, inter-layer motion prediction, inter-layer syntax prediction, inter-layer difference prediction, and the like.

Inter-layer texture prediction can derive the texture of the current layer based on the texture of the reference layer. The texture of the reference layer can be sampled according to the resolution of the current layer, and the inter-layer prediction unit can predict the texture of the current layer based on the sampled texture. Inter-layer motion prediction can derive the motion vector of the current layer based on the motion vector of the reference layer; here, the motion vector of the reference layer can be scaled according to the resolution of the current layer. In inter-layer syntax prediction, the syntax of the current layer can be predicted based on the syntax of the reference layer; for example, the inter-layer prediction unit may use the syntax of the reference layer as the syntax of the current layer. In inter-layer difference prediction, the picture of the current layer can be reconstructed using the difference between the reconstructed image of the reference layer and the reconstructed image of the current layer.

The reconstructed block or picture may be provided to the filter units 260 and 261. The filter units 260 and 261 may include a deblocking filter and an offset correction unit.

Information on whether a deblocking filter was applied to the corresponding block or picture, and, if applied, whether a strong or weak filter was applied, can be provided by the encoding apparatus. The deblocking filter of the decoding apparatus receives the deblocking-filter-related information provided by the encoding apparatus and can perform deblocking filtering on the corresponding block.

The offset correction unit may perform offset correction on the reconstructed image based on the type of offset correction applied to the image and the offset value information during encoding.

The memories 270 and 271 can store the reconstructed picture or block to be used as a reference picture or a reference block, and can also output the reconstructed picture.

The encoding apparatus and the decoding apparatus may perform encoding/decoding on three or more layers rather than two. In that case, a plurality of encoding units for the upper layers and a plurality of decoding units for the upper layers may be provided, corresponding to the number of upper layers.

In SVC (Scalable Video Coding), which supports a multi-layer structure, there is an association between layers. By performing prediction using this association, redundant elements of the data can be removed and image coding performance can be improved.

Therefore, in predicting a picture (image) of a current layer (enhancement layer) to be encoded/decoded, not only inter prediction or intra prediction using the information of the current layer but also inter-layer prediction using the information of another layer can be performed.

In performing inter-layer prediction, a prediction sample of the current layer can be generated using a decoded picture of the reference layer, which is used for inter-layer prediction, as a reference picture.

At this time, since at least one of the spatial resolution, temporal resolution, and image quality may differ between the current layer and the reference layer (that is, due to the difference in scalability between the layers), the decoded picture of the reference layer can be resampled to fit the scalability of the current layer and then used as a reference picture for inter-layer prediction of the current layer. Resampling means up-sampling or down-sampling the samples of a reference-layer picture to match the picture size of the current layer.
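
The mapping of a current-layer sample position onto the reference layer during resampling can be sketched as below. The 16.16 fixed-point formulation and 1/16-pel phase are assumptions in the spirit of SHVC-style resampling, not a quotation of the patent's method:

```python
def map_to_reference_position(x_cur, cur_width, ref_width):
    """Map a current-layer luma column x_cur to a reference-layer position.

    Returns (integer sample position, 1/16-pel interpolation phase)."""
    # 16.16 fixed-point step from the current layer to the reference layer
    scale = (ref_width << 16) // cur_width
    pos = x_cur * scale            # 16.16 fixed-point reference position
    x_int = pos >> 16              # integer sample position
    phase = (pos >> 12) & 15       # 1/16-pel interpolation phase
    return x_int, phase

# Up-sampling 960 -> 1920: current column 5 maps to ref position 2.5 (phase 8)
assert map_to_reference_position(5, 1920, 960) == (2, 8)
```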

In this specification, a current layer refers to a layer on which encoding or decoding is currently performed, and may be an enhancement layer or an upper layer. A reference layer is a layer that the current layer refers to for interlayer prediction, and can be a base layer or a lower layer. A picture of a reference layer (i.e., a reference picture) used for inter-layer prediction of the current layer may be referred to as an inter-layer reference picture or a reference picture between layers.

FIG. 3 is a flowchart illustrating a process of inter-prediction of a current layer using a corresponding picture of a reference layer, to which the present invention is applied.

Referring to FIG. 3, the output order of the current picture belonging to the current layer can be derived (S300).

Hereinafter, a value indicating the output order of a picture is referred to as a POC (picture order count). The POC can be used as a variable for specifying the current picture when deriving motion information, as in motion vector prediction or the merge mode.

The POC of the current picture may be derived based on at least one of a most significant bit (MSB) of the POC extracted from the bitstream and a least significant bit (LSB) of the POC.
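
A minimal sketch of this MSB/LSB derivation, assuming the HEVC-style wrap-around rule, is shown below; variable names are illustrative:

```python
def derive_poc(poc_lsb, prev_poc_msb, prev_poc_lsb, max_poc_lsb=256):
    """Derive the POC of the current picture from its signaled LSB and the
    MSB/LSB of a previous picture, following the HEVC-style wrap rule."""
    if poc_lsb < prev_poc_lsb and prev_poc_lsb - poc_lsb >= max_poc_lsb // 2:
        poc_msb = prev_poc_msb + max_poc_lsb      # LSB wrapped forward
    elif poc_lsb > prev_poc_lsb and poc_lsb - prev_poc_lsb > max_poc_lsb // 2:
        poc_msb = prev_poc_msb - max_poc_lsb      # LSB wrapped backward
    else:
        poc_msb = prev_poc_msb
    return poc_msb + poc_lsb

# The LSB wrapped from 255 back to 1, so the MSB advances by max_poc_lsb
assert derive_poc(1, 0, 255) == 257
```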

In addition, the POC of the current picture may be reset in consideration of at least one picture type of the current picture and the corresponding picture of the reference layer. A method of resetting the POC of the current picture will be described in detail with reference to FIG. 4 to FIG.

Referring to FIG. 3, a reference picture list for the current picture can be generated based on the POC of the current picture derived in step S300 (S310).

Here, the reference picture list may include at least one of a temporal reference picture and an interlayer reference picture.

The temporal reference picture may mean a picture that belongs to the same layer as the current picture and is used for inter prediction of the current picture. The temporal reference picture may mean a picture with an output order (picture order count, POC) different from that of the current picture. A method of generating a reference picture list composed of temporal reference pictures will be described with reference to FIGS. 7 to 9.

On the other hand, when the current picture performs inter-layer prediction, the reference picture list may further include an inter-layer reference picture. That is, in a multi-layer structure (for example, scalable video coding, multi-view video coding), the current layer can use a picture of the same layer as well as a picture of another layer as a reference picture.

Specifically, the current picture of the current layer can use the corresponding picture belonging to the reference layer as an interlayer reference picture.

Here, the reference layer may mean the base layer or another enhancement layer having a lower resolution than the current layer. The reference layer can be identified by a reference layer identifier (RefPicLayerId). The reference layer identifier can be derived based on inter_layer_pred_layer_idc (hereinafter, the inter-layer indicator), a syntax element of the slice header. The inter-layer indicator may indicate the layer of the picture used for inter-layer prediction of the current picture.

The corresponding picture may be a picture located in the same time zone as the current picture of the current layer.

For example, the corresponding picture may be a picture having the same picture order count (POC) as the current picture of the current layer. The corresponding picture may belong to the same access unit (AU) as the current picture of the current layer. The corresponding picture may have the same temporal level identifier (TemporalID) as the current picture of the current layer. Here, the time level identifier may mean an identifier for specifying each of a plurality of scalably coded layers according to a temporal resolution.
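
The three criteria above can be collected into a small predicate; the dictionary field names are hypothetical:

```python
def is_corresponding_picture(cur_pic, cand_pic):
    """True if cand_pic of the reference layer satisfies the criteria above
    for being the corresponding picture of cur_pic: same POC, same access
    unit, and same temporal level identifier (TemporalID)."""
    return (cand_pic["poc"] == cur_pic["poc"]
            and cand_pic["access_unit"] == cur_pic["access_unit"]
            and cand_pic["temporal_id"] == cur_pic["temporal_id"])
```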

A picture obtained by up-sampling the corresponding picture of the reference layer may be used as an inter-layer reference picture of the current picture. A method of up-sampling the corresponding picture of the reference layer will be described with reference to FIG. 14.

On the other hand, the current picture of the current layer may use one inter-layer reference picture or a plurality of inter-layer reference pictures.

For example, the inter-layer reference picture may include at least one of a first inter-layer reference picture and a second inter-layer reference picture. The first inter-layer reference picture means a reference picture on which filtering of integer positions has been performed, and the second inter-layer reference picture means a reference picture on which filtering of integer positions has not been performed. Here, an integer position may mean an integer-unit pixel of the corresponding picture to be up-sampled. Alternatively, when interpolation in units of less than an integer pixel, that is, in units of 1/n pixel, is performed in the up-sampling process, n phases are generated, and an integer position may mean a position whose phase is 0. Filtering of an integer position can be performed using its surrounding integer positions, which may be located in the same row or the same column as the integer position currently being filtered. The surrounding integer positions may include a plurality of integer positions belonging to the same row or the same column, arranged sequentially in that row or column.
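
A minimal sketch of filtering an integer position from the surrounding integer positions in the same row is given below; the 3-tap kernel (1, 2, 1)/4 is an illustrative assumption, as the text does not specify the filter taps:

```python
def filter_integer_position(plane, r, c):
    """Filter the integer-position sample at (r, c) using the surrounding
    integer positions in the same row; border samples are clamped."""
    w = len(plane[0])
    left = plane[r][max(c - 1, 0)]
    right = plane[r][min(c + 1, w - 1)]
    return (left + 2 * plane[r][c] + right + 2) >> 2  # (1, 2, 1)/4 smoothing
```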

In addition, for selective use of the first and second inter-layer reference pictures, it is possible to select, on a picture basis, whether to use both the first inter-layer reference picture and the second inter-layer reference picture or to use only one of them. Furthermore, when only one of the two is used, which inter-layer reference picture to use can also be selected. To this end, the encoder can signal information specifying which inter-layer reference picture is used.

Alternatively, a reference index may be used for the selective use. Specifically, only the first inter-layer reference picture may be selected by the reference index in units of prediction blocks, only the second inter-layer reference picture may be selected, or both the first and second inter-layer reference pictures may be selected.

When an interlayer reference picture is added to the reference picture list, it is necessary to change the number of reference pictures arranged in the reference picture list or the range of the number of reference indices allocated for each reference picture.

Here, it is assumed that the range of the num_ref_idx_l0_active_minus1 and num_ref_idx_l1_active_minus1 syntax elements of the slice header, which indicate the maximum reference index value of the reference picture list for the base layer, is a value between 0 and 14.

When either the first inter-layer reference picture or the second inter-layer reference picture is used, the range of the num_ref_idx_l0_active_minus1 and num_ref_idx_l1_active_minus1 syntax elements, which indicate the maximum reference index value of the reference picture list for the current layer, can be defined as a value between 0 and 15. Alternatively, even when both the first and second inter-layer reference pictures are used, if the two inter-layer reference pictures are added to different reference picture lists, the range of num_ref_idx_l0_active_minus1 and num_ref_idx_l1_active_minus1 can likewise be defined as a value between 0 and 15.

For example, if the number of temporal reference pictures in reference picture list L0 is 15 and the first or second inter-layer reference picture is added to the reference picture list, a total of 16 reference pictures exist, and the value of num_ref_idx_l0_active_minus1 is 15.

Alternatively, when both the first inter-layer reference picture and the second inter-layer reference picture are used and the two inter-layer reference pictures are added to the same reference picture list, the range of the num_ref_idx_l0_active_minus1 and num_ref_idx_l1_active_minus1 syntax elements, which indicate the maximum reference index value of the reference picture list for the current layer, can be defined as a value between 0 and 16.

For example, if the number of temporal reference pictures in reference picture list L0 is 15 and both the first inter-layer reference picture and the second inter-layer reference picture are added to reference picture list L0, a total of 17 reference pictures exist, and the value of num_ref_idx_l0_active_minus1 is 16.
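
The two numeric examples can be checked with a short sketch; the helper name is hypothetical:

```python
def max_ref_idx(num_temporal_refs, num_inter_layer_refs):
    """num_ref_idx_lX_active_minus1 for a list holding the given counts of
    temporal and inter-layer reference pictures (illustrative helper)."""
    return num_temporal_refs + num_inter_layer_refs - 1

assert max_ref_idx(15, 1) == 15  # one inter-layer reference added to L0
assert max_ref_idx(15, 2) == 16  # both inter-layer references in one list
```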

As described above, an inter-layer reference picture can be used for inter-layer prediction of the current picture of the current layer, and a method of generating a reference picture list including an inter-layer reference picture will be described with reference to FIGS. 10 to 13.

Referring to FIG. 3, the inter prediction of the current picture may be performed based on the reference picture list generated in step S310 (S320).

Specifically, the reference picture corresponding to the reference index of the current block is selected from the reference picture list. The selected reference picture may be a temporal reference picture in the same layer as the current block, or an inter-layer reference picture up-sampled from the corresponding picture of the reference layer.

The reference block in the reference picture is specified based on the motion vector of the current block, and the reconstructed sample value or texture information of the specified reference block is used to predict the sample value or texture information of the current block. In this case, if the reference picture corresponding to the reference index of the current block is an inter-layer reference picture, the reference block may be the block at the same position as the current block. To this end, if the reference picture of the current block is an inter-layer reference picture, the motion vector of the current block may be set to (0, 0).
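
A sketch of this selection logic follows, with hypothetical structures and field names; the key point is that an inter-layer reference forces the motion vector to (0, 0), so the co-located block is fetched:

```python
def fetch_prediction(cur_block, ref_list, ref_idx, mv):
    """Fetch prediction samples for cur_block from the selected reference.

    For an inter-layer reference picture the motion vector is set to
    (0, 0), so the block co-located with cur_block in the up-sampled
    corresponding picture is used as the reference block."""
    ref_pic = ref_list[ref_idx]
    if ref_pic["is_inter_layer"]:
        mv = (0, 0)
    x, y = cur_block["x"] + mv[0], cur_block["y"] + mv[1]
    return [row[x:x + cur_block["w"]]
            for row in ref_pic["samples"][y:y + cur_block["h"]]]
```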

FIG. 4 illustrates a case where the POC of a picture is reset according to a picture type, to which the present invention is applied.

Referring to FIG. 4, suppose that the picture pic A of the reference layer belonging to access unit 1 (AU 1) is a random access picture, the picture pic B of the current layer is not a random access picture, and the POC value of the picture pic C of the reference layer belonging to access unit 2 (AU 2) differs from the POC value of the picture pic D of the current layer. In this case, if pic B and pic C are lost on the network, a problem may occur in which it cannot be determined whether pic A and pic D belong to different access units.

To solve this problem, the POC of the current picture can be reset by considering at least one picture type of the current picture and the corresponding picture of the reference layer.

Here, the picture type may include a random access picture (IRAP picture), a leading picture, a random access skipped leading picture (RASL picture), and a random access decodable leading picture (RADL picture).

Specifically, when a video is played back, playback may jump to a certain frame, skipping the frame currently being played, according to a user's request; this is called random access. Random access is possible at a picture that does not refer to any picture preceding it in decoding order, and such a picture is called a random access picture. A picture that follows the random access picture in decoding order but precedes it in output order is called a leading picture. Among the leading pictures, a picture that is not decoded during random access is called a random access skipped leading picture, while a picture that can be decoded during random access is called a random access decodable leading picture.

(1) Consider the picture type of the current picture

The POC of the current picture can be reset in consideration of the picture type of the current picture.

For example, if the picture type of the current picture is not a random access picture, the POC of the current picture can be reset.

More specifically, when the picture type of the current picture is not a random access picture, the POC of the current picture can be reset to zero.

Alternatively, when the picture type of the current picture is not a random access picture, a POC reset indicator (for example, poc_reset_flag) indicating whether or not to reset the POC of the current picture may be signaled, and the POC of the current picture may be reset based on the POC reset indicator. For example, if the value of the POC reset indicator is 1, the POC of the current picture can be reset to 0. The POC reset indicator may be obtained from at least one of a picture parameter set, a slice header, or a slice segment header.

Alternatively, if the picture type of the current picture is not a random access picture, a variable (pocresettingFlag) indicating whether or not to reset the POC of the current picture may be derived in the decoder, and the POC of the current picture may be reset according to the variable pocresettingFlag. For example, if the picture type of the current picture is not a random access picture, the variable pocresettingFlag may be derived as 1, in which case the POC of the current picture may be reset to 0.

As described above, when the POC of the current picture is reset, the POC of the corresponding picture of the reference layer can also be reset to be the same as the POC of the current picture.

(2) Consider whether or not the picture type of the current picture of the current layer matches the picture type of the corresponding picture of the reference layer

The POC of the current picture can be reset in consideration of whether the picture type of the current picture of the current layer is the same as that of the corresponding picture of the reference layer.

For example, if the picture type between the current picture of the current layer and the corresponding picture of the reference layer is different, the POC of the current picture can be reset.

Specifically, when the picture type of the corresponding picture of the reference layer is a random access picture and the picture type of the current picture of the current layer is not a random access picture, the POC of the current picture can be reset to zero.

Alternatively, when the picture type of the corresponding picture of the reference layer is a random access picture and the picture type of the current picture of the current layer is not a random access picture, a POC reset indicator (for example, poc_reset_flag) indicating whether or not to reset the POC of the current picture may be signaled, and the POC of the current picture may be reset based on the POC reset indicator. For example, if the value of the POC reset indicator is 1, the POC of the current picture can be reset to 0. The POC reset indicator may be obtained from at least one of a picture parameter set, a slice header, or a slice segment header.

Alternatively, when the picture type of the corresponding picture of the reference layer is a random access picture and the picture type of the current picture of the current layer is not a random access picture, a variable (pocresettingFlag) indicating whether or not to reset the POC of the current picture may be derived, and the POC of the current picture may be reset according to the variable pocresettingFlag. For example, if the picture type of the current picture is not a random access picture while that of the corresponding picture is, the variable pocresettingFlag may be derived as 1, in which case the POC of the current picture may be reset to 0.

As described above, when the POC of the current picture is reset, the POC of the corresponding picture of the reference layer can also be reset to be the same as the POC of the current picture.
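The two reset conditions of (1) and (2) can be summarized in the following sketch; the picture objects and their attributes are hypothetical, and pocresettingFlag corresponds to the decoder-side variable mentioned above.

```python
def derive_poc_resetting_flag(curr_is_irap, corr_is_irap, poc_reset_flag=None):
    """Return True when the POC of the current picture should be reset.

    curr_is_irap: whether the current picture is a random access picture
    corr_is_irap: whether the corresponding reference-layer picture is one
    poc_reset_flag: the signalled POC reset indicator, when present
    """
    if poc_reset_flag is not None:
        return poc_reset_flag == 1            # explicit signalling
    return corr_is_irap and not curr_is_irap  # implicit derivation, case (2)

def reset_poc(curr_pic, corr_pic, resetting_flag):
    if resetting_flag:
        curr_pic.poc = 0              # reset the POC of the current picture
        corr_pic.poc = curr_pic.poc   # align the corresponding picture's POC
```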

When the POC of the current picture is reset in consideration of the picture type as described above, the POC of pic A becomes 0 and the POC of pic D becomes 1 in FIG. 4, so that pic A and pic D can be identified as belonging to different access units even when pic B and pic C are lost.

FIG. 5 shows a syntax of the POC reset indicator (poc_reset_flag) according to an embodiment to which the present invention is applied.

Referring to FIG. 5, a POC reset indicator (poc_reset_flag) may be obtained from the bitstream (S500).

Here, the POC reset indicator may be identification information indicating whether or not the POC of the current picture is reset. The POC reset indicator may be extracted from the slice segment header area.

The POC reset indicator may be obtained in consideration of whether the current picture is a dependent slice segment. A dependent slice segment is a slice segment that does not carry its own slice segment header information; that is, a dependent slice segment reuses the slice segment header information of a preceding independent slice segment.

Referring to FIG. 5, the POC reset indicator may be obtained based on a dependent slice flag (dependent_slice_segment_flag). Here, the dependent slice flag may indicate whether the current slice is a dependent slice segment. The current slice is included in the current picture, and the current picture may be composed of one or more slices. For example, if the value of the dependent slice flag is 1, the current slice is a dependent slice segment; conversely, if the value of the dependent slice flag is 0, the current slice is an independent slice segment.

As shown in FIG. 5, the POC reset indicator may be obtained only if the value of the dependent slice flag for the current slice is zero (i.e., the current slice is not a dependent slice segment).

FIG. 6 shows a syntax for obtaining a POC reset indicator based on a random access alignment flag (cross_layer_irap_aligned_flag), according to an embodiment to which the present invention is applied.

The random access alignment flag (cross_layer_irap_aligned_flag) indicates whether random access pictures are aligned across layers, that is, whether the picture of the current layer belonging to the same access unit is also a random access picture whenever the picture of the reference layer is a random access picture.

For example, when the value of the random access alignment flag is 1, if the picture pic A of the reference layer belonging to access unit 1 is a random access picture, the picture pic B of the current layer belonging to access unit 1 must also be a random access picture.

For example, when the value of the random access alignment flag is 1, all the pictures belonging to access unit 1 (the picture pic A of the reference layer and the picture pic B of the current layer) may be IDR pictures.

Assuming that the POC of the picture pic C of the reference layer belonging to access unit 2 is different from the POC of the picture pic D of the current layer, even if pic B and pic C are lost on the network, it can be determined that pic A and pic D belong to different access units, because pic A is a random access picture while pic D is not.

Therefore, as shown in FIG. 6, the POC reset indicator can be obtained only when the value of the random access alignment flag is 1.

Further, as described above with reference to FIG. 5, the POC reset indicator can of course also be obtained based on the dependent slice flag (dependent_slice_segment_flag). That is, the POC reset indicator may be obtained only if the value of the dependent slice flag for the current picture is 0 (i.e., the current picture is not a dependent slice segment).
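Combining FIGS. 5 and 6, the parsing condition can be sketched as follows; bs.read_flag() is a hypothetical one-bit bitstream reader, not an API of any real library.

```python
def parse_poc_reset_flag(bs, dependent_slice_segment_flag,
                         cross_layer_irap_aligned_flag):
    """Read poc_reset_flag from the slice segment header only when the slice
    is an independent slice segment (FIG. 5) and random access pictures are
    aligned across layers (FIG. 6); otherwise infer it as 0."""
    poc_reset_flag = 0
    if dependent_slice_segment_flag == 0 and cross_layer_irap_aligned_flag == 1:
        poc_reset_flag = bs.read_flag()
    return poc_reset_flag
```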

FIG. 7 illustrates a method of specifying a near reference picture stored in a decoded picture buffer, according to an embodiment to which the present invention is applied.

A temporal reference picture may be stored in the decoded picture buffer (DPB) and used as a reference picture when needed for inter prediction of the current picture. The temporal reference pictures stored in the decoded picture buffer may include short-term reference pictures. A short-term (near) reference picture means a picture whose POC value does not differ greatly from that of the current picture.

Information specifying the near reference pictures to be stored in the decoded picture buffer at the current time consists of the output order (POC) of each reference picture and a flag indicating whether the current picture directly refers to it (for example, used_by_curr_pic_s0_flag, used_by_curr_pic_s1_flag); this is referred to as a reference picture set. Specifically, when the value of used_by_curr_pic_s0_flag[i] is 0 and the i-th near reference picture in the near reference picture set has a POC value smaller than the output order (POC) of the current picture, the i-th near reference picture is not used as a reference picture of the current picture. Likewise, when the value of used_by_curr_pic_s1_flag[i] is 0 and the i-th near reference picture in the near reference picture set has a POC value larger than the output order (POC) of the current picture, the i-th near reference picture is not used as a reference picture of the current picture.

Referring to FIG. 7, in the case of the picture having a POC value of 26, all three pictures (i.e., the pictures having POC values of 25, 24, and 20) can be used as near reference pictures for inter prediction. However, since the value of used_by_curr_pic_s0_flag of the picture having a POC value of 25 is 0, the picture having a POC value of 25 is not directly used for inter prediction of the picture having a POC value of 26.

Thus, a near reference picture can be specified based on the output order (POC) of the reference picture and a flag indicating whether it is used as a reference picture of the current picture.
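As an illustration of FIG. 7, the sketch below filters a hypothetical reference picture set, given as (POC, used_by_curr_pic flag) pairs, down to the near reference pictures the current picture may directly use.

```python
def select_active_near_refs(curr_poc, rps):
    """Split the near reference picture set into pictures before and after
    the current picture in output order, keeping only those whose
    used_by_curr_pic flag is set."""
    before = [poc for poc, used in rps if poc < curr_poc and used]
    after = [poc for poc, used in rps if poc > curr_poc and used]
    return before, after

# FIG. 7 example: for the picture with POC 26, the picture with POC 25 is
# excluded because its used_by_curr_pic_s0_flag is 0.
before, after = select_active_near_refs(26, [(25, False), (24, True), (20, True)])
assert before == [24, 20] and after == []
```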

On the other hand, a picture not included in the reference picture set for the current picture can be marked as not used as a reference picture (for example, "unused for reference") and further removed from the decoded picture buffer.

FIG. 8 illustrates a method of specifying a long-term reference picture, according to an embodiment to which the present invention is applied.

In the case of a long-term reference picture, since the difference between its POC value and that of the current picture is large, it can be expressed using the least significant bits (LSB) and most significant bits (MSB) of the POC value.

Therefore, the POC value of the long-term reference picture can be derived using the LSB of the POC value of the reference picture, the POC value of the current picture, and the difference between the MSB of the POC value of the current picture and the MSB of the POC value of the reference picture.

For example, assume that the POC value of the current picture is 331, the maximum value that can be represented by the LSB is 32, and a picture having a POC value of 308 is used as a long-term reference picture.

In this case, the POC value 331 of the current picture can be expressed as 32 * 10 + 11, where 10 is the MSB value and 11 is the LSB value. The POC value 308 of the long-term reference picture is expressed as 32 * 9 + 20, where 9 is the MSB value and 20 is the LSB value. The POC value of the long-term reference picture can then be derived as shown in the equation of FIG. 8.
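The equation of FIG. 8 is not reproduced in this text, but the derivation it describes can be sketched as follows, under the simplifying assumption that the long-term reference picture lies within one LSB cycle of the current picture.

```python
def long_term_poc(curr_poc, max_poc_lsb, lt_poc_lsb):
    """Derive a long-term reference picture's POC from its signalled LSB,
    the current picture's POC, and the MSB difference between the two."""
    curr_msb = curr_poc - (curr_poc % max_poc_lsb)  # MSB part: 320 for POC 331
    poc = curr_msb + lt_poc_lsb
    if poc > curr_poc:       # the reference picture precedes the current one,
        poc -= max_poc_lsb   # so step back one MSB cycle
    return poc

# FIG. 8 example: current POC 331 = 32*10 + 11, long-term LSB 20 -> POC 308
assert long_term_poc(331, 32, 20) == 308
```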

FIG. 9 illustrates a method of constructing a reference picture list using near reference pictures and long-term reference pictures, according to an embodiment to which the present invention is applied.

Referring to FIG. 9, a reference picture list including temporal reference pictures can be generated in consideration of whether each temporal reference picture is a near reference picture and of the POC values of the near reference pictures. Here, the reference picture list may include at least one of reference picture list 0 for L0 prediction and reference picture list 1 for L1 prediction.

Specifically, reference picture list 0 may be arranged in the order of a near reference picture having a POC value smaller than that of the current picture (RefPicSetCurr0), a near reference picture having a POC value larger than that of the current picture (RefPicSetCurr1), and a long-term reference picture (RefPicSetLtCurr).

On the other hand, reference picture list 1 may be arranged in the order of a near reference picture having a POC value larger than that of the current picture (RefPicSetCurr1), a near reference picture having a POC value smaller than that of the current picture (RefPicSetCurr0), and a long-term reference picture (RefPicSetLtCurr).
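A minimal sketch of this initialization order, with hypothetical list arguments standing in for RefPicSetCurr0, RefPicSetCurr1, and RefPicSetLtCurr:

```python
def build_initial_lists(ref_pic_set_curr0, ref_pic_set_curr1, ref_pic_set_lt_curr):
    """Construct reference picture lists 0 and 1 in the order described
    above for the single-layer (temporal-only) case."""
    list0 = ref_pic_set_curr0 + ref_pic_set_curr1 + ref_pic_set_lt_curr
    list1 = ref_pic_set_curr1 + ref_pic_set_curr0 + ref_pic_set_lt_curr
    return list0, list1
```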

In addition, the plurality of temporal reference pictures included in the reference picture list may be rearranged to improve the coding efficiency of the reference index of the temporal reference picture. This can be performed adaptively based on a list rearrangement flag (list_modification_present_flag). Here, the list rearrangement flag is information specifying whether or not the reference pictures in the reference picture list are rearranged. The list rearrangement flag can be signaled for reference picture list 0 and reference picture list 1, respectively.

For example, when the value of the list rearrangement flag (list_modification_present_flag) is 0, the reference pictures in the reference picture list are not rearranged; only when the value of the list rearrangement flag is 1 can the reference pictures in the reference picture list be rearranged.

If the value of the list rearrangement flag (list_modification_present_flag) is 1, the reference pictures in the reference picture list can be rearranged using the list entry information list_entry [i]. Here, the list entry information (list_entry [i]) can specify the reference index of the reference picture to be located at the current position (i.e., the i-th entry) in the reference picture list.

Specifically, the reference picture corresponding to the list entry information (list_entry [i]) can be specified in the pre-generated reference picture list, and the specified reference picture can be rearranged to the i-th entry in the reference picture list.

The list entry information can be obtained in accordance with the number of reference pictures included in the reference picture list, or the maximum reference index value of the reference picture list. Also, the list entry information can be obtained in consideration of the slice type of the current picture. That is, if the slice type of the current picture is a P slice, the list entry information list_entry_l0[i] for reference picture list 0 is obtained; if the slice type of the current picture is a B slice, the list entry information list_entry_l1[i] for reference picture list 1 is additionally obtained.
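The rearrangement itself amounts to indexing the initially generated list with the list entry information, as this sketch shows:

```python
def apply_list_modification(initial_list, list_entry):
    """Place, at each entry i of the final list, the reference picture found
    at index list_entry[i] of the initially generated list."""
    return [initial_list[list_entry[i]] for i in range(len(list_entry))]

# Example: move the picture at initial index 2 to the front of the list.
assert apply_list_modification(['A', 'B', 'C'], [2, 0, 1]) == ['C', 'A', 'B']
```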

FIGS. 10 to 13 illustrate a method of constructing a reference picture list in a multi-layer structure, according to an embodiment to which the present invention is applied.

Referring to FIG. 10, reference picture list 0 in the multi-layer structure may be arranged in the order of a near reference picture having a POC value smaller than that of the current picture (hereinafter referred to as a first near reference picture), a near reference picture having a POC value larger than that of the current picture (hereinafter referred to as a second near reference picture), and a long-term reference picture. Reference picture list 1 may be arranged in the order of the second near reference picture, the first near reference picture, and the long-term reference picture. The inter-layer reference picture can then be added after the long-term reference picture in both reference picture list 0 and reference picture list 1.

However, when the enhancement layer image is similar to the base layer image in a multi-layer structure, the enhancement layer may frequently use an inter-layer reference picture of the base layer. In this case, if the inter-layer reference picture is added to the end of the reference picture list, the coding performance of the reference picture list may be degraded. Therefore, as shown in FIGS. 11 to 13, the coding performance of the reference picture list can be improved by adding the inter-layer reference picture before the long-term reference picture.

Referring to FIG. 11, the inter-layer reference picture may be arranged between the near reference pictures in the reference picture list. Reference picture list 0 in the multi-layer structure may be arranged in the order of the first near reference picture, the inter-layer reference picture, the second near reference picture, and the long-term reference picture. Reference picture list 1 may be arranged in the order of the second near reference picture, the inter-layer reference picture, the first near reference picture, and the long-term reference picture.

Alternatively, the inter-layer reference picture may be arranged between the near reference pictures and the long-term reference picture in the reference picture list. Referring to FIG. 12, reference picture list 0 in the multi-layer structure may be arranged in the order of the first near reference picture, the second near reference picture, the inter-layer reference picture, and the long-term reference picture. Reference picture list 1 may be arranged in the order of the second near reference picture, the first near reference picture, the inter-layer reference picture, and the long-term reference picture.

Alternatively, the inter-layer reference picture may be arranged before the near reference pictures in the reference picture list. Referring to FIG. 13, reference picture list 0 in the multi-layer structure may be arranged in the order of the inter-layer reference picture, the first near reference picture, the second near reference picture, and the long-term reference picture. Reference picture list 1 may be arranged in the order of the inter-layer reference picture, the second near reference picture, the first near reference picture, and the long-term reference picture.
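The four orderings of FIGS. 10 to 13 differ only in where the inter-layer reference picture is inserted; the following sketch covers reference picture list 0, with a hypothetical `position` selector.

```python
def build_multilayer_list0(near_before, near_after, long_term, interlayer, position):
    """Construct reference picture list 0 with the inter-layer reference
    pictures inserted at one of the four positions of FIGS. 10 to 13."""
    if position == 'after_long_term':    # FIG. 10
        return near_before + near_after + long_term + interlayer
    if position == 'between_near':       # FIG. 11
        return near_before + interlayer + near_after + long_term
    if position == 'before_long_term':   # FIG. 12
        return near_before + near_after + interlayer + long_term
    if position == 'first':              # FIG. 13
        return interlayer + near_before + near_after + long_term
    raise ValueError(position)
```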

Meanwhile, although FIGS. 10 to 13 illustrate each reference picture list as containing a single near reference picture having a POC value smaller than that of the current picture, a single near reference picture having a POC value larger than that of the current picture, a single long-term reference picture, and a single inter-layer reference picture, it should be noted that a plurality of near reference pictures (i.e., a near reference picture set), a plurality of long-term reference pictures (i.e., a long-term reference picture set), and a plurality of inter-layer reference pictures (i.e., an inter-layer reference picture set) can be used.

In addition, when a plurality of inter-layer reference pictures are used, the plurality of inter-layer reference pictures may be divided into a first inter-layer reference picture set and a second inter-layer reference picture set to construct the reference picture list.

Specifically, the first inter-layer reference picture set may be arranged between the first near reference picture and the second near reference picture, and the second inter-layer reference picture set may be arranged after the long-term reference picture. However, the present invention is not limited thereto and may include all possible embodiments derivable from combinations of the embodiments shown in FIGS. 10 to 13.

Here, the first inter-layer reference picture set may refer to inter-layer reference pictures obtained by applying filtering to the integer sample positions, and the second inter-layer reference picture set may refer to inter-layer reference pictures obtained without applying filtering to the integer sample positions.

Alternatively, the first inter-layer reference picture set may refer to reference pictures of reference layers whose reference layer identifier (RefPiclayerId) is smaller than the layer identifier (CurrlayerId) of the current layer, and the second inter-layer reference picture set may refer to reference pictures of reference layers whose reference layer identifier (RefPiclayerId) is larger than the layer identifier (CurrlayerId) of the current layer.
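Under the layer-identifier criterion just described, the split and one possible list layout can be sketched as follows; the attribute name layer_id is hypothetical.

```python
def split_interlayer_refs(interlayer_refs, curr_layer_id):
    """Divide inter-layer reference pictures into a first set (reference
    layer id smaller than the current layer id) and a second set (larger)."""
    first = [p for p in interlayer_refs if p.layer_id < curr_layer_id]
    second = [p for p in interlayer_refs if p.layer_id > curr_layer_id]
    return first, second

def build_list0_with_two_sets(near_before, near_after, long_term, first, second):
    # First set between the near reference pictures, second set after the
    # long-term reference pictures, as in the embodiment described above.
    return near_before + first + near_after + long_term + second
```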

FIG. 14 is a flowchart illustrating a method of up-sampling a corresponding picture of a reference layer, according to an embodiment to which the present invention is applied.

Referring to FIG. 14, a reference sample position of a reference layer corresponding to a current sample position of a current layer may be derived (S1400).

Since the resolutions of the current layer and the reference layer may differ, the reference sample position corresponding to the current sample position can be derived taking the resolution difference into account; that is, the aspect ratio between a picture of the current layer and a picture of the reference layer can be considered. In addition, since the size of the up-sampled picture of the reference layer may not coincide with that of the picture of the current layer, an offset for correcting this mismatch may be required.

For example, the reference sample position may be derived taking into account the scale factor and the upsampled reference layer offset.

Here, the scale factor can be calculated based on the ratio of the width and the height between the current picture of the current layer and the corresponding picture of the reference layer.

The upsampled reference layer offset may mean position difference information between a sample located at an edge of the current picture and a sample located at an edge of the inter-layer reference picture. For example, the upsampled reference layer offset may include horizontal/vertical position difference information between the upper-left sample of the current picture and the upper-left sample of the inter-layer reference picture, and horizontal/vertical position difference information between the lower-right sample of the current picture and the lower-right sample of the inter-layer reference picture.

The upsampled reference layer offset may be obtained from the bitstream. For example, the upsampled reference layer offset may be obtained from at least one of a video parameter set, a sequence parameter set, a picture parameter set, and a slice header.
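Loosely following this idea (and not the normative fixed-point equations), the horizontal reference position and its 1/16 phase might be derived as below; the shift amounts are illustrative assumptions.

```python
def ref_sample_position_x(x_curr, cur_width, ref_width, offset_left=0):
    """Map a current-layer horizontal sample position to the reference layer
    with 1/16-sample accuracy, using a fixed-point scale factor and an
    up-sampled reference layer offset (the vertical direction is analogous)."""
    scale_x = ((ref_width << 16) + (cur_width >> 1)) // cur_width  # ~ratio * 2^16
    pos_1_16 = ((x_curr - offset_left) * scale_x + (1 << 11)) >> 12  # 1/16 units
    return pos_1_16 >> 4, pos_1_16 & 15  # integer reference position, phase p
```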

The filter coefficient of the up-sampling filter may be determined considering the phase of the reference sample position derived in step S1400 (S1410).

Here, either a fixed up-sampling filter or an adaptive up-sampling filter may be used as the up-sampling filter.

1. Fixed Upsampling Filter

The fixed up-sampling filter may refer to an up-sampling filter having predetermined filter coefficients, without taking the characteristics of the image into account. A tap filter can be used as the fixed up-sampling filter, and it can be defined for the luminance component and the chrominance component, respectively. Fixed up-sampling filters having an accuracy of 1/16 sample units will be described with reference to Tables 1 and 2 below.

Phase p | Interpolation filter coefficients
        | f[p,0]  f[p,1]  f[p,2]  f[p,3]  f[p,4]  f[p,5]  f[p,6]  f[p,7]
   0    |    0       0       0      64       0       0       0       0
   1    |    0       1      -3      63       4      -2       1       0
   2    |   -1       2      -5      62       8      -3       1       0
   3    |   -1       3      -8      60      13      -4       1       0
   4    |   -1       4     -10      58      17      -5       1       0
   5    |   -1       4     -11      52      26      -8       3      -1
   6    |   -1       3      -9      47      31     -10       4      -1
   7    |   -1       4     -11      45      34     -10       4      -1
   8    |   -1       4     -11      40      40     -11       4      -1
   9    |   -1       4     -10      34      45     -11       4      -1
  10    |   -1       4     -10      31      47      -9       3      -1
  11    |   -1       3      -8      26      52     -11       4      -1
  12    |    0       1      -5      17      58     -10       4      -1
  13    |    0       1      -4      13      60      -8       3      -1
  14    |    0       1      -3       8      62      -5       2      -1
  15    |    0       1      -2       4      63      -3       1       0

Table 1 is a table defining the filter coefficients of the fixed up-sampling filter with respect to the luminance component.

As shown in Table 1, in the case of upsampling on the luminance component, an 8-tap filter is applied. That is, interpolation can be performed using a reference sample of the reference layer corresponding to the current sample of the current layer and a neighboring sample adjacent to the reference sample. Here, the neighbor samples can be specified according to the direction in which the interpolation is performed. For example, when interpolation is performed in the horizontal direction, the neighboring sample may include three consecutive samples to the left and four consecutive samples to the right based on the reference sample. Alternatively, when interpolation is performed in the vertical direction, the neighboring sample may include three consecutive samples at the top and four consecutive samples at the bottom based on the reference sample.

Since interpolation is performed with an accuracy of 1/16 sample units, there are a total of 16 phases. This is to support resolutions at various scaling ratios, such as 2x and 1.5x.

In addition, the fixed up-sampling filter may use different filter coefficients for each phase p. The magnitude of each filter coefficient may be defined to fall within the range of 0 to 63, except when the phase p is 0. This means that the filtering is performed with a precision of 6 bits. Here, a phase p of 0 means an integer sample position, that is, a position that is a multiple of n when interpolation is performed in 1/n sample units.

Phase p | Interpolation filter coefficients
        | f[p,0]  f[p,1]  f[p,2]  f[p,3]
   0    |    0      64       0       0
   1    |   -2      62       4       0
   2    |   -2      58      10      -2
   3    |   -4      56      14      -2
   4    |   -4      54      16      -2
   5    |   -6      52      20      -2
   6    |   -6      46      28      -4
   7    |   -4      42      30      -4
   8    |   -4      36      36      -4
   9    |   -4      30      42      -4
  10    |   -4      28      46      -6
  11    |   -2      20      52      -6
  12    |   -2      16      54      -4
  13    |   -2      14      56      -4
  14    |   -2      10      58      -2
  15    |    0       4      62      -2

Table 2 defines the filter coefficients of the fixed up-sampling filter for the chrominance components.

As shown in Table 2, in the case of up-sampling for the chrominance components, a 4-tap filter can be applied, unlike the luminance component. That is, interpolation can be performed using a reference sample of the reference layer corresponding to the current sample of the current layer and neighboring samples adjacent to the reference sample. Here, the neighboring samples can be specified according to the direction in which the interpolation is performed. For example, when interpolation is performed in the horizontal direction, the neighboring samples may include one sample to the left and two consecutive samples to the right of the reference sample. Alternatively, when interpolation is performed in the vertical direction, the neighboring samples may include one sample at the top and two consecutive samples at the bottom of the reference sample.

On the other hand, as in the case of the luminance component, since interpolation is performed with an accuracy of 1/16 sample units, there are a total of 16 phases, and different filter coefficients can be used for each phase p. The magnitude of each filter coefficient can be defined to fall within the range of 0 to 62, except when the phase p is 0. This also means that the filtering is performed with a precision of 6 bits.

Although the foregoing description applies an 8-tap filter to the luminance component and a 4-tap filter to the chrominance component, the present invention is not limited thereto, and the order of the tap filter may of course be variably determined in consideration of coding efficiency.

2. Adaptive up-sampling filter

Instead of using fixed filter coefficients, the encoder may determine optimal filter coefficients in consideration of the characteristics of the image and signal them to the decoder. An adaptive up-sampling filter is one that uses filter coefficients adaptively determined in the encoder in this way. Since the characteristics of an image vary in units of pictures, coding efficiency can be improved by using an adaptive up-sampling filter that represents the characteristics of the image well, rather than using a fixed up-sampling filter in all cases.

The inter-layer reference picture may be generated by applying the filter coefficient determined in step S1410 to the corresponding picture of the reference layer (S1420).

Specifically, interpolation may be performed by applying the filter coefficients of the determined up-sampling filter to the samples of the corresponding picture. Here, the interpolation may be performed first in the horizontal direction, and then in the vertical direction on the samples generated by the horizontal interpolation.
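As a sketch of the horizontal pass, the following applies the Table 1 coefficients (only two phases shown for brevity) to one luma sample; the vertical pass would run the same routine over the horizontally interpolated samples, and the edge clamping here stands in for the normative boundary padding.

```python
LUMA_FILTER = {  # phase -> 8-tap coefficients from Table 1 (each row sums to 64)
    0: (0, 0, 0, 64, 0, 0, 0, 0),
    8: (-1, 4, -11, 40, 40, -11, 4, -1),
}

def interp_luma_horizontal(row, x_int, phase):
    """Interpolate one luma sample at integer position x_int with the given
    1/16 phase, using 3 taps to the left and 4 to the right."""
    acc = 0
    for k, coeff in enumerate(LUMA_FILTER[phase]):
        idx = min(max(x_int + k - 3, 0), len(row) - 1)  # clamp at picture edges
        acc += coeff * row[idx]
    return (acc + 32) >> 6  # undo the 6-bit coefficient precision with rounding
```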

Claims (15)

1. A method of decoding a multi-layer video signal, the method comprising:
deriving an output order (POC) of a current picture belonging to a current layer;
generating a reference picture list for the current picture based on the POC of the current picture; and
performing inter prediction of the current picture based on the generated reference picture list,
wherein, in deriving the output order, the output order of the current picture is reset in consideration of a picture type of at least one of the current picture and a corresponding picture of a reference layer.

2. The method of claim 1, wherein the output order of the current picture is reset to 0 when the picture type of the corresponding picture of the reference layer is a random access picture and the picture type of the current picture of the current layer is not a random access picture.

3. The method of claim 2, wherein the resetting is performed based on a POC reset indicator, the POC reset indicator indicating whether the POC of the current picture is reset.

4. The method of claim 3, wherein the POC reset indicator is obtained based on a random access alignment flag, the random access alignment flag indicating whether random access pictures are aligned across layers, such that when a picture of the reference layer is a random access picture, a picture of the current layer belonging to the same access unit is also a random access picture.

5. An apparatus for decoding a multi-layer video signal, the apparatus comprising a prediction unit configured to derive an output order (POC) of a current picture belonging to a current layer, to generate a reference picture list for the current picture based on the POC of the current picture, and to perform inter prediction of the current picture based on the generated reference picture list,
wherein the output order of the current picture is reset in consideration of a picture type of at least one of the current picture and a corresponding picture of a reference layer.

6. The apparatus of claim 5, wherein the output order of the current picture is reset to 0 when the picture type of the corresponding picture of the reference layer is a random access picture and the picture type of the current picture of the current layer is not a random access picture.

7. The apparatus of claim 6, wherein the resetting is performed based on a POC reset indicator, the POC reset indicator indicating whether the POC of the current picture is reset.

8. The apparatus of claim 7, wherein the POC reset indicator is obtained based on a random access alignment flag, the random access alignment flag indicating whether random access pictures are aligned across layers, such that when a picture of the reference layer is a random access picture, a picture of the current layer belonging to the same access unit is also a random access picture.

9. A method of encoding a multi-layer video signal, the method comprising:
deriving an output order (POC) of a current picture belonging to a current layer;
generating a reference picture list for the current picture based on the POC of the current picture; and
performing inter prediction of the current picture based on the generated reference picture list,
wherein, in deriving the output order, the output order of the current picture is reset in consideration of a picture type of at least one of the current picture and a corresponding picture of a reference layer.

10. The method of claim 9, wherein the output order of the current picture is reset to 0 when the picture type of the corresponding picture of the reference layer is a random access picture and the picture type of the current picture of the current layer is not a random access picture.

11. The method of claim 10, wherein the resetting is performed based on a POC reset indicator, the POC reset indicator indicating whether the POC of the current picture is reset.

12. The method of claim 11, wherein the POC reset indicator is obtained based on a random access alignment flag, the random access alignment flag indicating whether random access pictures are aligned across layers, such that when a picture of the reference layer is a random access picture, a picture of the current layer belonging to the same access unit is also a random access picture.

13. An apparatus for encoding a multi-layer video signal, the apparatus comprising a prediction unit configured to derive an output order (POC) of a current picture belonging to a current layer, to generate a reference picture list for the current picture based on the POC of the current picture, and to perform inter prediction of the current picture based on the generated reference picture list,
wherein the output order of the current picture is reset in consideration of a picture type of at least one of the current picture and a corresponding picture of a reference layer.

14. The apparatus of claim 13, wherein the output order of the current picture is reset to 0 when the picture type of the corresponding picture of the reference layer is a random access picture and the picture type of the current picture of the current layer is not a random access picture.

15. The apparatus of claim 14, wherein the resetting is performed based on a POC reset indicator, the POC reset indicator indicating whether the POC of the current picture is reset.