WO2015060508A1

WO2015060508A1 - Video encoding/decoding method and apparatus

Info

Publication number: WO2015060508A1
Application number: PCT/KR2014/003517
Authority: WO
Inventors: 방건; 이광순; 허남호; 박광훈; 허영수; 김경용; 이윤진
Original assignee: 한국전자통신연구원; 경희대학교 산학협력단
Priority date: 2013-10-24
Filing date: 2014-04-22
Publication date: 2015-04-30

Abstract

Disclosed are a video encoding/decoding method and apparatus for video including a plurality of views. The video decoding method comprises the steps of: deriving basic merge motion candidates for a current prediction unit (PU) to construct a merge motion candidate list; deriving extended merge motion candidates for the current PU when the current PU corresponds to a depth map or a dependent view; and adding the extended merge motion candidates to the merge motion candidate list.

Description

Video encoding / decoding method and apparatus

The present invention relates to a video encoding / decoding method and apparatus, and more particularly, to a method and apparatus for constructing a merge motion candidate list for 3D video coding.

3D video gives the user a vivid three-dimensional effect, as seen and felt in the real world, through a three-dimensional display device. As related work, standardization of 3D video is in progress in the Joint Collaborative Team on 3D Video Coding Extension Development (JCT-3V), a joint standardization group of ISO/IEC's Moving Picture Experts Group (MPEG) and ITU-T's Video Coding Experts Group (VCEG). The 3D video standard includes an advanced data format that can support the playback of autostereoscopic images as well as stereoscopic images using a texture view and its depth map, together with standards for the related technology.

The present invention provides an image encoding / decoding method and apparatus capable of improving image encoding / decoding efficiency.

The present invention provides a 3D video encoding / decoding method and apparatus capable of improving encoding / decoding efficiency.

The present invention provides a method and apparatus for constructing a merge motion candidate list in 3D video encoding / decoding.

According to an embodiment of the present invention, a video decoding method for video including a plurality of views is provided. The video decoding method may include constructing a merge motion candidate list by deriving a basic merge motion candidate for a current prediction unit (PU), deriving an extended merge motion candidate for the current PU when the current PU belongs to a depth map or a dependent view, and adding the extended merge motion candidate to the merge motion candidate list.

In the adding of the extended merge motion candidate, the extended merge motion candidate may be added to the merge motion candidate list when it is not the same as a basic merge motion candidate already in the merge motion candidate list.

According to another embodiment of the present invention, a video decoding apparatus for video including a plurality of views is provided. The video decoding apparatus may include a basic merge motion list construction module that derives a basic merge motion candidate for a current prediction unit (PU) and constructs a merge motion candidate list, and an additional merge motion list construction module that, when the current PU belongs to a depth map or a dependent view, derives an extended merge motion candidate for the current PU and adds the extended merge motion candidate to the merge motion candidate list.

The additional merge motion list construction module may add the extended merge motion candidate to the merge motion candidate list when the extended merge motion candidate is not the same as the basic merge motion candidate in the merge motion candidate list.

According to another embodiment of the present invention, a video encoding method for video including a plurality of views is provided. The video encoding method may include constructing a merge motion candidate list by deriving a basic merge motion candidate for a current prediction unit (PU), deriving an extended merge motion candidate for the current PU when the current PU belongs to a depth map or a dependent view, and adding the extended merge motion candidate to the merge motion candidate list.

According to another embodiment of the present invention, a video encoding apparatus for video including a plurality of views is provided. The video encoding apparatus may include a basic merge motion list construction module that derives a basic merge motion candidate for a current prediction unit (PU) and constructs a merge motion candidate list, and an additional merge motion list construction module that, when the current PU belongs to a depth map or a dependent view, derives an extended merge motion candidate for the current PU and adds the extended merge motion candidate to the merge motion candidate list.

Implementation complexity can be reduced by applying the module used for encoding the general image of the independent view (View 0), which provides backward compatibility, as-is to the general images and depth maps of the dependent views (View 1 and View 2).

In addition, the coding efficiency may be improved by additionally applying the partial encoder to the general image and the depth maps of the dependent view (View 1 and View 2).

FIG. 1 is an example schematically showing the basic structure and data format of a 3D video system.

FIG. 2 is a diagram illustrating an example of an actual image and a depth map image of a “balloons” image.

FIG. 3 is an example illustrating the structure of inter-view prediction in a 3D video codec.

FIG. 4 illustrates an example of a process of encoding / decoding a texture view and a depth view in a 3D video encoder / decoder.

FIG. 5 shows an example of a prediction structure of a 3D video codec.

FIG. 6 is a schematic structural diagram of an encoder of a 3D video codec.

FIG. 7 is a diagram illustrating a merge motion method used in an HEVC-based 3D video codec (3D-HEVC).

FIG. 8 shows an example of neighboring blocks used to construct a merge motion list for a current block.

FIG. 9 is a diagram illustrating an example of a hardware implementation of a method of constructing a merge motion candidate list.

FIG. 10 is a diagram schematically illustrating a 3D video codec according to an embodiment of the present invention.

FIG. 11 is a conceptual diagram schematically illustrating a merge motion method according to an embodiment of the present invention.

FIG. 12 is a diagram illustrating an example of hardware implementation of the merge motion method of FIG. 11 according to an embodiment of the present invention.

FIG. 13 is a conceptual diagram illustrating a method of constructing a merge motion candidate list of FIGS. 11 and 12 according to an embodiment of the present invention.

FIG. 14 illustrates a method of constructing an extended merge motion candidate list according to an embodiment of the present invention.

FIG. 15 is a diagram for describing a method of constructing an extended merge motion candidate list according to another embodiment of the present invention.

FIG. 16 is a flowchart schematically illustrating a method of constructing a merge motion candidate list according to an embodiment of the present invention.

FIGS. 17A to 17F are flowcharts illustrating a method of adding an extended merge motion candidate to a merge motion candidate list according to an embodiment of the present invention.

FIG. 18 is a flowchart schematically illustrating a method of constructing a merge motion candidate list in video encoding / decoding including a plurality of viewpoints according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the drawings. In describing the embodiments of the present specification, when it is determined that a detailed description of a related well-known configuration or function may obscure the gist of the present specification, the description may be omitted.

When a component is referred to herein as being “connected” or “coupled” to another component, it may be directly connected or coupled to that other component, or another component may exist in between. In addition, a statement in this specification that something “includes” a specific configuration does not exclude other configurations; it means that additional configurations may be included in the practice of the present invention or in the scope of its technical spirit.

Terms such as first and second may be used to describe various configurations, but the configurations are not limited by the terms. The terms are used to distinguish one configuration from another. For example, without departing from the scope of the present invention, the first configuration may be referred to as the second configuration, and similarly, the second configuration may also be referred to as the first configuration.

In addition, the components shown in the embodiments of the present invention are independently shown to represent different characteristic functions, and do not mean that each component is made of separate hardware or one software component unit. In other words, each component is listed as a component for convenience of description, and at least two of the components may form one component, or one component may be divided into a plurality of components to perform a function. The integrated and separated embodiments of each component are also included in the scope of the present invention without departing from the spirit of the present invention.

In addition, some of the components may not be essential components for performing essential functions in the present invention, but may be optional components for improving performance. The present invention can be implemented with only the components essential for realizing the essence of the invention, excluding the components used merely for improving performance, and a structure that includes only the essential components, excluding the optional components used for improving performance, is also included in the scope of the present invention.

FIG. 1 is an example schematically showing the basic structure and data format of a 3D video system. The 3D video system of FIG. 1 may be a basic 3D video system under consideration in the 3D video standard.

Referring to FIG. 1, a 3D video system may include a sender (transmitter) that generates multi-view video content and a receiver that decodes the video content received from the sender and provides multi-view video.

The transmitter may generate video information using a stereo camera and a multiview camera, and generate a depth map or a depth view using the depth camera. In addition, the transmitter may convert a 2D image into a 3D image using a converter. The transmitter may generate image content of an N (N≥2) view using the generated video information, the depth map, and the like.

The image content of the N view may include video information of the N view, depth-map information thereof, and additional information related to a camera. The video content of N views may be compressed using a multiview video encoding method in a 3D video encoder, and the compressed video content (bitstream) may be transmitted to a terminal on the receiving side through a network.

The receiving side may decode the received bitstream by using a multiview video decoding method in a video decoder (eg, a 3D video decoder, a stereo video decoder, a 2D video decoder, etc.) to restore an image of N views.

The reconstructed N-view images may be used to generate virtual view images of N or more views through a depth-image-based rendering (DIBR) process. The generated virtual view images of N or more views are reproduced in a form suited to various stereoscopic display devices (for example, N-view displays, stereo displays, 2D displays, etc.) to provide the user with a three-dimensional image.

The depth map, which is used to generate virtual view images, represents the real-world distance between the camera and the actual object as a fixed number of bits (depth information corresponding to each pixel, at the same resolution as the actual image).

FIG. 2(a) shows the “balloons” image used in the 3D video coding standardization of MPEG, an international standardization organization. FIG. 2(b) shows the depth map image of the “balloons” image shown in FIG. 2(a). The depth map image in FIG. 2(b) expresses the depth information shown on the screen at 8 bits per pixel.
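As a rough illustration of storing a distance at 8 bits per pixel, the following minimal sketch quantizes a camera-to-object distance uniformly between two clipping distances. The text does not state how distance is mapped to sample values, so the uniform mapping, the convention that nearer objects receive larger values, and the names `depth_to_8bit`, `z_near`, and `z_far` are all assumptions made for this example.

```python
def depth_to_8bit(distance: float, z_near: float, z_far: float) -> int:
    """Map a distance to an 8-bit sample (0..255).

    Assumption for illustration only: uniform quantization between the
    hypothetical clipping distances z_near and z_far, with nearer objects
    mapped to larger sample values.
    """
    if z_far <= z_near:
        raise ValueError("z_far must be greater than z_near")
    # Clamp the distance into the representable range.
    clamped = min(max(distance, z_near), z_far)
    return round(255 * (z_far - clamped) / (z_far - z_near))


nearest = depth_to_8bit(1.0, z_near=1.0, z_far=11.0)
farthest = depth_to_8bit(11.0, z_near=1.0, z_far=11.0)
middle = depth_to_8bit(3.5, z_near=1.0, z_far=11.0)
```

With 8 bits per pixel, every distance in the scene falls into one of 256 levels, which is why a depth map can be stored and coded like a grayscale image of the same resolution as the actual image.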

The real image and its depth map may be encoded using, for example, H.264 / AVC (MPEG-4 Part 10 Advanced Video Coding), or using High Efficiency Video Coding (HEVC), the international video standard jointly standardized by the Moving Picture Experts Group (MPEG) and the Video Coding Experts Group (VCEG).

The real image and its depth map may be obtained from several cameras as well as from a single camera. Images obtained from multiple cameras may be encoded independently of one another, in which case a general two-dimensional video codec may be used. In addition, since images obtained from multiple cameras are correlated across viewpoints, they may be encoded using inter-view prediction between different views to increase encoding efficiency.

Referring to FIG. 3, View 1 is an image obtained from a camera located on the left side with respect to View 0, and View 2 is an image obtained from a camera located on the right side with respect to View 0.

Also, View 1 and View 2 perform inter-view prediction using View 0 as a reference image, so in the encoding order View 0 must be coded before View 1 and View 2.

In this case, view 0 is called an independent view because it may be independently encoded regardless of other views. On the other hand, view 1 and view 2 are referred to as dependent views because they are encoded using view 0 as a reference image. Independent viewpoint images may be encoded using a general two-dimensional video codec. On the other hand, since the dependent view image needs to perform inter-view prediction, it may be encoded using a 3D video codec including an inter-view prediction process.
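The dependency between views described above determines a valid encoding order. The following minimal sketch derives such an order from a table of reference views; the names `REFERENCE_VIEW`, `is_independent`, and `encoding_order` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, List, Optional

# Reference view of each view; None marks an independent view.
# Mirrors FIG. 3: View 1 and View 2 both reference View 0.
REFERENCE_VIEW: Dict[int, Optional[int]] = {0: None, 1: 0, 2: 0}


def is_independent(view_id: int, reference_view: Dict[int, Optional[int]]) -> bool:
    """A view is independent when it references no other view."""
    return reference_view[view_id] is None


def encoding_order(reference_view: Dict[int, Optional[int]]) -> List[int]:
    """Return an order in which every view follows the view it references."""
    order: List[int] = []
    remaining = dict(reference_view)
    while remaining:
        # Views whose reference (if any) has already been placed in the order.
        ready = sorted(v for v, ref in remaining.items()
                       if ref is None or ref in order)
        if not ready:
            raise ValueError("circular or missing inter-view reference")
        for v in ready:
            order.append(v)
            del remaining[v]
    return order


order = encoding_order(REFERENCE_VIEW)
```

Here View 0 comes first because it references no other view, which is also why it can be handled by a general two-dimensional codec.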

In addition, in order to increase the encoding efficiency of the view 1 and the view 2, the view 1 and the view 2 may be encoded using the depth map. For example, when the real image and the depth map thereof are encoded, the real image and the depth map may be encoded / decoded independently of each other. Alternatively, when the real image and the depth map are encoded, the real image and the depth map may be encoded / decoded depending on each other as shown in FIG. 4.

Referring to FIG. 4, the 3D video encoder may include a real image encoder for encoding a texture view and a depth map encoder for encoding a depth view.

For example, the real image encoder may encode the real image using a depth map that is already encoded by the depth map encoder. In contrast, the depth map encoder may encode the depth map by using the real image that is already encoded by the real image encoder.

The 3D video decoder may include a real image decoder for decoding an actual image and a depth map decoder for decoding a depth map.

For example, the real image decoder may decode the real image using the depth map already decoded by the depth map decoder. On the contrary, the depth map decoder may decode the depth map using the real image that is already decoded by the real image decoder.

FIG. 5 shows an example of a prediction structure of a 3D video codec.

For convenience of description, FIG. 5 illustrates an encoding prediction structure for encoding real images obtained from three cameras and the depth maps of those real images.

Referring to FIG. 5, the three real images acquired by the three cameras are denoted T0, T1, and T2 according to viewpoint, and the three depth maps at the same positions as the real images are denoted D0, D1, and D2 according to viewpoint. Here, T0 and D0 are images acquired at View 0, T1 and D1 are images acquired at View 1, and T2 and D2 are images acquired at View 2.

Each box in FIG. 5 represents an image (picture).

Each picture is classified as an I picture (Intra Picture), a P picture (Uni-prediction Picture), or a B picture (Bi-prediction Picture) according to its encoding type, and may be encoded according to that type. An I picture is encoded by itself without inter-picture prediction, a P picture is encoded with inter-picture prediction using reference pictures in the forward direction only, and a B picture is encoded with inter-picture prediction using reference pictures in both the forward and backward directions.
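The three encoding types differ only in which reference directions they may use. A minimal sketch of that rule follows; the names `PictureType`, `ALLOWED_REFERENCES`, and `can_reference` are hypothetical and used only for this illustration.

```python
from enum import Enum


class PictureType(Enum):
    I = "intra"            # encoded by itself, no inter-picture prediction
    P = "uni-prediction"   # forward reference pictures only
    B = "bi-prediction"    # forward and backward reference pictures


# Reference directions each picture type may use, as described in the text.
ALLOWED_REFERENCES = {
    PictureType.I: frozenset(),
    PictureType.P: frozenset({"forward"}),
    PictureType.B: frozenset({"forward", "backward"}),
}


def can_reference(picture_type: PictureType, direction: str) -> bool:
    """Return True if a picture of this type may use the given direction."""
    return direction in ALLOWED_REFERENCES[picture_type]
```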

Arrows in FIG. 5 indicate prediction directions. That is, the real image and its depth map may be encoded / decoded depending on the prediction direction.

In order to predict the current block in the real image, a method for inferring motion information of the current block is largely divided into temporal prediction and inter view prediction. Temporal prediction is a prediction method using temporal correlation within the same viewpoint, and inter-view prediction is a prediction method using inter-view correlation at adjacent viewpoints. Such temporal prediction and inter-view prediction may be mixed with each other in a picture.

Here, the current block refers to a block in which the current prediction is performed in the real image. The motion information may mean only a motion vector or may mean a motion vector, a reference picture number, unidirectional prediction, bidirectional prediction, inter-view prediction, temporal prediction, or another prediction.

On the other hand, a large amount of three-dimensional image content should be efficiently compressed in order to reduce the amount of bitstream. Correlation between different viewpoints may be used to increase encoding efficiency, and correlation between a texture view and a depth view or depth map may be used. However, due to this, more coding algorithms are required than when encoding two-dimensional images, and there is a problem in that implementation complexity of hardware and software for implementing the same increases and computational complexity increases.

FIG. 6 is a schematic structural diagram of an encoder of a 3D video codec.

Referring to FIG. 6, the 3D video codec 600 receives and encodes different viewpoint images (eg, view 0, view 1, and view 2) as inputs. In addition, the encoded integrated bitstream may be output.

In this case, the images may include not only a texture view but also a depth view.

The 3D video codec 600 may encode input images by different encoders according to view information (View ID information).

For example, the image of View 0 must be encoded with an existing 2D video codec for backward compatibility, and thus may be encoded by the base layer encoder 610. The images of View 1 and View 2 must be encoded with a 3D video codec that includes an inter-view prediction algorithm and an algorithm using the correlation between the general image and the depth map, and thus may be encoded by the enhancement layer encoder 620 (View 1 or View 2 encoder).

In addition, a depth map, unlike a general image, may be encoded using the already-encoded information of the general image, and thus may be encoded by the enhancement layer encoder 620. Therefore, a more complex encoder is required to encode the images of View 1 and View 2 than to encode the image of View 0, and a more complex encoder is required to encode a depth map than to encode the general image of the base layer.

Meanwhile, HEVC uses a merge method (also called a merge motion method) as one of the methods for encoding motion information used for inter prediction during image encoding / decoding. To increase coding efficiency in the enhancement layer, the enhancement layer uses an improved merge motion method obtained by modifying the merge motion method of the base layer.

Referring to FIG. 7, in the 3D-HEVC 700, the merge motion configuration method 710 for View 0 and the merge motion configuration method 720 for the remaining views (View 1 and View 2) are performed separately from each other.

When an image including a current PU (a prediction unit, a prediction block (PB), or a block of arbitrary size) is input to the 3D-HEVC 700, the 3D-HEVC 700 may select either the merge motion configuration method 710 for View 0 or the merge motion configuration method 720 for the other views (View 1 and View 2). The selection is based on information indicating whether the input image is a general image or a depth map image (Texture / Depth information) and on the view information of the input image (ViewID information). The 3D-HEVC 700 may output a merge motion candidate list for the current PU using the selected merge motion configuration method.

Here, the current PU refers to a current block in which prediction within the current image is performed to encode / decode the current image.

The general image for view 0 constructs a merge motion candidate list using a merge motion configuration method for the base layer for backward compatibility. On the other hand, the general image and the depth map of the view 1 and the view 2 constitute a merge motion candidate list using the merge motion configuration method for the enhancement layer.

The merge motion configuration method for the enhancement layer is obtained by adding new candidates to, or modifying the candidate list order of, the merge motion configuration method for the base layer. That is, as shown in FIG. 7, the merge motion configuration method for the enhancement layer (the other views (View 1 and View 2) and the depth maps) includes the merge motion configuration method for the base layer.
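The selection between the two separate methods of FIG. 7 can be sketched as a single branch on the Texture / Depth information and the ViewID information. The function name `select_merge_method` is hypothetical, and routing the depth map of View 0 to method 720 is one reading of the text, which groups depth maps with the enhancement layer.

```python
def select_merge_method(is_depth_map: bool, view_id: int) -> int:
    """Return the reference numeral of the method chosen in FIG. 7.

    710: merge motion configuration for the base layer
         (general image of View 0).
    720: merge motion configuration for the enhancement layer
         (other views and, as assumed here, all depth maps).
    """
    if view_id == 0 and not is_depth_map:
        return 710
    return 720


base = select_merge_method(is_depth_map=False, view_id=0)
dependent_texture = select_merge_method(is_depth_map=False, view_id=1)
dependent_depth = select_merge_method(is_depth_map=True, view_id=2)
```

Because the two branches lead to separately implemented methods, both must exist in hardware or software even though method 720 contains the logic of method 710.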

Therefore, it can be seen that the merge motion configuration method for the enhancement layer is more complicated than the merge motion configuration for the base layer, and the computational complexity is large. In addition, in terms of hardware or software implementation, it is necessary to implement both the merge motion configuration method for the base layer and the merge motion configuration method for the enhancement layer.

The merge motion method refers to a method of using motion information of a neighboring block of the current block as the motion information (for example, a motion vector, a reference picture list, a reference picture index, etc.) of the current block (current PU). A merge motion candidate list for the current block is constructed based on the motion information of the neighboring blocks.

As shown in FIG. 8, the neighboring blocks may include neighboring blocks A, B, C, D, and E, which are spatially adjacent to the current block, and a co-located candidate block (H or M), which corresponds to the current block temporally. The co-located candidate block refers to a block in the co-located picture that corresponds temporally to the current picture containing the current block. If the H block in the co-located picture is available, the H block is determined to be the co-located candidate block. If the H block is not available, the M block in the co-located picture is determined to be the co-located candidate block.

To construct a merge motion candidate list, it is first determined whether the motion information of the neighboring blocks (A, B, C, D, and E) and of the co-located candidate block (H or M) can be used as a merge motion candidate for the current block. The motion information of each available block is then determined to be a merge motion candidate and may be added to the merge motion candidate list.
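The list construction described above can be sketched as follows. This is a minimal illustration, not the normative HEVC process: an unavailable block is modeled as `None`, the check order A, B, C, D, E is only one possible order, and the `Motion` tuple layout and the function name are assumptions made for this example. No pruning or maximum list size is applied, since the text here does not describe them.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical motion information: (mv_x, mv_y, reference picture index).
Motion = Tuple[int, int, int]


def build_merge_candidate_list(blocks: Dict[str, Optional[Motion]]) -> List[Motion]:
    """Collect motion information of available blocks into a candidate list."""
    candidates: List[Motion] = []

    # Spatial neighbors of the current block, checked in a fixed order.
    for name in ("A", "B", "C", "D", "E"):
        motion = blocks.get(name)
        if motion is not None:
            candidates.append(motion)

    # Co-located candidate: use H when available, otherwise fall back to M.
    temporal = blocks.get("H")
    if temporal is None:
        temporal = blocks.get("M")
    if temporal is not None:
        candidates.append(temporal)

    return candidates


example = build_merge_candidate_list({
    "A": (1, 0, 0),
    "B": None,        # not available
    "C": (0, 2, 1),
    "H": None,        # not available, so M is used instead
    "M": (4, 4, 0),
})
```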

Referring to FIG. 9, the input parameters for constructing the merge motion list used for the general image of View 0 and the input parameters for constructing the merge motion list used for the general images and depth maps of View 1 and View 2 are the same. The only difference is that additional input parameters (“Additional Motion F” and “Additional Motion G”) are added for constructing the merge motion list used for the general images and depth maps of View 1 and View 2.

Therefore, the part that constructs the merge motion candidate list changes because of the added motion information. That is, in order to include the added motion information in the merge motion candidate list (to increase encoding efficiency), a new merge motion list construction module must be implemented for the general images and depth maps of View 1 and View 2. This can increase hardware implementation complexity.

To solve these problems, the present invention proposes a method for reducing the implementation complexity and computational complexity of the encoding algorithm and video codec for the enhancement layer (for example, the general images and depth maps of View 1 and View 2). For example, in the present invention, the “merge motion candidate list construction” module for the base layer (the general image of View 0), which is already implemented as a hardware chip, is reused as-is and applied to the enhancement layer (for example, the general images and depth maps of View 1 and View 2), thereby reducing hardware implementation complexity. According to the present invention, a consumer who has an encoder / decoder for the base layer (specifically, its “merge motion candidate list construction” module) used for two-dimensional video service and who wants to receive three-dimensional video service can easily do so simply by attaching an additional module (specifically, a “merge motion candidate list construction” module for the enhancement layer).

Hereinafter, a method of improving data throughput by reducing implementation complexity and computational complexity of a video codec will be described.

[Default method]

Referring to FIG. 10, the 3D video codec 1000 receives and encodes different viewpoint images (eg, view 0, view 1, and view 2) as inputs. In addition, one encoded bitstream may be output.

In this case, the images may include not only a texture view but also a depth view. The images may include an image of an independent view, which can be encoded independently of other viewpoints, and an image of a dependent view, which is encoded using an image of the independent view as a reference image. For example, View 0 may be an independent view, and View 1 and View 2 may be dependent views encoded with reference to View 0.

The 3D video codec 1000 may include an encoder 1010 capable of encoding the general images and depth maps of all views (e.g., View 0, View 1, and View 2). For example, the encoder 1010 may be implemented using MPEG-1, MPEG-2, MPEG-4 Part 2 Visual, H.264 / AVC, VC-1, AVS, KTA, HEVC (H.265 / HEVC), and the like.

The 3D video codec 1000 may include a partial encoder 1020 to increase encoding efficiency for the general images and depth maps of the dependent views, as opposed to the independent view. For example, the partial encoder 1020 may encode the general images and depth maps of View 1 and View 2, or may encode the depth maps of all views.

The 3D video codec 1000 may include a multiplexer 1030 for multiplexing the images encoded by the encoders 1010 and 1020. The multiplexer 1030 may multiplex the bitstream of the general image of View 0 and the bitstreams of the general images and depth maps of the other views (View 1 and View 2) and output a single bitstream.

As described above, the 3D video codec 1000 according to an embodiment of the present invention can reduce implementation complexity by applying the module 1010, which is used for encoding the general image of the independent view (e.g., View 0) that provides backward compatibility, as-is to the general images and depth maps of the dependent views (e.g., View 1 and View 2). In addition, the 3D video codec 1000 according to an embodiment of the present invention can improve coding efficiency by additionally applying the partial encoder 1020 to the general images and depth maps of the dependent views (e.g., View 1 and View 2).

The 3D video codec described with reference to FIG. 10 may be applied to the entire encoding / decoding process, or may be applied to individual steps of encoding / decoding.

[Detailed method]

FIG. 11 is a conceptual diagram schematically illustrating a merge motion method according to an embodiment of the present invention. The merge motion method illustrated in FIG. 11 may be performed by an HEVC-based 3D video codec (3D-HEVC), and the 3D-HEVC may be implemented based on the 3D video codec of FIG. 10 described above.

The merge motion method according to an embodiment of the present invention may derive a spatial merge motion candidate and a temporal merge motion candidate for the current PU, additionally derive additional merge motion candidates based on information about the current PU (e.g., view information of the current PU, image type information of the current PU, etc.), and construct a merge motion candidate list for the current PU based on the derived merge motion candidates.

Referring to FIG. 11, in the merge motion method according to an embodiment of the present invention, the input includes current PU information (or current image information), information on whether the current PU image is a general image or a depth map image (Texture / Depth information), and view information of the current PU (ViewID information), and the output is a merge motion candidate list for the current PU.

In the merge motion method of the present invention, a step of “constructing a basic merge motion list” 1110 is basically performed on the current PU to output a “basic merge motion candidate list”. For example, the “basic merge motion list construction” 1110 may use the merge motion candidate list construction method in HEVC as it is.

Next, in the merge motion method of the present invention, an “additional merge motion list construction” step 1120 may additionally be performed, depending on the information on whether the current PU image is a general image or a depth map image (Texture / Depth information) and the view information of the current PU (ViewID information). In step 1120, the input is the “basic merge motion candidate list” output by the “basic merge motion list construction” step 1110, and the output is an “extended merge motion candidate list”. The “additional merge motion list construction” step 1120 may be performed on the general images and depth maps of the dependent views (e.g., View 1 and View 2).
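The two-step structure of FIG. 11 can be sketched as follows: step 1110 always runs, and step 1120 runs only for certain inputs and takes the output of step 1110 as its input. This is a minimal sketch under stated assumptions: the `Candidate` tuple layout and all function names are hypothetical, step 1110 is a stand-in that merely copies already-derived candidates, the trigger condition follows the summary ("a depth map or a dependent view"), and appending non-duplicate candidates at the end of the list is only one possible placement.

```python
from typing import List, Sequence, Tuple

# Hypothetical motion information: (mv_x, mv_y, reference picture index).
Candidate = Tuple[int, int, int]


def build_basic_list(spatial_temporal: Sequence[Candidate]) -> List[Candidate]:
    """Step 1110 stand-in: the reusable base-layer list construction."""
    return list(spatial_temporal)


def build_additional_list(basic: Sequence[Candidate],
                          additional: Sequence[Candidate]) -> List[Candidate]:
    """Step 1120: extend the basic list with candidates not already present."""
    extended = list(basic)
    for cand in additional:
        if cand not in extended:
            extended.append(cand)
    return extended


def merge_candidate_list(spatial_temporal: Sequence[Candidate],
                         additional: Sequence[Candidate],
                         is_depth_map: bool,
                         view_id: int) -> List[Candidate]:
    basic = build_basic_list(spatial_temporal)          # always performed
    if is_depth_map or view_id != 0:                    # assumed trigger condition
        return build_additional_list(basic, additional)
    return basic


st = [(1, 0, 0), (0, 2, 1)]
extra = [(1, 0, 0), (3, -1, 0)]   # the first entry duplicates a basic candidate
independent = merge_candidate_list(st, extra, is_depth_map=False, view_id=0)
dependent = merge_candidate_list(st, extra, is_depth_map=False, view_id=1)
```

The point of the design is that `build_basic_list` is identical for every input, so an existing base-layer implementation can be reused unchanged and only the additional step has to be added for the enhancement layer.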

More details on the merge motion method according to the present invention will be described later.

Referring to FIG. 12, an apparatus 1200 that performs the merge motion method according to an embodiment of the present invention (hereinafter referred to as a merge motion device) may include a basic merge motion list construction module 1210 and an additional merge motion list construction module 1220.

Inputs of the merge motion device 1200 are spatial merge motion candidates, temporal merge motion candidates, and additional merge motion candidates. The output of the merge motion device 1200 is a basic merge motion candidate list in the case of a normal image for an independent view, and an extended merge motion candidate list in the case of a normal image and a depth map for a dependent view.

As described above, the independent view refers to a view that can be encoded independently regardless of other views, and may be a base view. The dependent view refers to a view that is encoded by referring to an independent view. For convenience of description, for example, the independent view may be a view 0, and the dependent view is described as including a view 1 and a view 2.

The basic merge motion list construction module 1210 may construct a basic merge motion candidate list by deriving a spatial merge motion candidate and a temporal merge motion candidate for the current PU.

The spatial merge motion candidate may be derived from neighboring blocks A, B, C, D, E spatially adjacent to the current PU, as shown in FIG.

The basic merge motion list construction module 1210 may determine whether the neighboring blocks A, B, C, D, and E are available, and may determine the motion information of the available neighboring blocks as spatial merge motion candidates for the current PU. The availability of the neighboring blocks (A, B, C, D, E) may be checked in a predetermined order or in an arbitrary order. For example, the check may proceed in the order of A, B, C, D, and E.
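The availability scan described above can be sketched as follows. This is a minimal illustration, not the normative HEVC derivation: the `MotionInfo` tuple, the dictionary of neighbors, and the function name `derive_spatial_candidates` are hypothetical, and an unavailable neighbor is assumed to be represented by `None`.

```python
from typing import NamedTuple

class MotionInfo(NamedTuple):
    """Hypothetical container: motion vector (x, y) and reference picture index."""
    mv: tuple
    ref_idx: int

def derive_spatial_candidates(neighbors, order=("A", "B", "C", "D", "E")):
    """Scan the neighboring blocks in the given order and keep the available ones."""
    candidates = []
    for name in order:
        info = neighbors.get(name)  # None (or a missing key) means "not available"
        if info is not None:
            candidates.append(info)
    return candidates

neighbors = {
    "A": MotionInfo((1, 0), 0),
    "B": None,                     # assumed unavailable, e.g. outside the picture
    "C": MotionInfo((0, -2), 1),
}
spatial = derive_spatial_candidates(neighbors)
```

Passing a different `order` tuple models the "predetermined order or arbitrary order" wording without changing the scan itself.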

As shown in FIG. 8, the temporal merge motion candidate may be derived from a co-located block (col block) (H, M) in a co-located picture (col picture) with respect to the current PU. The co-located block H may be a PU block located at the bottom right of the block X′ that corresponds to the position of the current PU in the co-located picture. The co-located block M may be a PU block located at the center of the block X′ that corresponds to the position of the current PU in the co-located picture.

The basic merge motion list construction module 1210 may determine whether the co-located blocks H and M are available and determine the motion information of an available co-located block as the temporal merge motion candidate for the current PU. In this case, availability may be checked in the order of block H and then block M, or in the reverse order.
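A minimal sketch of the temporal derivation follows. It assumes that the first available co-located block in the checking order supplies the candidate, which the text does not state explicitly; the function name and the `(mv, ref_idx)` tuples are hypothetical.

```python
def derive_temporal_candidate(col_blocks, order=("H", "M")):
    """Return the motion info of the first available co-located block, else None.

    col_blocks maps a block name to its motion info, or to None when unavailable.
    Taking the first available block is an assumption made for this sketch.
    """
    for name in order:
        info = col_blocks.get(name)
        if info is not None:
            return info
    return None

# H is unavailable here, so the center block M supplies the candidate.
temporal = derive_temporal_candidate({"H": None, "M": ((2, 1), 0)})
```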

The additional merge motion list construction module 1220 may derive additional merge motion candidates for the current PU, based on the information on whether the current PU image is a normal image or a depth map image (Texture/Depth information) and the view information of the current PU (ViewID information), and may use them to construct an extended merge motion candidate list.

If the image of the current PU is a normal image or a depth map of a dependent view (e.g., view 1 and view 2), the additional merge motion list construction module 1220 may additionally perform a process of constructing a merge motion candidate list for the normal image and the depth map of the dependent view.

In this case, the inputs of the additional merge motion list constructing module 1220 are the basic merge motion candidate list configured by the basic merge motion list constructing module 1210 and the additional merge motion candidates F and G. The output of the additional merge motion list construction module 1220 is an extended merge motion candidate list.

As described above, to construct a merge motion candidate list for a general image and a depth map of a dependent view (e.g., view 1 and view 2), the merge motion device according to an embodiment of the present invention implements only an additional partial module rather than a new module, which reduces hardware implementation complexity. That is, the "merge motion candidate list construction" module for the base layer (for example, the general image of view 0), which has already been implemented as a hardware chip, is reused as is and applied to the enhancement layer (for example, view 1), so that the complexity of hardware implementation can be reduced.

FIG. 13 is a conceptual diagram illustrating a method of constructing the merge motion candidate list of FIGS. 11 and 12 according to an embodiment of the present invention. The method of constructing the merge motion candidate list of FIG. 13 may be performed by the 3D video codec of FIG. 10 or the merge motion device of FIG. 12.

Referring to FIG. 13, to construct a merge motion candidate list according to an embodiment of the present invention, the inputs are current PU information, information on whether the current PU image is a normal image or a depth map image (Texture/Depth information), and view information of the current PU (ViewID information), and the output is a merge motion candidate list for the current PU.

First, a basic merge motion candidate list 1310 is configured for the current PU. For example, the basic merge motion candidate list may use the merge motion candidate list construction method in HEVC as it is, and may be configured based on the spatial merge motion candidate and the temporal merge motion candidate for the current PU as described above.

Next, an extended merge motion candidate list 1320 is configured based on the information on whether the current PU image is a normal image or a depth map image (Texture/Depth information) and the view information of the current PU image (ViewID information). In this case, the extended merge motion candidate list may be configured for the general image and depth maps of the dependent views (e.g., view 1 and view 2), and additional merge motion candidates may be added as described above.

If the current PU belongs to a general image of an independent view (e.g., View 0), the basic merge motion candidate list may be output. Otherwise, when the current PU belongs to a general image or a depth map of a dependent view (e.g., view 1 and view 2), the extended merge motion candidate list may be output.
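The two-stage structure of FIGS. 11 to 13 can be sketched as below. The function name, its parameters, and the convention that view 0 is the independent view are assumptions for illustration; treating any depth map as requiring the extended list follows the "depth map or dependent view" wording used elsewhere in this description. Stage 2 is reduced to a placeholder that puts the additional candidates in front.

```python
def build_merge_list(basic_candidates, additional_candidates, is_depth, view_id):
    """Stage 1 always runs; stage 2 runs only for depth maps or dependent views."""
    # Stage 1: basic merge motion candidate list (reused unchanged, as in HEVC).
    merge_list = list(basic_candidates)
    # Assumption: view_id 0 denotes the independent (base) view.
    needs_extension = is_depth or view_id != 0
    if needs_extension:
        # Stage 2 (placeholder): extend the basic list with the additional candidates.
        merge_list = list(additional_candidates) + merge_list
    return merge_list

base_texture = build_merge_list(["A", "B"], ["F"], is_depth=False, view_id=0)
dependent_texture = build_merge_list(["A", "B"], ["F"], is_depth=False, view_id=1)
```

The point of the split is that stage 1 never needs to know about views or depth, so an existing HEVC module can serve as stage 1 without modification.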

In this case, the number of candidates in the extended merge motion candidate list may be larger than the number of candidates in the basic merge motion candidate list.

Referring to FIG. 14, in the extended merge motion candidate list according to an embodiment of the present invention, an additional merge motion candidate (e.g., motion information F), which is additional motion information, may be inserted into the first item (or an item at an arbitrary position) of the extended merge motion candidate list.

Before the additional merge motion candidate is inserted, the additional merge motion candidate (e.g., motion information F) and the first merge motion candidate of the basic merge motion candidate list (e.g., motion information A) are compared with each other. If the two candidates are not the same, the additional merge motion candidate (e.g., motion information F) may be inserted into the first item of the extended merge motion candidate list; if they are the same, it may not be inserted. For example, when the motion information of the two candidates (e.g., motion information F and A) is compared, the additional merge motion candidate may be left out of the extended merge motion candidate list if the difference between the motion vectors of the two candidates is within an arbitrary threshold, and inserted otherwise. Alternatively, the additional merge motion candidate may be inserted into the extended merge motion candidate list when the reference images of the two candidates are not the same, and left out otherwise.
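The comparison-then-insert rule can be illustrated with the sketch below. The text offers the motion vector threshold and the reference image check as alternatives; combining them into one `is_same_motion` test, as well as the function names and the `((mvx, mvy), ref_idx)` representation, are assumptions made here for illustration.

```python
def is_same_motion(a, b, mv_threshold=0):
    """Treat two candidates as the same when they share a reference picture and
    each motion vector component differs by at most mv_threshold."""
    (ax, ay), a_ref = a
    (bx, by), b_ref = b
    if a_ref != b_ref:
        return False
    return abs(ax - bx) <= mv_threshold and abs(ay - by) <= mv_threshold

def insert_additional_first(basic_list, extra, mv_threshold=0):
    """Insert `extra` as the first item unless it duplicates the current first item."""
    extended = list(basic_list)
    if not extended or not is_same_motion(extra, extended[0], mv_threshold):
        extended.insert(0, extra)
    return extended

A = ((4, 4), 0)
B = ((0, 1), 0)
F = ((5, 4), 0)
exact_match_only = insert_additional_first([A, B], F)         # F differs from A, so it is inserted
with_threshold = insert_additional_first([A, B], F, mv_threshold=1)  # F is within 1 of A, so it is skipped
```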

Referring to FIG. 15, in the extended merge motion candidate list according to an embodiment of the present invention, an additional merge motion candidate (e.g., motion information F), which is additional motion information, may be inserted into the first item of the extended merge motion candidate list, and another additional merge motion candidate (e.g., motion information G), which is other additional motion information, may be inserted into the third item (or an item at an arbitrary position) of the extended merge motion candidate list.

Before the additional merge motion candidates are inserted, the original items of the basic merge motion candidate list (the first item and the third item) and the additional merge motion candidates (e.g., motion information F and G) are compared with each other. If the two candidates in a pair (e.g., motion information A and F, or motion information C and G) are not the same, the additional merge motion candidate may be inserted into the first or third item of the extended merge motion candidate list; if they are the same, it may not be inserted. For example, when the motion information of two candidates (e.g., motion information A and F, or motion information C and G) is compared, the additional merge motion candidate (e.g., motion information F or G) may be left out of the extended merge motion candidate list if the difference between the motion vectors of the two candidates is within an arbitrary threshold, and inserted otherwise. Alternatively, the additional merge motion candidate (e.g., motion information F or G) may be inserted into the extended merge motion candidate list when the reference images of the two candidates are not the same, and left out otherwise.
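A sketch of inserting several additional candidates at fixed positions is given below. It assumes that each additional candidate is compared with the item at the same index of the basic list, and that insertions are applied in ascending index order so that G ends up as the third item of the extended list; neither detail is spelled out in the text. The function name and the use of plain strings as stand-ins for motion information are hypothetical.

```python
def insert_additional(basic_list, extras):
    """extras maps a 0-based target index in the extended list to a candidate."""
    extended = list(basic_list)
    for index in sorted(extras):
        candidate = extras[index]
        # Compare against the original item at the same index of the basic list.
        original = basic_list[index] if index < len(basic_list) else None
        if candidate != original:
            extended.insert(index, candidate)
    return extended

basic = ["A", "B", "C", "D", "E"]
extended = insert_additional(basic, {0: "F", 2: "G"})
```

With these assumptions, F becomes the first item and G the third item of the extended list, and a candidate that duplicates its counterpart is simply skipped.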

[Additional Methods]

The embodiments of FIGS. 10 to 13 may be applied in various ways, as follows.

1. In one embodiment, the basic encoder (or basic module) may be applied to the general image and depth maps of view 1 and view 2 as well as to the general image of view 0.

2. In another embodiment, the basic encoder (or basic module) may be applied only to small blocks of high complexity (e.g., 8x8 units or an arbitrary block size). In this case, for the general image and the depth maps of view 1 and view 2, blocks at or below the small block size may be encoded using the basic encoder (or basic module), and blocks larger than the small block size may be encoded using the basic encoder (or basic module) and a partial encoder (or extension module). Here, the basic encoder (or basic module) may perform the “basic merge motion list construction” step of FIGS. 11 and 13, and the partial encoder (or extension module) may perform the “additional merge motion list construction” step of FIGS. 11 and 13.

FIG. 16 is a flowchart schematically illustrating a method of constructing a merge motion candidate list according to an embodiment of the present invention. The method of FIG. 16 may be performed by the apparatus shown in FIGS. 10 and 12 described above, or may be applied to 3D-HEVC. For convenience of explanation, the method of FIG. 16 is described as being performed by the merge motion device.

Referring to FIG. 16, the merge motion apparatus adds basic merge motion candidates to the merge motion candidate list for the current PU (S1600).

Here, the basic merge motion candidates may include a spatial merge motion candidate and a temporal merge motion candidate for the current PU as described above, and may be candidates for a general image of an independent view.

The merge motion device determines whether the current picture including the current PU is a depth map or a dependent view (S1610).

If the current picture including the current PU is a depth map or a dependent view, the merge motion device adds an extended merge motion candidate to the merge motion candidate list (S1620).

Here, the extended merge motion candidates may be candidates for a depth map or an image (normal image and depth map) of a dependent view.

The description format of Tables 1 to 6 follows that used by the Joint Collaborative Team on 3D Video Coding Extensions (of ITU-T SG 16 WP 3 and ISO/IEC JTC 1/SC 29/WG 11), whose standard is currently being developed jointly by the Moving Picture Experts Group (MPEG) and the Video Coding Experts Group (VCEG).

Table 1 shows an example of the inputs and outputs of the existing process that includes the addition of extended merge motion candidates, and Table 2 shows an example of the inputs and outputs of the process that includes the addition of extended merge motion candidates according to an embodiment of the present invention.

Table 1

TABLE 2

In the process of adding an extended merge motion candidate according to an embodiment of the present invention shown in Table 2, a merge motion candidate list (mergeCandList) and flags (availableFlagN) indicating whether each basic merge motion candidate has been added are used as additional inputs.

In Table 2, N may be replaced with A1, B1, B0, A0, and B2, which denote the candidates at the left, above, above-right, bottom-left, and above-left positions, respectively. In the received merge motion candidate list (mergeCandList), the basic merge motion candidates are stored in an arbitrary order according to the conventional method. For example, the left candidate, the above candidate, the above-right candidate, the bottom-left candidate, the above-left candidate, the temporal (prediction) candidate, the combined bi-predictive candidate, and the zero-motion candidate may be stored in that order. The output is the merge motion candidate list after the addition of the extended merge motion candidates has been completed.

Table 3 shows an existing extended merge motion candidate addition process, and Table 4 shows an extended merge motion candidate addition process according to an embodiment of the present invention.

The process of Table 4 operates on a list to which the basic merge motion candidates have already been added, and therefore handles only the extended merge motion candidates. Accordingly, the processing of the merge motion candidates used in HEVC, which the existing 3D-HEVC implements again, can be omitted so that it is not implemented redundantly.

TABLE 3

Table 4

In the process of adding extended merge motion candidates according to an embodiment of the present invention described in Table 4, the combined bi-predictive candidate is derived by combining only the basic merge motion candidates used in the HEVC standard, rather than by the existing method that also uses the extended merge motion candidates. In this way, the same coding efficiency can be obtained while reducing the number of operations compared to the conventional method.

An extended merge motion candidate addition process according to an embodiment of the present invention described in Table 4 will be described in detail with reference to FIGS. 17A to 17F.

FIGS. 17A, 17B, 17C, 17D, 17E, and 17F are flowcharts illustrating a method of adding an extended merge motion candidate to a merge motion candidate list according to an embodiment of the present invention.

FIGS. 17A to 17F are constructed based on the process of adding the extended merge motion candidate of Table 4 described above. The method of FIGS. 17A to 17F may be performed by the apparatus shown in FIGS. 10 and 12 described above, or may be applied to 3D-HEVC.

1. The flag iv_mv_pred_flag[nuh_layer_id] indicates whether the current PU can attempt inter-view prediction. If the flag iv_mv_pred_flag[nuh_layer_id] is 1, the availability of the inter-view merge candidate (IvMC), the inter-view disparity merge candidate (IvDC), and the shifted inter-view merge candidate (IvMCShift) is checked and stored in the flags availableFlagIvMC, availableFlagIvDC, and availableFlagIvMCShift, respectively, and the motion information of the available candidates is derived.

2. The flag view_synthesis_pred_flag[nuh_layer_id] indicates whether the current PU can attempt view synthesis prediction. If the flag view_synthesis_pred_flag[nuh_layer_id] is 1, the availability of the view synthesis merge candidate is stored in the flag availableFlagVSP, and if the candidate is available, its motion information is derived.

3. The flag mpi_flag [nuh_layer_id] indicates whether the current PU is a depth map and can attempt motion prediction from a texture block. If the flag mpi_flag [nuh_layer_id] is 1, whether the texture merge candidate is available or not is stored in the flag availableFlagT, and if available, the motion information is derived.

4. The merge motion candidate list (mergeCandList) constructed so far, which consists only of basic merge motion candidates, and the VSP flag (mergeCandIsVspFlag) for each candidate are reconstructed as follows.

a. numMergeCand is the total number of merge motion candidates, numA1B1B0 is the number of basic merge motion candidates at the left, above, and above-right positions, and numA0B2 is the number of basic merge motion candidates at the bottom-left and above-left positions. numMergeCand, numA1B1B0, and numA0B2 are initialized to 0.

b. It is determined whether the left candidate A1 is available. If the left candidate is available, numA1B1B0 is increased by 1. In addition, whether the left candidate uses View Synthesis Prediction (VSP) is stored as a flag.

c. It is determined whether the above candidate B1 is available. If the above candidate is available, numA1B1B0 is increased by 1. In addition, whether the above candidate uses VSP is stored as a flag.

d. It is determined whether the above-right candidate B0 is available. If the above-right candidate is available, numA1B1B0 is increased by 1. In addition, whether the above-right candidate uses VSP is stored as a flag.

e. It is determined whether the bottom-left candidate A0 is available. If the bottom-left candidate is available, numA0B2 is increased by 1. In addition, whether the bottom-left candidate uses VSP is stored as a flag.

f. It is determined whether the above-left candidate B2 is available. If the above-left candidate is available, numA0B2 is increased by 1. In addition, whether the above-left candidate uses VSP is stored as a flag.

g. If the flag availableFlagT is 1, the following process is performed.

- pruneFlagA1 and pruneFlagB1 are set to 0.

- If the left candidate is available and the motion information of the texture candidate is the same as that of the left candidate, pruneFlagA1 is set to 1.

- If the above candidate is available and the motion information of the texture candidate is the same as that of the above candidate, pruneFlagB1 is set to 1.

- If pruneFlagA1 and pruneFlagB1 are both 0, a new space is created at the numMergeCand position in the list. Here, creating a new space means moving all values from the numMergeCand position onward one position to the right.

- Otherwise, the following two steps are performed.

- numA1B1B0 is decreased by 1.

- If both the left and above candidates are available and pruneFlagA1 is 0, the second entry in the list is set to the value of the first entry.

- The first entry in the list is then set to the texture candidate, and numMergeCand is increased by 1.

h. If the flag availableFlagIvMC is 1, the following process is performed.

- pruneFlagA1, pruneFlagB1, pruneFlagT, and addIvMC are set to 0.

- If the current picture is a texture (DepthFlag = 0), the following two checks are performed.

- If the left candidate is available and the motion information of the inter-view candidate is the same as that of the left candidate, pruneFlagA1 is set to 1.

- If the above candidate is available and the motion information of the inter-view candidate is the same as that of the above candidate, pruneFlagB1 is set to 1.

- If the current picture is a depth map (DepthFlag = 1), the texture candidate is available, and the motion information of the inter-view candidate is the same as that of the texture candidate, pruneFlagT is set to 1.

- If pruneFlagA1, pruneFlagB1, and pruneFlagT are all 0, a new space is created at the numMergeCand position in the list, and addIvMC is set to 1.

- Otherwise, if the current picture is a texture (DepthFlag = 0) and the motion information of the inter-view candidate is the same as that of the left candidate or the above candidate, the following two steps are performed.

- numA1B1B0 is decreased by 1.

- addIvMC is set to 1.

- If addIvMC is 1, the first value in the list is set as a texture candidate and numMergeCand is increased by 1.

i. Add numA1B1B0 to numMergeCand.

j. If the flag availableFlagIvDC is 1, the following process is performed.

- The motion information of the disparity merge candidate (IvDC) is compared with that of the available left and above candidates. If it differs from both the left candidate and the above candidate, a new space is created at the numMergeCand position of the merge list and the disparity merge candidate (IvDC) is added there.

- numMergeCand is increased by 1.

k. If the flag availableFlagVSP is 1, the illumination compensation flag (ic_flag) is 0, the residual prediction weight index (iv_res_pred_weight_idx) is 0, and numMergeCand is smaller than 5 + the number of additional merge candidates (NumExtraMergeCand), the view synthesis merge candidate is added to the list, and numMergeCand3DV and numMergeCand are each increased by 1.

l. Add numA0B2 to numMergeCand.

m. If the flag availableFlagIvMCShift is 1 and numMergeCand is smaller than the maximum length of the list (for example, 6), the following process is performed.

- If the inter-view merge candidate (IvMC) is available, the inter-view merge candidate is compared with the shifted inter-view merge candidate (IvMCShift). If they are different, a new space is created at the numMergeCand position in the list, and the shifted inter-view merge candidate is added there.

- numMergeCand is increased by 1.

n. If the flag availableFlagIvMCShift is 0, the current PU does not belong to a depth map, and numMergeCand is smaller than the maximum length of the list (for example, 6), the following process is performed.

- If the shifted disparity merge candidate (IvDCShift) is available, a new space is created at the numMergeCand position of the list, and the shifted disparity merge candidate is added there.

- numMergeCand is increased by 1.

In steps h and j, when the inter-view merge candidate and the disparity merge candidate are compared with the existing candidates in the list, the complexity may be reduced by comparing them with only some of the candidates rather than with all candidates in the list. For example, only the left candidate and the above candidate may be used for the comparison.
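Step g above (texture candidate with pruning against A1 and B1) can be sketched as follows. This is an illustrative reading of the step, not the text of Table 4: the function name and parameters are hypothetical, plain strings stand in for motion information, equality of strings stands in for "same motion information", and the available A1 and B1 candidates are assumed to occupy the head of the list in that order. numMergeCand is assumed to be 0 on entry, so the new space is created at the front.

```python
def add_texture_candidate(merge_list, tex, left, above, num_a1b1b0):
    """Place the texture candidate first, pruning against A1 (left) and B1 (above).

    left/above are the A1/B1 candidates, or None when unavailable.
    Returns the updated list and the updated numA1B1B0 counter.
    """
    merge_list = list(merge_list)
    prune_a1 = left is not None and tex == left
    prune_b1 = above is not None and tex == above
    if not prune_a1 and not prune_b1:
        # No duplicate: create a new space at the front and shift the rest right.
        merge_list.insert(0, tex)
    else:
        # The texture candidate replaces its duplicate, so one spatial entry is dropped.
        num_a1b1b0 -= 1
        if left is not None and above is not None and not prune_a1:
            # Duplicate is B1: slide A1 into the second entry before overwriting the first.
            merge_list[1] = merge_list[0]
        merge_list[0] = tex
    return merge_list, num_a1b1b0

no_duplicate = add_texture_candidate(["A1", "B1", "B0"], "T", "A1", "B1", 3)
duplicate_of_b1 = add_texture_candidate(["A1", "B1", "B0"], "B1", "A1", "B1", 3)
```

Because only A1 and B1 are inspected, the number of comparisons stays constant regardless of how long the list is.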

Table 5 shows an example of a process of deriving an existing mixed bidirectional prediction candidate, and Table 6 shows an example of reusing a process of deriving a HEVC mixed bidirectional prediction candidate in 3D-HEVC according to an embodiment of the present invention.

Table 5

Table 6

In the process of deriving the mixed bi-prediction candidate according to the embodiment of the present invention described in Table 6, the existing HEVC process is used as it is, without creating a new module only for the dependent view or depth map as shown in Table 5. Therefore, in the method of Table 6, all the procedures in Table 5 are removed.

When the mixed bidirectional prediction candidate derivation process according to the embodiment of the present invention described in Table 6 is implemented, the result shown in Table 7 may be obtained. The video sequences used in the experiments are test video sequences officially used in the JCT-3V standardization.

Table 7 shows the results of the coding efficiency and coding time comparison between the existing method (method of Table 5) and the method proposed in the present invention (method of Table 6).

TABLE 7

As shown in Table 7, even though the HEVC mixed bidirectional prediction candidate derivation process is reused in 3D-HEVC according to an embodiment of the present invention, the comparison shows a bitrate increase of less than 0.1% relative to the conventional method, that is, nearly the same coding efficiency.

The above-described method may be used with High Efficiency Video Coding (HEVC), which is currently being jointly standardized by the Moving Picture Experts Group (MPEG) and the Video Coding Experts Group (VCEG). Accordingly, the application range of the above-described method may vary according to the block size, the coding unit (CU) depth, or the transform unit (TU) depth, as shown in the example of Table 8. The variable that determines the application range (i.e., size or depth information) may be set so that the encoder and decoder use a predetermined value, or so that a value determined by the profile or level is used; alternatively, if the encoder writes the variable value in the bitstream, the decoder may obtain the value from the bitstream and use it. When the application range varies according to the CU depth, as shown in Table 8, there may be Method A, which applies the method only to depths at or above the given depth; Method B, which applies it only to depths at or below the given depth; and Method C, which applies it only to the given depth.

Table 8 shows an example of how the application range of the methods of the present invention is determined when the given CU (or TU) depth is 2. In Table 8, O indicates that the method is applied to that depth, and X indicates that it is not.

Table 8
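The three range rules can be sketched as a small predicate. The inclusive boundaries for Methods A and B, the depth range 0 to 4, and the function name `applies_at_depth` are assumptions made for this illustration, since the contents of Table 8 are not reproduced here.

```python
def applies_at_depth(method, depth, given_depth):
    """Return True when the method is applied at `depth` for the signaled `given_depth`."""
    if method == "A":
        return depth >= given_depth   # given depth and deeper (assumed inclusive)
    if method == "B":
        return depth <= given_depth   # given depth and shallower (assumed inclusive)
    if method == "C":
        return depth == given_depth   # the given depth only
    raise ValueError(f"unknown method: {method}")

# O/X rows for a given depth of 2 over the assumed depth range 0..4.
rows = {
    m: ["O" if applies_at_depth(m, d, 2) else "X" for d in range(5)]
    for m in "ABC"
}
```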

When the methods of the present invention are not applied to any depth, this may be indicated by using an arbitrary flag, or by signaling a value one greater than the maximum CU depth as the CU depth value indicating the application range.

As an additional feature of the present invention, whether the above-described methods of the present invention are applied may be included in the bitstream and signaled. For example, the information on whether the above-described methods of the present invention are applied may be signaled by being included in a syntax of a sequence parameter set (SPS), a picture parameter set (PPS), and a slice header.

Table 9 shows an example of a method of signaling using SPS whether or not the above-described methods of the present invention are applied.

Table 9

Table 10 shows an example of a method of signaling using PPS whether or not the above-described methods of the present invention are applied.

Table 10

Table 11 shows an example of a method of signaling using a slice header whether or not the above-described methods of the present invention are applied.

Table 11

Table 12 shows another example of a method of signaling using a slice header whether or not the above-described methods of the present invention are applied.

Table 12

In Tables 9 to 12, “reuse_enabled_flag” indicates whether or not the above-described methods of the present invention are applied. Here, "reuse_enabled_flag" becomes "1" when the above-described methods of the present invention are applied, and "reuse_enabled_flag" becomes "0" when the above-described methods of the present invention are not applied. The reverse is also possible.

"Reuse_disabled_info" is a syntax that is activated when the above-described methods of the present invention are applied (or "reuse_enabled_flag" is true). This is the depth of a CU (or the size of a CU or the size or sub-block of a CU). Whether or not the above-described methods of the present invention are applied according to the size of the macro block or the size of the block).

For example, when "reuse_disabled_info" is "0", the above-described methods may be applied to all block sizes. When "reuse_disabled_info" is "1", the above-described methods may be applied only to a unit larger than a 4x4 block size.

As another example, when "reuse_disabled_info" is "2", the above-described methods may be applied only to a unit larger than 8x8 block size. Or vice versa. For example, when "reuse_disabled_info" is "1", the above-described methods may be applied only to a unit smaller than a 4x4 block size. The usage of the “reuse_disabled_info” syntax can be variously applied.
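The first set of examples (0 for all sizes, 1 for larger than 4x4, 2 for larger than 8x8) can be sketched as a lookup. The function name, the `_THRESHOLDS` table, and the choice to interpret "larger than NxN" as a larger block area are assumptions for illustration; the reversed interpretation mentioned above is not covered.

```python
# Hypothetical mapping taken from the examples in the text:
# reuse_disabled_info value -> block edge N of the largest excluded NxN size.
_THRESHOLDS = {0: None, 1: 4, 2: 8}

def method_enabled(reuse_enabled_flag, reuse_disabled_info, width, height):
    """Decide whether the methods apply to a width x height block."""
    if not reuse_enabled_flag:
        return False
    limit = _THRESHOLDS[reuse_disabled_info]
    if limit is None:
        return True                       # value 0: applied to all block sizes
    # Assumption: "larger than NxN" is read as a strictly larger area.
    return width * height > limit * limit

on_8x8_with_info_1 = method_enabled(1, 1, 8, 8)
on_8x8_with_info_2 = method_enabled(1, 2, 8, 8)
```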

Using the above method, it is possible to determine whether to apply in units of pictures (or frames). In addition, the method of the present specification may be applied only to a P picture (or a frame), and the method of the present specification may be applied only to a B picture (or a frame).

The above-described methods of the present invention can be applied not only to the 3D video codec but also to the scalable video codec. For example, the encoding/decoding module used in the base layer of the scalable video codec may be applied to the enhancement layer as is, and the enhancement layer may then be encoded/decoded using an additional partial encoding/decoding module. As another example, the basic merge motion list construction module used in the base layer of the scalable video codec may be applied to the enhancement layer as is to construct a "basic merge motion candidate list", and the additional merge motion list construction module may then be used to reconfigure (change) the "basic merge motion candidate list" into an "extended merge motion candidate list" for the enhancement layer.

The method of FIG. 18 may be performed by the apparatus shown in FIGS. 10 and 12 described above, or may be applied to 3D-HEVC. For convenience of explanation, the method of FIG. 18 is described as being performed by the merge motion device.

Referring to FIG. 18, the merge motion apparatus derives a basic merge motion candidate for the current PU and constructs a merge motion candidate list based on the derived basic merge motion candidate (S1800).

As described above, the basic merge motion candidate may include a spatial merge motion candidate and a temporal merge motion candidate for the current PU.

For example, as shown in FIG. 8, the merge motion device may derive a spatial merge motion candidate from at least one of a left block, an above block, an above-right block, a bottom-left block, and an above-left block located spatially adjacent to the current PU. The merge motion device may also derive a temporal merge motion candidate from a co-located block (e.g., a bottom-right block or a center block) in a co-located picture for the current PU.

As described above, the merge motion apparatus may configure the merge motion candidate list based on availability of the spatial merge motion candidate and the temporal merge motion candidate.

If the current PU is a depth map or a dependent view, the merge motion device derives an extended merge motion candidate for the current PU (S1810).

The extended merge motion candidate refers to a merge motion candidate used for prediction of the dependent view image or the depth map image. The extended merge motion candidate may include at least one of an inter-view merge candidate (IvMC), a view synthesis prediction merge candidate, and a texture merge candidate.

For example, an inter-view disparity merge candidate (IvDC), a shifted inter-view merge candidate (IvMCShift), and a shifted inter-view disparity merge candidate (IvDCShift) may be derived depending on whether the current PU performs inter-view prediction. A view synthesis merge candidate may be derived depending on whether the current PU performs view synthesis prediction. A texture merge candidate may be derived depending on whether the current PU, belonging to a depth map, performs motion prediction from a texture block.

The merge motion apparatus may finally reconstruct the merge motion candidate list by adding the derived extended merge motion candidate to the merge motion candidate list (S1820).

If the extended merge motion candidate to be added is not the same as the default merge motion candidate in the merge motion list, the merge motion device adds the extended merge motion candidate to the merge motion candidate list. The extended merge motion candidate may be added at any position in the merge motion candidate list (eg, the first item in the list).

In addition, when the sum of the number of extended merge motion candidates added to the merge motion candidate list and the number of basic merge motion candidates is smaller than the maximum number of candidates of the merge motion candidate list, the merge motion device adds the extended merge motion candidate to the merge motion candidate list.

For example, when the depth map of the current PU performs motion prediction from a texture block, a texture merge candidate may be derived. In this case, when the derived texture merge candidate is not the same as the default merge motion candidate in the merge motion list, the texture merge candidate may be added to the first item in the merge motion list.

If the current PU performs inter-view prediction, an inter-view merge candidate may be derived. In this case, when the derived inter-view merge candidate is not the same as a basic merge motion candidate in the merge motion candidate list, the inter-view merge candidate may be added as the first item in the merge motion candidate list.

When the current PU performs view synthesis prediction, a view synthesis merge candidate may be derived. In this case, when the sum of the number of extended merge motion candidates added to the merge motion candidate list and the number of basic merge motion candidates is smaller than the maximum number of candidates of the merge motion candidate list, the derived view synthesis merge candidate may be added to the merge motion candidate list.
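The three cases above can be sketched as a single list-update routine. This is an illustrative Python sketch under stated assumptions, not the method as claimed: `MotionCandidate`, `add_extended_candidates`, and `max_candidates` are hypothetical names; two candidates are treated as "the same" when their motion vector and reference index match; the view synthesis candidate is appended at the end of the list; and the final truncation to the maximum list size is an assumption that the text does not state.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class MotionCandidate:
    """Hypothetical motion information: motion vector plus reference index."""
    mv: Tuple[int, int]
    ref_idx: int


def add_extended_candidates(
    basic: List[MotionCandidate],
    max_candidates: int,
    texture: Optional[MotionCandidate] = None,
    inter_view: Optional[MotionCandidate] = None,
    view_synthesis: Optional[MotionCandidate] = None,
) -> List[MotionCandidate]:
    """Return a new merge list: basic candidates plus admitted extended ones."""
    merge_list = list(basic)
    num_extended = 0

    # Texture and inter-view candidates go to the first position,
    # unless they are identical to one of the basic candidates.
    for cand in (texture, inter_view):
        if cand is not None and cand not in basic:
            merge_list.insert(0, cand)
            num_extended += 1

    # The view synthesis candidate is added only while
    # (extended count + basic count) is below the maximum list size.
    if view_synthesis is not None and num_extended + len(basic) < max_candidates:
        merge_list.append(view_synthesis)  # position is an assumption
        num_extended += 1

    # Assumption: never return more than max_candidates entries.
    return merge_list[:max_candidates]
```

Because the duplicate check compares against the basic candidates only, a texture candidate and an inter-view candidate with identical motion would both be inserted; a stricter check against the whole list would be a design choice outside what the text describes.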

Because the process of adding the extended merge motion candidate to the merge motion candidate list has been described in detail in the embodiments of the present specification, it is not repeated here.

The motion information on the current PU may be obtained based on the merge motion candidate list described above, and the prediction sample value of the current PU may be obtained by performing prediction on the current PU using the motion information.
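As an illustration of how motion information taken from the merge motion candidate list yields prediction samples, the sketch below selects a candidate by index and copies the displaced block from a reference picture. It is a simplified Python sketch: the names `merge_idx`, `predict_block`, and `predict_from_merge_list` are hypothetical, and integer-sample motion vectors, a single reference picture, and the absence of boundary padding or interpolation are all assumptions made for brevity.

```python
from typing import List, Tuple

Picture = List[List[int]]  # rows of sample values


def predict_block(ref: Picture, x: int, y: int, w: int, h: int,
                  mv: Tuple[int, int]) -> Picture:
    """Copy the w x h block at (x + mv_x, y + mv_y) from the reference picture.

    Assumes integer-sample motion and that the displaced block lies fully
    inside the reference picture (no padding, no interpolation).
    """
    mv_x, mv_y = mv
    return [row[x + mv_x: x + mv_x + w]
            for row in ref[y + mv_y: y + mv_y + h]]


def predict_from_merge_list(ref: Picture,
                            merge_list: List[Tuple[int, int]],
                            merge_idx: int,
                            x: int, y: int, w: int, h: int) -> Picture:
    """Take the motion vector at merge_idx and predict the current PU with it."""
    mv = merge_list[merge_idx]
    return predict_block(ref, x, y, w, h, mv)
```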

Accordingly, the encoder may obtain a residual sample value of the current PU based on the predicted sample value of the current PU, transform, quantize, and entropy encode the residual sample value and transmit the same to the decoder. The decoder may obtain a reconstructed sample value of the current PU based on the predicted sample value of the current PU and the residual sample value of the current PU transmitted by the encoder.
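The relationship between predicted, residual, and reconstructed sample values can be sketched as follows. This Python sketch is a simplification under stated assumptions: the transform and entropy coding stages are omitted, quantization is a plain scalar quantizer with a hypothetical step size `qstep`, and samples are clipped to an assumed 8-bit range.

```python
from typing import List


def encode_residual(original: List[int], predicted: List[int],
                    qstep: int) -> List[int]:
    """Encoder side: residual = original - prediction, then scalar quantization.

    The transform and entropy coding stages are omitted in this sketch.
    """
    return [round((o - p) / qstep) for o, p in zip(original, predicted)]


def reconstruct(predicted: List[int], levels: List[int], qstep: int,
                max_value: int = 255) -> List[int]:
    """Decoder side: reconstruction = prediction + dequantized residual, clipped."""
    return [min(max(p + lvl * qstep, 0), max_value)
            for p, lvl in zip(predicted, levels)]
```

With `qstep` equal to 1 the reconstruction matches the original samples exactly; larger step sizes trade reconstruction error for fewer residual levels to transmit.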

In the above embodiments, the methods are described based on a flowchart as a series of steps or blocks, but the present invention is not limited to the order of the steps, and certain steps may occur in a different order from, or at the same time as, other steps described above. Also, one of ordinary skill in the art will understand that the steps shown in the flowcharts are not exclusive, that other steps may be included, or that one or more steps in the flowcharts may be deleted without affecting the scope of the present invention.

The above description is merely illustrative of the technical idea of the present invention, and those skilled in the art to which the present invention pertains may make various modifications and changes without departing from the essential characteristics of the present invention. Therefore, the embodiments disclosed in the present invention are not intended to limit the technical idea of the present invention but to describe the present invention, and the scope of the technical idea of the present invention is not limited by these embodiments. The protection scope of the present invention should be interpreted by the claims, and all technical ideas within the equivalent scope should be interpreted as being included in the scope of the present invention.

Claims

A method of decoding a video including a plurality of views, the method comprising:

constructing a merge motion candidate list by deriving a basic merge motion candidate for a current prediction unit (PU);

deriving an extended merge motion candidate for the current PU when the current PU is a depth map or a dependent view; and

adding the extended merge motion candidate to the merge motion candidate list,

wherein, in the step of adding the extended merge motion candidate,

the extended merge motion candidate is added to the merge motion candidate list if the extended merge motion candidate is not the same as the basic merge motion candidate in the merge motion candidate list.
The method of claim 1,

wherein, in the step of adding the extended merge motion candidate,

the extended merge motion candidate is added at an arbitrary position in the merge motion candidate list.
The method of claim 1,

wherein the basic merge motion candidate includes

at least one of a spatial merge motion candidate and a temporal merge motion candidate of the current PU,

the spatial merge motion candidate includes

at least one of a left block, an above block, an above-right block, a bottom-left block, and an above-left block located spatially adjacent to the current PU, and

the temporal merge motion candidate includes

a co-located block in a co-located picture for the current PU.
The method of claim 1,

wherein the extended merge motion candidate includes

at least one of an inter-view merge candidate (IvMC), a view synthesis prediction merge candidate, and a texture merge candidate.
The method of claim 4, wherein

in the step of deriving the extended merge motion candidate,

the inter-view merge candidate is derived according to whether the current PU performs inter-view prediction.
The method of claim 4, wherein

in the step of deriving the extended merge motion candidate,

the view synthesis merge candidate is derived according to whether the current PU performs view synthesis prediction.
The method of claim 4, wherein

in the step of deriving the extended merge motion candidate,

the texture merge candidate is derived according to whether the depth map of the current PU performs motion prediction from a texture block.
The method of claim 2,

wherein the arbitrary position is the first index in the merge motion candidate list.
The method of claim 1,

wherein, in the step of adding the extended merge motion candidate,

the extended merge motion candidate is added to the merge motion candidate list if the sum of the number of extended merge motion candidates added to the merge motion candidate list and the number of basic merge motion candidates is smaller than the maximum number of candidates in the merge motion candidate list.
The method of claim 7, wherein

when the depth map of the current PU performs motion prediction from a texture block,

in the step of adding the extended merge motion candidate,

the texture merge candidate is added as the first item in the merge motion candidate list if the texture merge candidate is not the same as the basic merge motion candidate in the merge motion candidate list.
The method of claim 5,

wherein, if the current PU performs inter-view prediction,

in the step of adding the extended merge motion candidate,

the inter-view merge candidate is added as the first item in the merge motion candidate list if the inter-view merge candidate is not the same as the basic merge motion candidate in the merge motion candidate list.
The method of claim 6,

wherein, if the current PU performs view synthesis prediction,

in the step of adding the extended merge motion candidate,

the view synthesis merge candidate is added to the merge motion candidate list if the sum of the number of extended merge motion candidates added to the merge motion candidate list and the number of basic merge motion candidates is smaller than the maximum number of candidates of the merge motion candidate list.
An apparatus for decoding a video including a plurality of views, the apparatus comprising:

a basic merge motion list constructing module for constructing a merge motion candidate list by deriving a basic merge motion candidate for a current PU; and

an additional merge motion list constructing module for deriving an extended merge motion candidate for the current PU and adding the extended merge motion candidate to the merge motion candidate list when the current PU is a depth map or a dependent view,

wherein the additional merge motion list constructing module

adds the extended merge motion candidate to the merge motion candidate list if the extended merge motion candidate is not the same as the basic merge motion candidate in the merge motion candidate list.
A method of encoding a video including a plurality of views, the method comprising:

constructing a merge motion candidate list by deriving a basic merge motion candidate for a current prediction unit (PU);

deriving an extended merge motion candidate for the current PU when the current PU is a depth map or a dependent view; and

adding the extended merge motion candidate to the merge motion candidate list,

wherein, in the step of adding the extended merge motion candidate,

the extended merge motion candidate is added to the merge motion candidate list if the extended merge motion candidate is not the same as the basic merge motion candidate in the merge motion candidate list.
An apparatus for encoding a video including a plurality of views, the apparatus comprising:

a basic merge motion list constructing module for constructing a merge motion candidate list by deriving a basic merge motion candidate for a current PU; and

an additional merge motion list constructing module for deriving an extended merge motion candidate for the current PU and adding the extended merge motion candidate to the merge motion candidate list when the current PU is a depth map or a dependent view,

wherein the additional merge motion list constructing module

adds the extended merge motion candidate to the merge motion candidate list if the extended merge motion candidate is not the same as the basic merge motion candidate in the merge motion candidate list.