
WO2013072965A1 - Video processing device and video processing method - Google Patents


Info

Publication number
WO2013072965A1
Authority
WO
WIPO (PCT)
Prior art keywords
motion vector
reliability
depth
frame
generated
Prior art date
Application number
PCT/JP2011/006403
Other languages
French (fr)
Japanese (ja)
Inventor
仁尾 寛
Original Assignee
Panasonic Corporation (パナソニック株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Panasonic Corporation
Priority to PCT/JP2011/006403
Publication of WO2013072965A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/285 Analysis of motion using a sequence of stereo image pairs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/111 Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/139 Format conversion, e.g. of frame-rate or size
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G06T2207/10021 Stereoscopic video; Stereoscopic image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N2013/0074 Stereoscopic image analysis
    • H04N2013/0085 Motion estimation from stereoscopic image signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N2213/00 Details of stereoscopic systems
    • H04N2213/003 Aspects relating to the "2D+depth" image format

Definitions

  • the present invention relates to a video processing apparatus and a video processing method.
  • a frame rate conversion process is performed on a video signal for displaying a video (for example, Patent Document 1).
  • a motion vector is detected from image data of two frames that are continuous on the time axis, and image data of an interpolation frame is generated using the detected motion vector.
  • a display device that can present a stereoscopic video (three-dimensional video) to the user is also known.
  • a video to be viewed with the left eye (hereinafter referred to as a left-eye video) and a video to be viewed with the right eye (hereinafter referred to as a right-eye video) are displayed, so that a stereoscopic (three-dimensional) video is presented to the user.
  • the frame rate conversion process is performed on each of the video signal for displaying the left-eye video and the video signal for displaying the right-eye video.
  • the above frame rate conversion process may not be able to generate image data of an interpolation frame with high accuracy.
  • for example, when a moving image and a telop (superimposed text such as characters) are displayed in an overlapping manner, it is difficult to distinguish, from the image data alone, between parts of the moving image and the telop. The motion vector is therefore not accurately detected, and appropriate image data of the interpolation frame cannot be generated. As a result, the quality of the presented 3D video is degraded.
  • An object of the present invention is to provide a video processing apparatus and a video processing method capable of performing frame rate conversion of a 3D video without degrading the quality of the 3D video.
  • a video processing apparatus according to one aspect of the present invention performs processing for frame rate conversion of a three-dimensional video composed of a plurality of frames, and includes: a first motion vector detection unit configured to detect a first motion vector from a first video signal including a plurality of frames of first image data for displaying one of a left-eye and a right-eye two-dimensional video; a first reliability generation unit configured to generate first reliability information representing the reliability of the first motion vector detected by the first motion vector detection unit; a second motion vector detection unit configured to detect a second motion vector from a depth signal including depth data of a plurality of frames representing presentation positions of the 3D video in the depth direction; a second reliability generation unit configured to generate second reliability information representing the reliability of the second motion vector detected by the second motion vector detection unit; and a motion vector generation unit configured to generate a third motion vector using the first motion vector detected by the first motion vector detection unit and the second motion vector detected by the second motion vector detection unit, based on the first reliability information generated by the first reliability generation unit and the second reliability information generated by the second reliability generation unit.
  • FIG. 1 is a block diagram showing a configuration of a video processing apparatus according to the present embodiment.
  • FIG. 2 is a diagram showing one frame of a left-eye image and a right-eye image
  • FIG. 3 shows a stereoscopic image.
  • FIG. 4 is a diagram for explaining the depth data.
  • FIG. 5 is a diagram showing the movement of a stereoscopic image.
  • FIG. 6 is a diagram for explaining a method of calculating the reliability of the left eye motion vector
  • FIG. 7 is a diagram for explaining a method of calculating the reliability of the depth motion vector.
  • FIG. 8 is a flowchart showing the operation of the video processing apparatus.
  • FIG. 9 is a block diagram showing a configuration of a video processing apparatus according to another embodiment of the present invention.
  • FIG. 1 is a block diagram showing the configuration of the video processing device according to the present embodiment.
  • the video processing apparatus 100 includes a signal conversion unit 1, a left eye motion analysis unit 2, a depth motion analysis unit 3, a comparison unit 4, a selection unit 5, a frame interpolation unit 6, and a signal conversion unit 7.
  • the video processing apparatus 100 is provided with a left-eye video signal LSa and a right-eye video signal RSa for presenting a stereoscopic video (3D video).
  • the left-eye video signal LSa includes image data (hereinafter referred to as left-eye image data) for displaying each frame of a two-dimensional video (hereinafter referred to as a left-eye video) to be viewed with the left eye.
  • the right-eye video signal RSa includes image data (hereinafter referred to as right-eye image data) for displaying each frame of a two-dimensional video (hereinafter referred to as right-eye video) to be viewed with the right eye.
  • the left eye image and the right eye image are images of a common subject observed from the left eye viewpoint and the right eye viewpoint.
  • One frame of stereoscopic video is composed of one frame of right-eye video and one frame of left-eye video.
  • the left-eye video signal LSa and the right-eye video signal RSa are input to the signal conversion unit 1.
  • the signal converter 1 gives the input left-eye video signal LSa to the left-eye motion analyzer 2 and the frame interpolator 6. Further, the signal conversion unit 1 generates a depth signal DSa based on the input left-eye video signal LSa and right-eye video signal RSa, and provides the generated depth signal DSa to the depth motion analysis unit 3 and the frame interpolation unit 6.
  • the depth signal DSa includes depth data corresponding to each frame of the stereoscopic video. Details of the depth data will be described later.
  • the left-eye motion analysis unit 2 detects a motion vector (hereinafter referred to as a left-eye motion vector) from the left-eye video data of the left-eye video signal LSa given from the signal conversion unit 1, and uses the detected left-eye motion vector as a vector signal LVS. This is given to the selector 5. Further, the left eye motion analysis unit 2 calculates a left eye motion vector reliability value representing the reliability of the detected left eye motion vector, and gives the calculated left eye motion vector reliability value to the comparison unit 4 as a reliability signal LRS.
  • the depth motion analysis unit 3 detects a motion vector (hereinafter referred to as a depth motion vector) from the depth data of the depth signal DSa given from the signal conversion unit 1, and sends the detected depth motion vector to the selection unit 5 as a vector signal DVS. give. Further, the depth motion analysis unit 3 calculates a depth motion vector reliability value representing the reliability of the detected depth motion vector, and gives the calculated depth motion vector reliability value to the comparison unit 4 as the reliability signal DRS.
  • the comparison unit 4 compares the reliability of the left eye motion vector and the depth motion vector based on the reliability signal LRS given from the left eye motion analysis unit 2 and the reliability signal DRS given from the depth motion analysis unit 3.
  • the selection unit 5 selects one of the vector signal LVS given from the left eye motion analysis unit 2 and the vector signal DVS given from the depth motion analysis unit 3 based on the comparison result by the comparison unit 4.
  • the frame interpolation unit 6 performs frame interpolation processing of the left-eye video signal LSa and the depth signal DSa given from the signal conversion unit 1 based on the vector signal LVS or the vector signal DVS selected by the selection unit 5.
  • the signal conversion unit 7 outputs the left-eye video signal LSb after the frame interpolation process, generates the right-eye video signal RSb based on the left-eye video signal LSb and the depth signal DSb after the frame interpolation process, and outputs the generated right-eye video signal RSb.
  • the left-eye video signal LSb and the right-eye video signal RSb output from the signal conversion unit 7 are given to an external device such as a display device, for example.
  • the signal conversion unit 1, the left-eye motion analysis unit 2, the depth motion analysis unit 3, the comparison unit 4, the selection unit 5, the frame interpolation unit 6, and the signal conversion unit 7 are realized by, for example, hardware such as a CPU (Central Processing Unit) and a memory, and software such as a computer program.
  • the signal conversion unit 1, the left eye motion analysis unit 2, the depth motion analysis unit 3, the comparison unit 4, the selection unit 5, the frame interpolation unit 6, and the signal conversion unit 7 correspond to computer program modules.
  • the CPU executes the computer program stored in the memory
  • a part or all of these units may be realized by hardware such as an ASIC (Application Specific Integrated Circuit).
  • FIG. 2A is a diagram showing one frame of a left-eye image displayed based on left-eye image data
  • FIG. 2B is a diagram showing one frame of a right-eye image displayed based on right-eye image data.
  • the frame in FIG. 2A and the frame in FIG. 2B correspond to the same position on the time axis.
  • FIG. 3A and FIG. 3B are a schematic plan view and a schematic side view showing a stereoscopic image presented by the left-eye image in FIG. 2A and the right-eye image in FIG. 2B.
  • the subject is two spheres B1 and B2.
  • the display screen DP is arranged in front of the user U.
  • the left-eye image in FIG. 2A and the right-eye image in FIG. 2B are displayed on the display screen DP.
  • the user U sees the left eye image with the left eye and the right eye image with the right eye.
  • stereoscopic images of the spheres B1 and B2 are presented to the user U.
  • the stereoscopic video of the sphere B1 is presented at a position closer to the user U than the display screen DP in the direction perpendicular to the display screen DP (hereinafter referred to as the depth direction), and the stereoscopic video of the sphere B2 is presented at a position farther from the user U than the display screen DP.
  • FIGS. 4(a) and 4(b) are a schematic plan view and a schematic side view for explaining the depth data.
  • a black dot indicates a stereoscopic image element (hereinafter referred to as a pixel element) PE represented by one pixel of the left-eye image and one pixel of the right-eye image.
  • a set of pixel elements PE forms a stereoscopic image of the spheres B1 and B2.
  • the depth data includes, for each pixel element PE, the distance (hereinafter referred to as a depth distance) DD between the pixel element PE and a predetermined reference position in the depth direction.
  • the predetermined reference position is a position on a plane where the display screen DP is arranged.
  • when the pixel element PE is closer to the user U than the reference position, the depth distance is a positive value; when it is farther from the user U, the depth distance is a negative value.
  • accordingly, the depth distance of the pixel elements PE constituting the stereoscopic image of the sphere B1 is a positive value, and the depth distance of the pixel elements PE constituting the stereoscopic image of the sphere B2 is a negative value.
  • the signal converter 1 (FIG. 1) generates depth data corresponding to each frame of the stereoscopic video based on the left-eye video signal LSa and the right-eye video signal RSa, and performs depth motion analysis using the generated depth data as the depth signal DSa. This is given to the unit 3 (FIG. 1) and the frame interpolation unit 6 (FIG. 1).
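The description leaves the depth-data generation in the signal conversion unit 1 open; a common way to derive depth from a stereo pair is a per-pixel disparity search between the left-eye and right-eye images. The sketch below (Python, hypothetical function names, single 1-D scanlines instead of full 2-D frames for brevity) illustrates the idea: the signed horizontal shift that best matches each pixel serves as its depth value.

```python
def estimate_disparity(left, right, max_d=4, win=1):
    """For each pixel of a left-eye scanline, find the horizontal shift
    into the right-eye scanline that minimizes the absolute difference
    over a small window; the signed shift stands in for depth data."""
    n = len(left)
    disp = [0] * n
    for x in range(n):
        best_cost, best_d = None, 0
        for d in range(-max_d, max_d + 1):
            lo, hi = x - win, x + win + 1
            # skip shifts that would read outside either scanline
            if lo < 0 or hi > n or lo + d < 0 or hi + d > n:
                continue
            cost = sum(abs(left[i] - right[i + d]) for i in range(lo, hi))
            if best_cost is None or cost < best_cost:
                best_cost, best_d = cost, d
        disp[x] = best_d
    return disp
```

For a bright patch shifted two pixels to the right between the two views, the search recovers a disparity of 2 at the patch; a real implementation would work on 2-D blocks and convert disparity to a signed depth distance.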
  • FIGS. 5 (a) and 5 (b) are diagrams illustrating the motion of the stereoscopic video of the sphere B1.
  • the stereoscopic image of the sphere B1 moves substantially parallel to the display screen DP.
  • the depth motion analysis unit 3 (FIG. 1) detects the moving direction and moving distance of such a stereoscopic image as a depth motion vector based on the depth signal DSa from the signal conversion unit 1 (FIG. 1).
  • the depth distance DD of each pixel element PE constituting the sphere B1 before the movement is substantially equal to the depth distance DD of the corresponding pixel element PE after the movement. Therefore, by detecting, from the depth data of the frame representing the sphere B1 before the movement and the depth data of the frame representing it after the movement, the positions of pixel elements PE having substantially the same depth distance, the moving direction and the moving distance of the sphere B1 can be detected as a depth motion vector.
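As a concrete illustration of this matching, the following sketch (hypothetical helper name, plain Python lists standing in for depth data) finds the shift that best aligns a block of the previous frame's depth data with the subsequent frame's depth data, using a sum-of-absolute-differences cost:

```python
def depth_block_shift(prev, nxt, y, x, bh=2, bw=2, search=2):
    """Block matching on depth data: return the (dy, dx) shift within
    +/-search that minimizes the sum of absolute differences between a
    bh-by-bw block of the previous frame at (y, x) and the subsequent
    frame; the best shift is the depth motion vector for that block."""
    h, w = len(prev), len(prev[0])
    best_cost, best_shift = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # skip shifts that would move the block outside the frame
            if y + dy < 0 or x + dx < 0 or y + dy + bh > h or x + dx + bw > w:
                continue
            cost = sum(abs(prev[y + i][x + j] - nxt[y + dy + i][x + dx + j])
                       for i in range(bh) for j in range(bw))
            if best_cost is None or cost < best_cost:
                best_cost, best_shift = cost, (dy, dx)
    return best_shift
```

Because pixel elements keep substantially the same depth distance as they move, a block of constant depth values lines up exactly at the true shift, driving the cost to zero there.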
  • FIG. 6 is a diagram for explaining a method of calculating the reliability of the left-eye motion vector, and FIG. 7 is a diagram for explaining a method of calculating the reliability of the depth motion vector.
  • the previous frame and the subsequent frame refer to two frames that are continuous on the time axis in the stereoscopic video before frame interpolation.
  • An interpolation frame is a frame to be interpolated between the previous frame and the subsequent frame.
  • the left-eye motion analysis unit 2 detects a left-eye motion vector from the left-eye image data corresponding to the previous frame and the subsequent frame, for example, by a block matching method or a gradient method. As shown in FIG. 6, the left-eye motion analysis unit 2 uses the detected left-eye motion vector to generate left-eye image data corresponding to the interpolation frame (hereinafter referred to as front left-eye image data) from the left-eye image data corresponding to the previous frame, and to generate left-eye image data corresponding to the interpolation frame (hereinafter referred to as rear left-eye image data) from the left-eye image data corresponding to the subsequent frame.
  • the left eye motion analysis unit 2 calculates a difference between the front left eye image data and the rear left eye image data for each pixel, converts each calculated difference into an absolute value, and sums up the absolute values.
  • the total value is the left-eye motion vector reliability value representing the reliability of the left-eye motion vector. The smaller the left-eye motion vector reliability value, the higher the reliability of the left-eye motion vector.
  • the left eye motion analysis unit 2 gives the calculated left eye motion vector reliability value to the comparison unit 4 (FIG. 1) as the reliability signal LRS.
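The reliability calculation above can be sketched as follows (1-D frames and a circular shift for brevity; the function name and decomposition are illustrative, not from the patent). The previous frame is projected forward along half the motion vector to form the front data, the subsequent frame is projected backward along the remaining half to form the rear data, and the per-pixel absolute differences are summed:

```python
def reliability_value(prev, nxt, v):
    """Sum of absolute differences between the front data (previous
    frame projected forward by half the vector) and the rear data
    (subsequent frame projected backward by the remaining half).
    Smaller total -> higher reliability of the motion vector v."""
    n = len(prev)
    half = v // 2
    front = [prev[(i - half) % n] for i in range(n)]      # front data
    rear = [nxt[(i + (v - half)) % n] for i in range(n)]  # rear data
    return sum(abs(a - b) for a, b in zip(front, rear))
```

A correct vector makes the two projections coincide, giving a value of 0, while a wrong vector leaves a residual, which is why the smaller value indicates the more trustworthy vector.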
  • the depth motion analysis unit 3 detects the depth motion vector from the depth data corresponding to the previous frame and the rear frame as described above. As shown in FIG. 7, the depth motion analysis unit 3 generates depth data corresponding to the interpolation frame (hereinafter referred to as forward depth data) from the depth data corresponding to the previous frame, using the detected depth motion vector. Then, depth data corresponding to the interpolation frame (hereinafter referred to as rear depth data) is generated from the depth data corresponding to the subsequent frame.
  • the depth motion analysis unit 3 calculates a difference between the front depth data and the rear depth data for each pixel element PE, converts each calculated difference into an absolute value, and sums up the absolute values.
  • the total value is the depth motion vector reliability value representing the reliability of the depth motion vector. The smaller the depth motion vector reliability value, the higher the reliability of the depth motion vector.
  • the depth motion analysis unit 3 gives the calculated depth motion vector reliability value to the comparison unit 4 (FIG. 1) as the reliability signal DRS.
  • in the present embodiment, the left-eye motion vector reliability value is calculated by a calculation including the difference between the front left-eye image data and the rear left-eye image data, and the depth motion vector reliability value is calculated by a calculation including the difference between the front depth data and the rear depth data.
  • alternatively, the left-eye motion vector reliability value may be calculated by a calculation including a ratio between the front left-eye image data and the rear left-eye image data, and the depth motion vector reliability value may be calculated by a calculation including a ratio between the front depth data and the rear depth data.
  • the comparison unit 4 performs a comparison process that compares the reliability of the left-eye motion vector and the depth motion vector based on the reliability signal LRS from the left-eye motion analysis unit 2 and the reliability signal DRS from the depth motion analysis unit 3. Specifically, the comparison unit 4 multiplies at least one of the left-eye motion vector reliability value indicated by the reliability signal LRS and the depth motion vector reliability value indicated by the reliability signal DRS by a predetermined coefficient (gain), and compares the resulting values.
  • when the reliability of the depth motion vector is higher, the comparison unit 4 gives a depth selection signal to the selection unit 5 (FIG. 1).
  • in response to the depth selection signal, the selection unit 5 selects the vector signal DVS from among the vector signal LVS from the left-eye motion analysis unit 2 and the vector signal DVS from the depth motion analysis unit 3, and gives it to the frame interpolation unit 6.
  • when the reliability of the left-eye motion vector is higher, the comparison unit 4 gives a left-eye selection signal to the selection unit 5.
  • in response to the left-eye selection signal, the selection unit 5 selects the vector signal LVS from among the vector signal LVS from the left-eye motion analysis unit 2 and the vector signal DVS from the depth motion analysis unit 3, and gives it to the frame interpolation unit 6.
  • the comparison process may be performed, for example, for each predetermined number of frames, may be performed every predetermined time, or may be performed according to an instruction from the user.
  • the depth selection signal or the left eye selection signal is continuously output from the comparison unit 4 to the selection unit 5 according to the result of the previous comparison process.
  • while the comparison process is not performed, the calculation of the left-eye motion vector reliability value by the left-eye motion analysis unit 2 and the calculation of the depth motion vector reliability value by the depth motion analysis unit 3 may be omitted.
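Under the convention above (a smaller reliability value means a more reliable vector), the comparison and selection can be sketched as follows; the gain parameter corresponds to the predetermined coefficient mentioned earlier, and the function name and single-gain parameterization are assumptions for illustration:

```python
def select_motion_vector(lv, dv, lrs, drs, gain=1.0):
    """Sketch of the comparison unit 4 and selection unit 5: scale one
    reliability value by a predetermined coefficient (gain), then pick
    the vector whose (scaled) reliability value is smaller, i.e. whose
    reliability is higher."""
    return dv if drs * gain < lrs else lv
```

Raising the gain above 1.0 biases the choice toward the left-eye motion vector; lowering it biases the choice toward the depth motion vector.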
  • when the vector signal LVS is given, the frame interpolation unit 6 uses the left-eye motion vector indicated by the vector signal LVS to perform frame interpolation processing of the left-eye video signal LSa and the depth signal DSa.
  • the frame interpolation unit 6 uses the left eye motion vector to generate left eye image data and depth data corresponding to the interpolation frame from at least one of the previous frame and the subsequent frame (see FIGS. 6 and 7).
  • when the vector signal DVS is given, the frame interpolation unit 6 uses the depth motion vector indicated by the vector signal DVS to perform frame interpolation processing of the left-eye video signal LSa and the depth signal DSa. Specifically, the frame interpolation unit 6 uses the depth motion vector to generate left-eye image data and depth data corresponding to the interpolation frame from at least one of the previous frame and the subsequent frame (see FIGS. 6 and 7).
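A minimal sketch of this interpolation (1-D data, circular shift, hypothetical name): the previous frame is projected forward by half the selected vector, the subsequent frame backward by the remaining half, and the two projections are averaged to form the interpolation frame:

```python
def interpolate_frame(prev, nxt, v):
    """Motion-compensated interpolation: average the previous frame
    projected forward by half the vector v with the subsequent frame
    projected backward by the rest, yielding the interpolation frame."""
    n = len(prev)
    half = v // 2
    front = [prev[(i - half) % n] for i in range(n)]      # from previous frame
    rear = [nxt[(i + (v - half)) % n] for i in range(n)]  # from subsequent frame
    return [(a + b) / 2 for a, b in zip(front, rear)]
```

Averaging the two projections means that even when one frame is slightly off, the interpolated frame still places the moving object at the midway position.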
  • the signal conversion unit 7 (FIG. 1) outputs the left-eye video signal LSb after the frame interpolation process, generates the right-eye video signal RSb based on the left-eye video signal LSb and the depth signal DSb after the frame interpolation process, and outputs the generated right-eye video signal RSb.
  • a display device (not shown) displays a left-eye video based on the left-eye video signal LSb output from the signal conversion unit 7 and a right-eye video based on the right-eye video signal RSb. Thereby, the 3D image after the frame rate conversion is presented to the user.
  • FIG. 8 is a flowchart showing the operation of the video processing device 100. As shown in FIG. 8, first, the signal conversion unit 1 generates a depth signal DSa from the input left-eye video signal LSa and right-eye video signal RSa (step S1).
  • the left eye motion analysis unit 2 detects a left eye motion vector from the left eye video signal LSa (step S2), and calculates a left eye motion vector reliability value representing the reliability of the left eye motion vector (step S3).
  • the depth motion analysis unit 3 detects a depth motion vector from the depth signal DSa (step S4), and calculates a depth motion vector reliability value representing the reliability of the depth motion vector (step S5).
  • the comparison unit 4 compares the reliability of the left-eye motion vector and the depth motion vector based on the left-eye motion vector reliability value calculated in step S3 and the depth motion vector reliability value calculated in step S5 (step S6).
  • the selection unit 5 determines whether or not the reliability of the depth motion vector is higher than the reliability of the left eye motion vector (step S7).
  • when the reliability of the depth motion vector is higher (YES in step S7), the frame interpolation unit 6 performs frame interpolation processing of the left-eye video signal LSa and the depth signal DSa using the depth motion vector detected by the depth motion analysis unit 3 in step S4 (step S8).
  • otherwise (NO in step S7), the frame interpolation unit 6 performs frame interpolation processing of the left-eye video signal LSa and the depth signal DSa using the left-eye motion vector detected by the left-eye motion analysis unit 2 in step S2 (step S9).
  • the signal conversion unit 7 generates the right-eye video signal RSb from the left-eye video signal LSb and the depth signal DSb after the frame interpolation processing in step S8 or step S9 (step S10), and outputs the left-eye video signal LSb after the frame interpolation processing together with the generated right-eye video signal RSb. Thereafter, the processes of steps S1 to S10 are repeated.
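The flow of steps S2 through S9 can be summarized in one orchestration sketch. Because the patent leaves the exact detection, reliability, and interpolation routines open, they are passed in as parameters here; every name below is hypothetical:

```python
def convert_frame_pair(prev_l, next_l, prev_d, next_d,
                       detect, confidence, interpolate):
    """Orchestration sketch of steps S2-S9 for one pair of consecutive
    frames: detect both motion vectors, compare their reliability
    values (smaller value = more reliable), and interpolate both the
    left-eye frame and the depth frame with the chosen vector."""
    lv = detect(prev_l, next_l)            # S2: left-eye motion vector
    lrs = confidence(prev_l, next_l, lv)   # S3: its reliability value
    dv = detect(prev_d, next_d)            # S4: depth motion vector
    drs = confidence(prev_d, next_d, dv)   # S5: its reliability value
    v = dv if drs < lrs else lv            # S6-S7: smaller value wins
    # S8/S9: generate the interpolation frames with the selected vector
    return interpolate(prev_l, next_l, v), interpolate(prev_d, next_d, v)
```

Step S10 (regenerating the right-eye signal from the interpolated left-eye signal and depth signal) would follow on the two returned frames.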
  • as described above, in the present embodiment, the depth signal DSa is generated from the input left-eye video signal LSa and right-eye video signal RSa, and frame interpolation processing of the left-eye video signal LSa and the depth signal DSa is performed using whichever of the left-eye motion vector detected from the left-eye video signal LSa and the depth motion vector detected from the depth signal DSa has the higher reliability.
  • when the frame interpolation process is performed using only the image data, it is difficult to distinguish overlapping objects such as a moving image and a telop. With the depth data, these objects can easily be distinguished based on the difference in their depth distances in the stereoscopic video, so the movements of a plurality of objects can be detected accurately. Therefore, the left-eye image data and depth data corresponding to the interpolation frame can be generated with high accuracy.
  • the right eye video signal RSb is generated from the left eye video signal LSb and the depth signal DSb after the frame interpolation process.
  • the right eye image data corresponding to the interpolation frame can be generated with high accuracy.
  • the frame rate conversion of the stereoscopic video can be performed without reducing the quality of the stereoscopic video.
  • in the present embodiment, front left-eye image data is generated from the left-eye image data corresponding to the previous frame, rear left-eye image data is generated from the left-eye image data corresponding to the subsequent frame, and the left-eye motion vector reliability value is calculated based on their differences.
  • similarly, front depth data is generated from the depth data corresponding to the previous frame, rear depth data is generated from the depth data corresponding to the subsequent frame, and the depth motion vector reliability value is calculated based on their differences.
  • thereby, the reliability of the left-eye motion vector and the reliability of the depth motion vector can be accurately evaluated. Therefore, whichever of the left-eye motion vector and the depth motion vector has the higher reliability can be accurately selected according to the stereoscopic video.
  • the left-eye image data and depth data corresponding to the interpolation frame can be generated with high accuracy.
  • FIG. 9 is a block diagram showing a configuration of a video processing apparatus 100a according to another embodiment of the present invention.
  • the video processing apparatus 100a in FIG. 9 will be described while referring to differences from the video processing apparatus 100 according to the above embodiment.
  • the video processing apparatus 100a in FIG. 9 includes a synthesis unit 4a instead of the comparison unit 4 and the selection unit 5 in FIG.
  • the left eye motion analysis unit 2 gives the vector signal LVS and the reliability signal LRS to the synthesis unit 4a.
  • the depth motion analysis unit 3 gives the vector signal DVS and the reliability signal DRS to the synthesis unit 4a.
  • the synthesizer 4a generates a vector signal NVS based on the given reliability signal LRS and reliability signal DRS.
  • the vector signal NVS represents a combined vector generated by combining the left eye motion vector and the depth motion vector.
  • the synthesis unit 4a sets weighting coefficients for the left-eye motion vector and the depth motion vector in accordance with their reliabilities, and adds the product of the left-eye motion vector and its weighting coefficient to the product of the depth motion vector and its weighting coefficient.
  • when the reliability of the left-eye motion vector is higher than that of the depth motion vector, the weighting coefficient of the left-eye motion vector is set larger than that of the depth motion vector; when the reliability of the depth motion vector is higher, the weighting coefficient of the depth motion vector is set larger than that of the left-eye motion vector.
  • the larger the difference between the reliability of the left-eye motion vector and the reliability of the depth motion vector, the larger the difference between the two weighting coefficients is set.
  • the frame interpolation unit 6 performs frame interpolation processing of the left-eye video signal LSa and the depth signal DSa using the vector signal NVS generated by the synthesis unit 4a.
  • the frame interpolation process using the combined vector of the left eye motion vector and the depth motion vector, it is possible to generate the left eye image data and depth data corresponding to the interpolation frame with higher accuracy. As a result, the frame rate conversion of the stereoscopic video can be performed without reducing the quality of the stereoscopic video.
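One way to realize this weighting (an assumption for illustration; the patent only requires that the more reliable vector receive the larger weight, with the gap growing as the reliability gap grows) is to weight each vector by the other source's error value, since a smaller reliability value means higher reliability:

```python
def synthesize_vector(lv, dv, lrs, drs):
    """Sketch of the synthesis unit 4a: combine the left-eye motion
    vector lv and the depth motion vector dv into one composite vector,
    weighting each by the other's reliability value so that the vector
    with the smaller error value dominates."""
    total = lrs + drs
    if total == 0:
        wl = wd = 0.5                      # equally reliable: plain average
    else:
        wl, wd = drs / total, lrs / total  # larger error value -> smaller weight
    return tuple(wl * a + wd * b for a, b in zip(lv, dv))
```

The weights always sum to 1, and as one reliability value shrinks toward 0 the composite vector converges to that source's vector, matching the behavior described above.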
  • frame interpolation processing is performed on the left-eye video signal LSa and depth signal DSa.
  • the present invention is not limited to this, and frame interpolation processing may be performed on the right-eye video signal RSa and depth signal DSa.
  • a right eye motion analysis unit is provided instead of the left eye motion analysis unit 2.
  • the signal conversion unit 1 gives the right-eye video signal RSa to the right-eye motion analysis unit and the frame interpolation unit 6.
  • the right-eye motion analysis unit detects a motion vector (hereinafter referred to as a right-eye motion vector) from the right-eye video signal RSa, and calculates a right-eye motion vector reliability value representing the reliability of the detected right-eye motion vector.
  • the comparison unit 4 compares the reliability of the right eye motion vector and the depth motion vector.
  • the selection unit 5 selects one of the right eye motion vector and the depth motion vector with higher reliability based on the comparison result by the comparison unit 4.
  • the frame interpolation unit 6 performs frame interpolation processing on the right-eye video signal RSa and the depth signal DSa given from the signal conversion unit 1 based on the right-eye motion vector or the depth motion vector selected by the selection unit 5.
  • the signal conversion unit 7 outputs the right-eye video signal RSb after the frame interpolation process, generates the left-eye video signal LSb based on the right-eye video signal RSb and the depth signal DSb after the frame interpolation process, and outputs the generated left-eye video signal LSb.
  • the right-eye image data and depth data corresponding to the interpolation frame can be generated with high accuracy. Furthermore, the left-eye video signal LSb can be accurately generated from the right-eye video signal RSb and the depth signal DSb after the frame interpolation process. As a result, the frame rate conversion of the stereoscopic video can be performed without reducing the quality of the stereoscopic video.
  • the video processing devices 100 and 100a are examples of video processing devices
  • the left eye motion analysis unit 2 is an example of a first motion vector detection unit and a first reliability generation unit
  • the depth motion analysis unit 3 is an example of a second motion vector detection unit and a second reliability generation unit
  • the comparison unit 4 and the selection unit 5 or the synthesis unit 4a are examples of a motion vector generation unit
  • the frame interpolation unit 6 is an example of a frame interpolation unit
  • the signal conversion unit 7 is an example of a first signal conversion unit
  • the signal conversion unit 1 is an example of a second signal conversion unit.
  • the left-eye video is an example of a left-eye two-dimensional video
  • the right-eye video is an example of a right-eye two-dimensional video
  • the left-eye image data is an example of first image data
  • the right-eye image data is an example of second image data
  • the left-eye video signal LSa is an example of the first video signal
  • the left-eye video signal LSb is an example of the second video signal
  • the right-eye video signal RSb is an example of the third video signal.
  • the right-eye video signal RSa is an example of the fourth video signal
  • the depth signal DSa is an example of the depth signal
  • the left-eye motion vector is an example of the first motion vector
  • the depth motion vector is an example of a second motion vector
  • a left eye motion vector, a depth motion vector, or a composite vector is an example of a third motion vector
  • a left eye motion vector reliability value is an example of first reliability information
  • a video processing apparatus is a video processing apparatus that performs processing for frame rate conversion of a three-dimensional video composed of a plurality of frames, and includes the units listed below.
  • a first motion vector detection unit configured to detect a first motion vector from a first video signal including a plurality of frames of first image data for displaying one of the left-eye and right-eye two-dimensional videos.
  • a first reliability generation unit configured to generate first reliability information representing the reliability of the first motion vector detected by the first motion vector detection unit.
  • a second motion vector detection unit configured to detect a second motion vector from a depth signal including a plurality of frames of depth data representing the presentation position of the three-dimensional video in the depth direction.
  • a second reliability generation unit configured to generate second reliability information representing the reliability of the second motion vector detected by the second motion vector detection unit.
  • a motion vector generation unit configured to generate a third motion vector from the first motion vector detected by the first motion vector detection unit and the second motion vector detected by the second motion vector detection unit, based on the first reliability information generated by the first reliability generation unit and the second reliability information generated by the second reliability generation unit.
  • a frame interpolation unit configured to generate, using the third motion vector generated by the motion vector generation unit, first image data and depth data of a frame to be interpolated from the first image data and depth data of at least one frame of the three-dimensional video.
  • the first motion vector is detected by the first motion vector detection unit from a first video signal including a plurality of frames of first image data for displaying one of the left-eye and right-eye 2D videos.
  • First reliability information representing the reliability of the detected first motion vector is generated by the first reliability generation unit.
  • the second motion vector detection unit detects the second motion vector from the depth signal including depth data of a plurality of frames representing the presentation position of the 3D video in the depth direction.
  • Second reliability information representing the reliability of the detected second motion vector is generated by the second reliability generation unit.
  • the reliability of the first motion vector and the reliability of the second motion vector refer to the degree of accuracy with which the first motion vector and the second motion vector represent the motion of an object in the video.
  • based on the generated first and second reliability information, a third motion vector is generated from the first and second motion vectors by the motion vector generation unit. Using the generated third motion vector, the first image data and depth data of the frame to be interpolated are generated by the frame interpolation unit from the first image data and depth data of at least one frame of the 3D video.
  • a second motion vector detected from the depth signal is used, in addition to the first motion vector, to generate the third motion vector.
  • the third motion vector accurately represents the motion of the object in the 3D video. Therefore, the first image data and depth data of the frame to be interpolated can be generated with high accuracy. As a result, the frame rate conversion of the 3D video can be performed without degrading the quality of the 3D video.
  • the first reliability generation unit may be configured to generate, using the first motion vector detected by the first motion vector detection unit, a plurality of first image data of a common frame to be interpolated from the first image data of different frames, and to generate the first reliability information based on the generated plurality of first image data of the common frame.
  • the second reliability generation unit may be configured to generate, using the second motion vector detected by the second motion vector detection unit, a plurality of depth data of the common frame to be interpolated from the depth data of different frames, and to generate the second reliability information based on the generated plurality of depth data of the common frame.
  • the first reliability information that accurately represents the reliability of the first motion vector and the second reliability information that accurately represents the reliability of the second motion vector can be generated.
  • the first reliability generation unit may be configured to generate the first reliability information by an operation including a difference or a ratio between the plurality of first image data of the generated common frame, and the second reliability generation unit may be configured to generate the second reliability information by an operation including a difference or a ratio between the plurality of depth data of the generated common frame.
  • the first reliability information that accurately represents the reliability of the first motion vector and the second reliability information that accurately represents the reliability of the second motion vector can be generated.
  • the motion vector generation unit may be configured to generate, as the third motion vector, whichever of the first and second motion vectors has the higher reliability, based on the first reliability information generated by the first reliability generation unit and the second reliability information generated by the second reliability generation unit.
  • the third motion vector that accurately represents the motion of the object in the three-dimensional image can be easily generated from the first and second motion vectors.
  • the motion vector generation unit may be configured to generate the third motion vector by combining the first and second motion vectors based on the first reliability information generated by the first reliability generation unit and the second reliability information generated by the second reliability generation unit.
  • a third motion vector that more accurately represents the motion of the object in the three-dimensional image can be generated from the first and second motion vectors.
  • the video processing device may further include a first signal conversion unit configured to generate, from the first image data of the plurality of frames of the first video signal, the first image data of the frame to be interpolated generated by the frame interpolation unit, the depth data of the plurality of frames of the depth signal, and the depth data of the frame to be interpolated generated by the frame interpolation unit, second image data of the plurality of frames and of the frame to be interpolated for displaying the other of the left-eye and right-eye two-dimensional videos, to generate a second video signal including the first image data of the plurality of frames of the first video signal and the first image data of the frame to be interpolated generated by the frame interpolation unit, and to generate a third video signal including the generated second image data of the plurality of frames and of the frame to be interpolated.
  • the left-eye two-dimensional video is displayed using the third video signal
  • the right-eye two-dimensional video is displayed using the fourth video signal.
  • the video processing apparatus may further include a second signal conversion unit configured to generate the depth signal from the first video signal and a fourth video signal including second image data of a plurality of frames for displaying the other of the left-eye and right-eye two-dimensional videos.
  • in this case, the first image data of the frame to be interpolated and the depth data of the frame to be interpolated can be accurately generated from the first and fourth video signals.
  • a video processing method is a method for performing processing for frame rate conversion of a three-dimensional video composed of a plurality of frames, and includes the steps listed below.
  • detecting a first motion vector from a first video signal including a plurality of frames of first image data for displaying one of the left-eye and right-eye two-dimensional videos; generating first reliability information representing the reliability of the detected first motion vector; detecting a second motion vector from a depth signal including a plurality of frames of depth data representing the presentation position of the three-dimensional video in the depth direction; generating second reliability information representing the reliability of the detected second motion vector; generating a third motion vector from the detected first and second motion vectors based on the generated first and second reliability information; and generating, using the generated third motion vector, first image data and depth data of a frame to be interpolated from the first image data and depth data of at least one frame of the three-dimensional video.
  • a first motion vector is detected from a first video signal including a plurality of frames of first image data for displaying one of the left-eye and right-eye 2D videos.
  • First reliability information representing the reliability of the detected first motion vector is generated.
  • a second motion vector is detected from a depth signal including depth data of a plurality of frames representing the presentation position of the 3D video in the depth direction, and second reliability information representing the reliability of the detected second motion vector is generated.
  • a third motion vector is generated from the first and second motion vectors based on the generated first and second reliability information. Using the generated third motion vector, first image data and depth data of a frame to be interpolated from the first image data and depth data of at least one frame of the three-dimensional video are generated.
  • a second motion vector detected from the depth signal is used, in addition to the first motion vector, to generate the third motion vector.
  • the third motion vector accurately represents the motion of the object in the 3D video. Therefore, the first image data and depth data of the frame to be interpolated can be generated with high accuracy. As a result, the frame rate conversion of the 3D video can be performed without degrading the quality of the 3D video.
  • the present invention can be effectively used in a video processing apparatus that performs processing for frame rate conversion of 3D video.
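The two ways of generating the third motion vector summarized above (selecting whichever of the first and second motion vectors is more reliable, or combining them) can be sketched in Python. This is a hypothetical illustration, not part of the patent: the scalar reliability values and the reliability-weighted average used for the "combine" mode are assumptions.

```python
def third_motion_vector(v1, r1, v2, r2, mode="select"):
    """Generate the third motion vector from the first and second motion
    vectors (v1, v2) and their reliability values (r1, r2).

    mode="select":  pick whichever vector has the higher reliability
                    (the comparison unit / selection unit embodiment).
    mode="combine": reliability-weighted average of the two vectors,
                    one hypothetical form of the synthesis embodiment.
    """
    if mode == "select":
        return v1 if r1 >= r2 else v2
    w = r1 / (r1 + r2)  # weight given to the first motion vector
    return tuple(w * a + (1.0 - w) * b for a, b in zip(v1, v2))

# With equal reliabilities, "combine" returns the midpoint of the vectors;
# "select" simply returns the more reliable vector.
print(third_motion_vector((4, 0), 0.5, (0, 4), 0.5, mode="combine"))  # (2.0, 2.0)
print(third_motion_vector((4, 0), 0.8, (0, 4), 0.2))                  # (4, 0)
```

The weighted average is one plausible reading of "combining" the vectors; the patent text does not specify the combination formula.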

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
  • Television Systems (AREA)

Abstract

A first motion vector is detected from a first video signal by a first motion vector detection unit. First reliability information, which indicates the reliability of the detected first motion vector, is generated by a first reliability generation unit. In addition, a second motion vector is detected from a depth signal by a second motion vector detection unit. Second reliability information, which indicates the reliability of the detected second motion vector, is generated by a second reliability generation unit. A third motion vector is generated from the first and second motion vectors by a motion vector generation unit on the basis of the generated first and second reliability information. First image data and depth data for a frame to be interpolated are generated by a frame interpolation unit using the generated third motion vector.

Description

Video processing apparatus and video processing method
The present invention relates to a video processing apparatus and a video processing method.
In a video processing apparatus, frame rate conversion processing is performed on a video signal for displaying video (for example, Patent Document 1). In the frame rate conversion processing, for example, a motion vector is detected from the image data of two frames that are consecutive on the time axis, and image data of an interpolation frame is generated using the detected motion vector.
Patent Document 1: Japanese Patent Laid-Open No. 62-175080; Patent Document 2: Japanese Patent Laid-Open No. 2009-3507
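As a minimal, hypothetical illustration of this frame rate conversion (not taken from the cited documents): once a motion vector between two consecutive frames is known, the interpolation frame renders the moving object at an intermediate position along that vector. The function name and the `phase` parameter are assumptions of the sketch.

```python
def interpolate_position(prev_pos, motion_vector, phase=0.5):
    """Position of a moving object in the interpolation frame.

    prev_pos:      (x, y) position in the previous frame.
    motion_vector: (dx, dy) displacement from the previous frame to the
                   next frame, as detected by motion analysis.
    phase:         temporal position of the interpolation frame between the
                   two original frames (0 = previous frame, 1 = next frame);
                   0.5 inserts a frame exactly halfway between them.
    """
    x, y = prev_pos
    dx, dy = motion_vector
    return (x + dx * phase, y + dy * phase)

# An object at (10, 20) moving by (4, -2) between two consecutive frames is
# rendered at (12.0, 19.0) in a frame interpolated at the temporal midpoint.
print(interpolate_position((10, 20), (4, -2)))  # (12.0, 19.0)
```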
In recent years, there are display devices capable of presenting stereoscopic video (three-dimensional video) to the user. Specifically, a video to be viewed with the left eye (hereinafter referred to as left-eye video) and a video to be viewed with the right eye (hereinafter referred to as right-eye video) are displayed separately, so that stereoscopic video (three-dimensional video) is presented to the user. In this case, for example, the above frame rate conversion processing is performed on each of the video signal for displaying the left-eye video and the video signal for displaying the right-eye video.
However, the above frame rate conversion processing may not be able to generate the image data of the interpolation frame with high accuracy. For example, when a moving image and a telop such as text are displayed in an overlapping manner, and a part of the moving image has substantially the same luminance and color as the telop, it is difficult to distinguish that part of the moving image from the telop on the basis of the image data. As a result, the motion vector is not detected accurately, and appropriate image data of the interpolation frame cannot be generated, so that the quality of the presented three-dimensional video is degraded.
An object of the present invention is to provide a video processing apparatus and a video processing method capable of performing frame rate conversion of three-dimensional video without degrading the quality of the three-dimensional video.
A video processing apparatus according to the present invention is a video processing apparatus that performs processing for frame rate conversion of a three-dimensional video composed of a plurality of frames, and includes: a first motion vector detection unit configured to detect a first motion vector from a first video signal including a plurality of frames of first image data for displaying one of the left-eye and right-eye two-dimensional videos; a first reliability generation unit configured to generate first reliability information representing the reliability of the first motion vector detected by the first motion vector detection unit; a second motion vector detection unit configured to detect a second motion vector from a depth signal including a plurality of frames of depth data representing the presentation position of the three-dimensional video in the depth direction; a second reliability generation unit configured to generate second reliability information representing the reliability of the second motion vector detected by the second motion vector detection unit; a motion vector generation unit configured to generate a third motion vector from the first motion vector detected by the first motion vector detection unit and the second motion vector detected by the second motion vector detection unit, based on the first reliability information generated by the first reliability generation unit and the second reliability information generated by the second reliability generation unit; and a frame interpolation unit configured to generate, using the third motion vector generated by the motion vector generation unit, first image data and depth data of a frame to be interpolated from the first image data and depth data of at least one frame of the three-dimensional video.
According to the present invention, it is possible to perform frame rate conversion of three-dimensional video without degrading the quality of the three-dimensional video.
FIG. 1 is a block diagram showing the configuration of a video processing apparatus according to the present embodiment.
FIG. 2 is a diagram showing one frame of a left-eye video and a right-eye video.
FIG. 3 is a diagram showing a stereoscopic video.
FIG. 4 is a diagram for explaining depth data.
FIG. 5 is a diagram showing the movement of a stereoscopic video.
FIG. 6 is a diagram for explaining a method of calculating the reliability of the left-eye motion vector.
FIG. 7 is a diagram for explaining a method of calculating the reliability of the depth motion vector.
FIG. 8 is a flowchart showing the operation of the video processing apparatus.
FIG. 9 is a block diagram showing the configuration of a video processing apparatus according to another embodiment of the present invention.
Hereinafter, a video processing apparatus and a video processing method according to an embodiment of the present invention will be described with reference to the drawings.
(1) Configuration of the Video Processing Apparatus
FIG. 1 is a block diagram showing the configuration of the video processing apparatus according to the present embodiment. As shown in FIG. 1, the video processing apparatus 100 includes a signal conversion unit 1, a left-eye motion analysis unit 2, a depth motion analysis unit 3, a comparison unit 4, a selection unit 5, a frame interpolation unit 6, and a signal conversion unit 7.
The video processing apparatus 100 according to the present embodiment is supplied with a left-eye video signal LSa and a right-eye video signal RSa for presenting stereoscopic video (three-dimensional video). The left-eye video signal LSa includes image data (hereinafter referred to as left-eye image data) for displaying each frame of a two-dimensional video to be viewed with the left eye (hereinafter referred to as left-eye video). The right-eye video signal RSa includes image data (hereinafter referred to as right-eye image data) for displaying each frame of a two-dimensional video to be viewed with the right eye (hereinafter referred to as right-eye video). The left-eye video and the right-eye video are videos of a common subject observed from the viewpoint of the left eye and the viewpoint of the right eye, respectively. One frame of the right-eye video and one frame of the left-eye video together constitute one frame of the stereoscopic video.
The left-eye video signal LSa and the right-eye video signal RSa are input to the signal conversion unit 1. The signal conversion unit 1 supplies the input left-eye video signal LSa to the left-eye motion analysis unit 2 and the frame interpolation unit 6. The signal conversion unit 1 also generates a depth signal DSa based on the input left-eye video signal LSa and right-eye video signal RSa, and supplies the generated depth signal DSa to the depth motion analysis unit 3 and the frame interpolation unit 6. The depth signal DSa includes depth data corresponding to each frame of the stereoscopic video. Details of the depth data will be described later.
The left-eye motion analysis unit 2 detects a motion vector (hereinafter referred to as a left-eye motion vector) from the left-eye image data of the left-eye video signal LSa supplied from the signal conversion unit 1, and supplies the detected left-eye motion vector to the selection unit 5 as a vector signal LVS. The left-eye motion analysis unit 2 also calculates a left-eye motion vector reliability value representing the reliability of the detected left-eye motion vector, and supplies the calculated left-eye motion vector reliability value to the comparison unit 4 as a reliability signal LRS.
The depth motion analysis unit 3 detects a motion vector (hereinafter referred to as a depth motion vector) from the depth data of the depth signal DSa supplied from the signal conversion unit 1, and supplies the detected depth motion vector to the selection unit 5 as a vector signal DVS. The depth motion analysis unit 3 also calculates a depth motion vector reliability value representing the reliability of the detected depth motion vector, and supplies the calculated depth motion vector reliability value to the comparison unit 4 as a reliability signal DRS.
The comparison unit 4 compares the reliability of the left-eye motion vector and that of the depth motion vector based on the reliability signal LRS supplied from the left-eye motion analysis unit 2 and the reliability signal DRS supplied from the depth motion analysis unit 3. The selection unit 5 selects one of the vector signal LVS supplied from the left-eye motion analysis unit 2 and the vector signal DVS supplied from the depth motion analysis unit 3 based on the comparison result of the comparison unit 4.
The frame interpolation unit 6 performs frame interpolation processing on the left-eye video signal LSa and the depth signal DSa supplied from the signal conversion unit 1 based on the vector signal LVS or the vector signal DVS selected by the selection unit 5. The signal conversion unit 7 outputs the left-eye video signal LSb after the frame interpolation processing, generates a right-eye video signal RSb based on the left-eye video signal LSb and the depth signal DSb after the frame interpolation processing, and outputs the generated right-eye video signal RSb. The left-eye video signal LSb and the right-eye video signal RSb output from the signal conversion unit 7 are supplied to an external device such as a display device.
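The comparison and selection performed by the comparison unit 4 and the selection unit 5 can be sketched as follows. This is a hypothetical Python sketch: the assumption that motion vectors and reliability values are produced per region of the frame is illustrative and not specified in the text.

```python
def compare_and_select(left_vectors, left_reliabilities,
                       depth_vectors, depth_reliabilities):
    """For each region of the frame, compare the left-eye motion vector
    reliability value (reliability signal LRS) with the depth motion vector
    reliability value (reliability signal DRS), and select the corresponding
    vector from the vector signal LVS or the vector signal DVS."""
    return [lv if lr >= dr else dv
            for lv, lr, dv, dr in zip(left_vectors, left_reliabilities,
                                      depth_vectors, depth_reliabilities)]

# Region 0: the left-eye vector is more reliable; region 1: the depth vector.
selected = compare_and_select([(1, 0), (2, 0)], [0.9, 0.1],
                              [(0, 1), (0, 2)], [0.5, 0.5])
print(selected)  # [(1, 0), (0, 2)]
```

The selected vectors then drive the frame interpolation processing of the frame interpolation unit 6.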
The signal conversion unit 1, the left-eye motion analysis unit 2, the depth motion analysis unit 3, the comparison unit 4, the selection unit 5, the frame interpolation unit 6, and the signal conversion unit 7 are realized by, for example, hardware such as a CPU (central processing unit) and a memory, and software such as a computer program. In this case, the signal conversion unit 1, the left-eye motion analysis unit 2, the depth motion analysis unit 3, the comparison unit 4, the selection unit 5, the frame interpolation unit 6, and the signal conversion unit 7 correspond to modules of the computer program. When the CPU executes the computer program stored in the memory, the functions of these units are realized. Note that some or all of these units may be realized by hardware such as an ASIC (application-specific integrated circuit).
(2) Depth Data
FIG. 2(a) is a diagram showing one frame of the left-eye video displayed based on the left-eye image data, and FIG. 2(b) is a diagram showing one frame of the right-eye video displayed based on the right-eye image data. The frame of FIG. 2(a) and the frame of FIG. 2(b) correspond to the same position on the time axis. FIGS. 3(a) and 3(b) are a schematic plan view and a schematic side view showing the stereoscopic video presented by the left-eye video of FIG. 2(a) and the right-eye video of FIG. 2(b).
In the examples of FIGS. 2 and 3, the subject consists of two spheres B1 and B2. As shown in FIG. 3, a display screen DP is arranged in front of a user U. The left-eye video of FIG. 2(a) and the right-eye video of FIG. 2(b) are displayed on the display screen DP. The user U sees the left-eye video with the left eye and the right-eye video with the right eye. Thereby, as shown in FIG. 3, stereoscopic images of the spheres B1 and B2 are presented to the user U. In this example, in the direction perpendicular to the display screen DP (hereinafter referred to as the depth direction), the stereoscopic image of the sphere B1 is presented at a position closer to the user U than the display screen DP, and the stereoscopic image of the sphere B2 is presented at a position farther from the user U than the display screen DP.
FIGS. 4(a) and 4(b) are a schematic plan view and a schematic side view for explaining the depth data. In FIGS. 4(a) and 4(b), each black dot indicates an element of the stereoscopic image (hereinafter referred to as a pixel element) PE represented by one pixel of the left-eye video and one pixel of the right-eye video. A set of pixel elements PE constitutes the stereoscopic images of the spheres B1 and B2.
The depth data includes, for each pixel element PE, the distance in the depth direction (hereinafter referred to as the depth distance) DD between the pixel element PE and a predetermined reference position. In this example, the predetermined reference position is a position on the plane in which the display screen DP is arranged. For example, when the stereoscopic image is closer to the user U than the display screen DP, the depth distance takes a positive value, and when the stereoscopic image is farther from the user U than the display screen DP, the depth distance takes a negative value. In the example of FIG. 4, the depth distance of the pixel elements PE constituting the stereoscopic image of the sphere B1 is positive, and the depth distance of the pixel elements PE constituting the stereoscopic image of the sphere B2 is negative.
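A minimal sketch of this sign convention for the depth distance DD, assuming a z axis measured from the user U and increasing away from the user (this axis convention is an assumption of the sketch, not stated in the text):

```python
def depth_distance(pixel_z, screen_z):
    """Signed depth distance DD of a pixel element PE relative to the plane
    of the display screen DP (the reference position): positive when the
    stereoscopic image is presented closer to the user than the screen,
    negative when farther. z is assumed to grow away from the user."""
    return screen_z - pixel_z

# A pixel element of sphere B1 in front of the screen gets a positive DD;
# a pixel element of sphere B2 behind the screen gets a negative DD.
print(depth_distance(1.0, 2.0))  # 1.0  (closer to the user than the screen)
print(depth_distance(3.5, 2.0))  # -1.5 (farther from the user than the screen)
```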
The signal conversion unit 1 (FIG. 1) generates the depth data corresponding to each frame of the stereoscopic video based on the left-eye video signal LSa and the right-eye video signal RSa, and supplies the generated depth data as the depth signal DSa to the depth motion analysis unit 3 (FIG. 1) and the frame interpolation unit 6 (FIG. 1).
(3) Depth Motion Vector
FIGS. 5(a) and 5(b) are diagrams showing the movement of the stereoscopic image of the sphere B1. In the examples of FIGS. 5(a) and 5(b), the stereoscopic image of the sphere B1 moves substantially parallel to the display screen DP. The depth motion analysis unit 3 (FIG. 1) detects the moving direction and the moving distance of such a stereoscopic image as a depth motion vector based on the depth signal DSa from the signal conversion unit 1 (FIG. 1).
 For example, in the example of FIG. 5, the depth distance DD of each pixel element PE forming the sphere B1 before the movement is substantially equal to the depth distance DD of the corresponding pixel element PE forming the sphere B1 after the movement. Therefore, by detecting, from the depth data of the frame representing the sphere B1 before the movement and the depth data of the frame representing the sphere B1 after the movement, the positions of pixel elements PE having substantially equal depth distances, the moving direction and moving distance of the sphere B1 can be detected as a depth motion vector.
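 As a sketch of this idea, a depth motion vector for a block of pixel elements can be found by locating the displacement that minimizes the sum of absolute depth-distance differences between two frames. The patent does not prescribe a particular algorithm; the function name, block size, and search range below are illustrative assumptions.

```python
def depth_motion_vector(prev_depth, next_depth, bx, by, block=4, search=2):
    """Find the (dx, dy) displacement of the block at (bx, by) in prev_depth
    that best matches next_depth, by minimizing the sum of absolute
    depth-distance differences (illustrative block-matching sketch)."""
    h, w = len(prev_depth), len(prev_depth[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cost = 0
            for y in range(block):
                for x in range(block):
                    sy, sx = by + y, bx + x
                    ty, tx = sy + dy, sx + dx
                    if not (0 <= ty < h and 0 <= tx < w):
                        cost = float("inf")  # candidate leaves the frame
                        break
                    cost += abs(prev_depth[sy][sx] - next_depth[ty][tx])
                if cost == float("inf"):
                    break
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]
```

Because the matching criterion is the depth distance itself, two objects with similar luminance but different depths are kept apart by this search, which is the property the embodiment relies on.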
 It is also possible to individually detect each object presented in the stereoscopic video by using the depth motion vectors detected as described above.
 (4) Reliability of the Left-Eye Motion Vector and the Depth Motion Vector
 FIG. 6 is a diagram for explaining a method of calculating the reliability of the left-eye motion vector, and FIG. 7 is a diagram for explaining a method of calculating the reliability of the depth motion vector. In FIGS. 6 and 7, the previous frame and the subsequent frame refer to two frames that are consecutive on the time axis in the stereoscopic video before frame interpolation. The interpolation frame refers to the frame to be interpolated between the previous frame and the subsequent frame.
 The left-eye motion analysis unit 2 (FIG. 1) detects a left-eye motion vector from the left-eye image data corresponding to the previous frame and the subsequent frame, for example by a block matching method or a gradient method. As shown in FIG. 6, the left-eye motion analysis unit 2 uses the detected left-eye motion vector to generate, from the left-eye image data corresponding to the previous frame, left-eye image data corresponding to the interpolation frame (hereinafter referred to as forward left-eye image data), and to generate, from the left-eye image data corresponding to the subsequent frame, left-eye image data corresponding to the interpolation frame (hereinafter referred to as backward left-eye image data).
 Further, the left-eye motion analysis unit 2 calculates the difference between the forward left-eye image data and the backward left-eye image data for each pixel, converts each calculated difference into an absolute value, and sums the absolute values. The resulting total is the left-eye motion vector confidence value, which represents the reliability of the left-eye motion vector. The smaller the left-eye motion vector confidence value, the higher the reliability of the left-eye motion vector. The left-eye motion analysis unit 2 supplies the calculated left-eye motion vector confidence value to the comparison unit 4 (FIG. 1) as the reliability signal LRS.
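 A minimal sketch of this confidence measure, assuming frames are stored as 2-D lists of samples (the function name is illustrative; the patent only specifies a per-sample absolute difference summed over the frame):

```python
def motion_vector_confidence(forward, backward):
    """Sum of absolute per-sample differences between the forward-predicted
    and backward-predicted interpolation frames. A smaller value means the
    two predictions agree better, i.e. the motion vector is more reliable."""
    return sum(abs(f - b)
               for frow, brow in zip(forward, backward)
               for f, b in zip(frow, brow))
```

The same computation applies unchanged to the depth case in the next paragraphs: the inputs are then the forward depth data and the backward depth data, summed per pixel element PE.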
 The depth motion analysis unit 3 (FIG. 1) detects a depth motion vector from the depth data corresponding to the previous frame and the subsequent frame as described above. As shown in FIG. 7, the depth motion analysis unit 3 uses the detected depth motion vector to generate, from the depth data corresponding to the previous frame, depth data corresponding to the interpolation frame (hereinafter referred to as forward depth data), and to generate, from the depth data corresponding to the subsequent frame, depth data corresponding to the interpolation frame (hereinafter referred to as backward depth data).
 Further, the depth motion analysis unit 3 calculates the difference between the forward depth data and the backward depth data for each pixel element PE, converts each calculated difference into an absolute value, and sums the absolute values. The resulting total is the depth motion vector confidence value, which represents the reliability of the depth motion vector. The smaller the depth motion vector confidence value, the higher the reliability of the depth motion vector. The depth motion analysis unit 3 supplies the calculated depth motion vector confidence value to the comparison unit 4 (FIG. 1) as the reliability signal DRS.
 In this example, the left-eye motion vector confidence value is calculated by an operation including the difference between the forward left-eye image data and the backward left-eye image data, and the depth motion vector confidence value is calculated by an operation including the difference between the forward depth data and the backward depth data; however, the present invention is not limited to this. For example, the left-eye motion vector confidence value may be calculated by an operation including the ratio between the forward left-eye image data and the backward left-eye image data. Similarly, the depth motion vector confidence value may be calculated by an operation including the ratio between the forward depth data and the backward depth data.
 (5) Comparison Processing
 The comparison unit 4 performs comparison processing for comparing the reliability of the left-eye motion vector with the reliability of the depth motion vector based on the reliability signal LRS from the left-eye motion analysis unit 2 and the reliability signal DRS from the depth motion analysis unit 3. Specifically, the comparison unit 4 multiplies at least one of the left-eye motion vector confidence value indicated by the reliability signal LRS and the depth motion vector confidence value indicated by the reliability signal DRS by a predetermined coefficient (gain), and compares the resulting values.
 When the reliability of the depth motion vector is higher than the reliability of the left-eye motion vector, the comparison unit 4 supplies a depth selection signal to the selection unit 5 (FIG. 1). In response to the depth selection signal from the comparison unit 4, the selection unit 5 (FIG. 1) selects the vector signal DVS out of the vector signal LVS from the left-eye motion analysis unit 2 and the vector signal DVS from the depth motion analysis unit 3, and supplies it to the frame interpolation unit 6.
 On the other hand, when the reliability of the depth motion vector is equal to or lower than the reliability of the left-eye motion vector, the comparison unit 4 supplies a left-eye selection signal to the selection unit 5. In response to the left-eye selection signal from the comparison unit 4, the selection unit 5 selects the vector signal LVS out of the vector signal LVS from the left-eye motion analysis unit 2 and the vector signal DVS from the depth motion analysis unit 3, and supplies it to the frame interpolation unit 6.
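 The selection rule of the comparison unit 4 and the selection unit 5 can be sketched as follows. The gain value, the choice of which confidence value the gain scales, and the return labels are illustrative assumptions; recall that a smaller confidence value means higher reliability, so ties fall to the left-eye vector as described above.

```python
def select_motion_vector(lrs, drs, left_vec, depth_vec, depth_gain=1.0):
    """Compare confidence values (smaller = more reliable). The depth
    confidence value is scaled by a predetermined gain before comparison.
    Returns the selected vector and a label standing in for the
    selection signal."""
    if drs * depth_gain < lrs:  # depth motion vector is more reliable
        return depth_vec, "depth_selection"
    return left_vec, "left_eye_selection"  # ties fall to the left-eye vector
```

Raising `depth_gain` above 1.0 biases the device toward the left-eye motion vector, which is one way such a predetermined coefficient could be tuned.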
 The comparison processing may be performed, for example, every predetermined number of frames, every predetermined time, or in response to an instruction from the user. During periods in which the comparison processing is not performed, the depth selection signal or the left-eye selection signal is continuously output from the comparison unit 4 to the selection unit 5 according to the result of the previous comparison processing. Further, during periods in which the comparison processing is not performed, the calculation of the left-eye motion vector confidence value by the left-eye motion analysis unit 2 and the calculation of the depth motion vector confidence value by the depth motion analysis unit 3 need not be performed.
 (6) Frame Interpolation
 When the vector signal LVS is supplied from the selection unit 5 to the frame interpolation unit 6, the frame interpolation unit 6 (FIG. 1) performs frame interpolation processing of the left-eye video signal LSa and the depth signal DSa using the left-eye motion vector indicated by the supplied vector signal LVS. Specifically, the frame interpolation unit 6 uses the left-eye motion vector to generate left-eye image data and depth data corresponding to the interpolation frame from at least one of the previous frame and the subsequent frame (see FIGS. 6 and 7).
 On the other hand, when the vector signal DVS is supplied from the selection unit 5 to the frame interpolation unit 6, the frame interpolation unit 6 performs frame interpolation processing of the left-eye video signal LSa and the depth signal DSa using the depth motion vector indicated by the supplied vector signal DVS. Specifically, the frame interpolation unit 6 uses the depth motion vector to generate left-eye image data and depth data corresponding to the interpolation frame from at least one of the previous frame and the subsequent frame (see FIGS. 6 and 7).
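 A minimal sketch of generating an interpolation frame from the previous frame with a motion vector. A single full-frame vector, a midpoint interpolation phase, and clamped boundary handling are simplifying assumptions; the actual device works per block or per pixel and interpolates both the image data and the depth data in the same way.

```python
def interpolate_frame(prev_frame, mv, phase=0.5):
    """Shift the previous frame along the motion vector mv = (dx, dy),
    scaled by the temporal phase of the interpolation frame (0.5 = midway
    between the previous and subsequent frames). Samples shifted in from
    outside the frame are copied from the nearest valid source position
    (a simple boundary assumption)."""
    h, w = len(prev_frame), len(prev_frame[0])
    dx = round(mv[0] * phase)
    dy = round(mv[1] * phase)
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            sy = min(max(y - dy, 0), h - 1)  # clamp source row to the frame
            sx = min(max(x - dx, 0), w - 1)  # clamp source column to the frame
            row.append(prev_frame[sy][sx])
        out.append(row)
    return out
```

Interpolating from the subsequent frame instead would use the negated vector with phase `1 - phase`; a device using both frames could average the two predictions.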
 The signal converter 7 (FIG. 1) outputs the left-eye video signal LSb after the frame interpolation processing, generates the right-eye video signal RSb based on the left-eye video signal LSb and the depth signal DSb after the frame interpolation processing, and outputs the generated right-eye video signal RSb. A display device (not shown) displays a left-eye video based on the left-eye video signal LSb output from the signal converter 7 and a right-eye video based on the right-eye video signal RSb. Thereby, the three-dimensional video after the frame rate conversion is presented to the user.
 (7) Operation of the Video Processing Device
 FIG. 8 is a flowchart showing the operation of the video processing device 100. As shown in FIG. 8, first, the signal converter 1 generates the depth signal DSa from the input left-eye video signal LSa and right-eye video signal RSa (step S1).
 Next, the left-eye motion analysis unit 2 detects a left-eye motion vector from the left-eye video signal LSa (step S2) and calculates a left-eye motion vector confidence value representing the reliability of the left-eye motion vector (step S3). In parallel with the processing of steps S2 and S3, the depth motion analysis unit 3 detects a depth motion vector from the depth signal DSa (step S4) and calculates a depth motion vector confidence value representing the reliability of the depth motion vector (step S5).
 Next, the comparison unit 4 compares the reliability of the left-eye motion vector with the reliability of the depth motion vector based on the left-eye motion vector confidence value calculated in step S3 and the depth motion vector confidence value calculated in step S5 (step S6). Next, the selection unit 5 determines whether or not the reliability of the depth motion vector is higher than the reliability of the left-eye motion vector (step S7).
 When the reliability of the depth motion vector is higher than the reliability of the left-eye motion vector, the frame interpolation unit 6 performs frame interpolation processing of the left-eye video signal LSa and the depth signal DSa using the depth motion vector detected by the depth motion analysis unit 3 in step S4 (step S8). On the other hand, when the reliability of the depth motion vector is equal to or lower than the reliability of the left-eye motion vector, the frame interpolation unit 6 performs frame interpolation processing of the left-eye video signal LSa and the depth signal DSa using the left-eye motion vector detected by the left-eye motion analysis unit 2 in step S2 (step S9).
 Next, the signal converter 7 generates the right-eye video signal RSb from the left-eye video signal LSb and the depth signal DSb after the frame interpolation processing in step S8 or step S9 (step S10), and outputs the left-eye video signal LSb after the frame interpolation processing together with the generated right-eye video signal RSb. Thereafter, the processing of steps S1 to S10 is repeated.
 (8) Effects of the Present Embodiment
 In the video processing device 100 according to the present embodiment, the depth signal DSa is generated from the input left-eye video signal LSa and right-eye video signal RSa, and frame interpolation processing of the left-eye video signal LSa and the depth signal DSa is performed using whichever of the left-eye motion vector detected from the left-eye video signal LSa and the depth motion vector detected from the depth signal DSa has the higher reliability.
 This makes it possible to detect the motion of the three-dimensional video more accurately than when the frame interpolation processing is performed using only image data. For example, when stereoscopic images of a plurality of objects are presented so as to overlap in the depth direction and their luminance, color, and the like are substantially the same, it is difficult to distinguish the objects based on the image data. In contrast, by using the depth data, the objects can easily be distinguished based on the difference in the depth distances of their stereoscopic images. Therefore, the motion of the objects can be detected accurately, and the left-eye image data and depth data corresponding to the interpolation frame can be generated with high accuracy.
 Further, the right-eye video signal RSb is generated from the left-eye video signal LSb and the depth signal DSb after the frame interpolation processing. Thereby, the right-eye image data corresponding to the interpolation frame can be generated with high accuracy. As a result, the frame rate conversion of the stereoscopic video can be performed without degrading the quality of the stereoscopic video.
 Further, in the present embodiment, the detected left-eye motion vector is used to generate the forward left-eye image data from the left-eye image data corresponding to the previous frame and the backward left-eye image data from the left-eye image data corresponding to the subsequent frame, and the left-eye motion vector confidence value is calculated based on the difference between them. Thereby, the reliability of the left-eye motion vector can be evaluated accurately. Likewise, the detected depth motion vector is used to generate the forward depth data from the depth data corresponding to the previous frame and the backward depth data from the depth data corresponding to the subsequent frame, and the depth motion vector confidence value is calculated based on the difference between them. Thereby, the reliability of the depth motion vector can be evaluated accurately. Therefore, whichever of the left-eye motion vector and the depth motion vector has the higher reliability can be selected accurately according to the stereoscopic video. As a result, the left-eye image data and depth data corresponding to the interpolation frame can be generated with high accuracy.
 (9) Other Embodiments
 (9-1)
 FIG. 9 is a block diagram showing the configuration of a video processing device 100a according to another embodiment of the present invention. The video processing device 100a of FIG. 9 will be described with regard to the differences from the video processing device 100 according to the above embodiment.
 The video processing device 100a of FIG. 9 includes a synthesis unit 4a in place of the comparison unit 4 and the selection unit 5 of FIG. 1. The left-eye motion analysis unit 2 supplies the vector signal LVS and the reliability signal LRS to the synthesis unit 4a. The depth motion analysis unit 3 supplies the vector signal DVS and the reliability signal DRS to the synthesis unit 4a. The synthesis unit 4a generates a vector signal NVS based on the supplied reliability signal LRS and reliability signal DRS. The vector signal NVS represents a composite vector generated by combining the left-eye motion vector and the depth motion vector.
 For example, the synthesis unit 4a sets a weighting coefficient for each of the left-eye motion vector and the depth motion vector according to their reliability, and adds the product of the left-eye motion vector and its weighting coefficient to the product of the depth motion vector and its weighting coefficient. In this case, when the reliability of the left-eye motion vector is higher than the reliability of the depth motion vector, the weighting coefficient of the left-eye motion vector is set larger than the weighting coefficient of the depth motion vector, and when the reliability of the depth motion vector is higher than the reliability of the left-eye motion vector, the weighting coefficient of the depth motion vector is set larger than the weighting coefficient of the left-eye motion vector. Further, the larger the difference between the reliability of the left-eye motion vector and the reliability of the depth motion vector, the larger the difference between the two weighting coefficients is set.
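 One way to realize this weighting is with weights inversely proportional to the confidence values. This inverse-confidence scheme is an illustrative assumption; the patent only requires that the more reliable vector receive the larger weight and that the gap in weights grow with the gap in reliability, both of which this scheme satisfies.

```python
def blend_motion_vectors(left_vec, depth_vec, lrs, drs, eps=1e-9):
    """Blend the two motion vectors with weights inversely proportional to
    their confidence values (a smaller confidence value means higher
    reliability, hence a larger weight). Weights are normalized to sum
    to 1; eps guards against a zero confidence value."""
    wl = 1.0 / (lrs + eps)
    wd = 1.0 / (drs + eps)
    total = wl + wd
    wl, wd = wl / total, wd / total
    return tuple(wl * l + wd * d for l, d in zip(left_vec, depth_vec))
```

When the two confidence values are equal, this degenerates to a plain average; as one confidence value grows (its vector becomes less reliable), the blend approaches the other vector, matching the behavior described above.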
 The frame interpolation unit 6 performs frame interpolation processing of the left-eye video signal LSa and the depth signal DSa using the vector signal NVS generated by the synthesis unit 4a.
 By performing the frame interpolation processing using the composite vector of the left-eye motion vector and the depth motion vector in this way, the left-eye image data and depth data corresponding to the interpolation frame can be generated with still higher accuracy. As a result, the frame rate conversion of the stereoscopic video can be performed without degrading the quality of the stereoscopic video.
 (9-2)
 In the above embodiment, the frame interpolation processing is performed on the left-eye video signal LSa and the depth signal DSa; however, the present invention is not limited to this, and the frame interpolation processing may be performed on the right-eye video signal RSa and the depth signal DSa. In this case, a right-eye motion analysis unit is provided in place of the left-eye motion analysis unit 2.
 The signal converter 1 supplies the right-eye video signal RSa to the right-eye motion analysis unit and the frame interpolation unit 6. The right-eye motion analysis unit detects a motion vector (hereinafter referred to as a right-eye motion vector) from the right-eye video signal RSa and calculates a right-eye motion vector confidence value representing the reliability of the detected right-eye motion vector.
 The comparison unit 4 compares the reliability of the right-eye motion vector with the reliability of the depth motion vector. The selection unit 5 selects whichever of the right-eye motion vector and the depth motion vector has the higher reliability, based on the comparison result of the comparison unit 4. The frame interpolation unit 6 performs frame interpolation processing of the right-eye video signal RSa and the depth signal DSa supplied from the signal converter 1, based on the right-eye motion vector or depth motion vector selected by the selection unit 5. The signal converter 7 outputs the right-eye video signal RSb after the frame interpolation processing, generates the left-eye video signal LSb based on the right-eye video signal RSb and the depth signal DSb after the frame interpolation processing, and outputs the generated left-eye video signal LSb.
 Thereby, the right-eye image data and depth data corresponding to the interpolation frame can be generated with high accuracy. Further, the left-eye video signal LSb can be generated with high accuracy from the right-eye video signal RSb and the depth signal DSb after the frame interpolation processing. As a result, the frame rate conversion of the stereoscopic video can be performed without degrading the quality of the stereoscopic video.
 (10) Correspondence Between the Constituent Elements of the Claims and the Elements of the Embodiments
 Hereinafter, examples of the correspondence between the constituent elements of the claims and the elements of the embodiments will be described; however, the present invention is not limited to the following examples.
 In the above embodiments, the video processing devices 100 and 100a are examples of the video processing device, the left-eye motion analysis unit 2 is an example of the first motion vector detection unit and of the first reliability generation unit, the depth motion analysis unit 3 is an example of the second motion vector detection unit and of the second reliability generation unit, the comparison unit 4 and the selection unit 5, or the synthesis unit 4a, are examples of the motion vector generation unit, the frame interpolation unit 6 is an example of the frame interpolation unit, the signal converter 7 is an example of the first signal conversion unit, and the signal converter 1 is an example of the second signal conversion unit.
 Further, the left-eye video is an example of the two-dimensional video for the left eye, the right-eye video is an example of the two-dimensional video for the right eye, the left-eye image data is an example of the first image data, the right-eye image data is an example of the second image data, the left-eye video signal LSa is an example of the first video signal, the left-eye video signal LSb is an example of the second video signal, the right-eye video signal RSb is an example of the third video signal, the right-eye video signal RSa is an example of the fourth video signal, the depth signal DSa is an example of the depth signal, the left-eye motion vector is an example of the first motion vector, the depth motion vector is an example of the second motion vector, the left-eye motion vector, the depth motion vector, or the composite vector is an example of the third motion vector, the left-eye motion vector confidence value is an example of the first reliability information, and the depth motion vector confidence value is an example of the second reliability information.
 As the constituent elements of the claims, various other elements having the configurations or functions described in the claims can also be used.
 (11) Comprehensive Description of the Video Processing Device and Video Processing Method According to the Embodiments
 (11-1)
 A video processing device according to an embodiment of the present invention is a video processing device that performs processing for frame rate conversion of a three-dimensional video composed of a plurality of frames, and includes: a first motion vector detection unit configured to detect a first motion vector from a first video signal including a plurality of frames of first image data for displaying one of a two-dimensional video for the left eye and a two-dimensional video for the right eye; a first reliability generation unit configured to generate first reliability information representing the reliability of the first motion vector detected by the first motion vector detection unit; a second motion vector detection unit configured to detect a second motion vector from a depth signal including a plurality of frames of depth data representing the presentation position of the three-dimensional video in the depth direction; a second reliability generation unit configured to generate second reliability information representing the reliability of the second motion vector detected by the second motion vector detection unit; a motion vector generation unit configured to generate a third motion vector from the first motion vector detected by the first motion vector detection unit and the second motion vector detected by the second motion vector detection unit, based on the first reliability information generated by the first reliability generation unit and the second reliability information generated by the second reliability generation unit; and a frame interpolation unit configured to generate, using the third motion vector generated by the motion vector generation unit, first image data and depth data of a frame to be interpolated from the first image data and depth data of at least one frame of the three-dimensional video.
 In this video processing device, a first motion vector is detected by the first motion vector detection unit from a first video signal including a plurality of frames of first image data for displaying one of the two-dimensional video for the left eye and the two-dimensional video for the right eye. First reliability information representing the reliability of the detected first motion vector is generated by the first reliability generation unit. Further, a second motion vector is detected by the second motion vector detection unit from a depth signal including a plurality of frames of depth data representing the presentation position of the three-dimensional video in the depth direction. Second reliability information representing the reliability of the detected second motion vector is generated by the second reliability generation unit. The reliability of the first motion vector and the reliability of the second motion vector refer to the degree of accuracy with which the first motion vector and the second motion vector represent the motion of objects in the video.
Based on the generated first and second reliability information, the motion vector generation unit generates a third motion vector from the first and second motion vectors. Using the generated third motion vector, the frame interpolation unit generates the first image data and depth data of the frame to be interpolated from the first image data and depth data of at least one frame of the three-dimensional video.
In this manner, the third motion vector is generated using the first motion vector, detected from the first image data for displaying the two-dimensional video, and the second motion vector, detected from the depth data representing the presentation position of the three-dimensional video in the depth direction. The third motion vector therefore accurately represents the motion of an object in the three-dimensional video, and the first image data and depth data of the frame to be interpolated can be generated with high accuracy. As a result, frame rate conversion of the three-dimensional video can be performed without degrading its quality.
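The frame interpolation described above can be illustrated with a small numerical sketch. The code below is not the claimed implementation; it is a simplified, hypothetical model in which a single motion vector shifts one scan line of image (or depth) data halfway between two frames:

```python
def shift(row, dx):
    """Shift a 1-D sample row right by dx pixels (left if negative), zero-filled."""
    n = len(row)
    return [row[x - dx] if 0 <= x - dx < n else 0 for x in range(n)]

def interpolate_row(prev_row, next_row, mv_dx):
    """Motion-compensated midpoint interpolation along one scan line:
    shift the previous frame forward by half the motion vector, the next
    frame backward by half, and average. The same routine applies to
    image data and to depth data."""
    h = mv_dx // 2
    fwd = shift(prev_row, h)
    bwd = shift(next_row, -h)
    return [(a + b) / 2.0 for a, b in zip(fwd, bwd)]

# A bright 2-pixel block moving 2 pixels right per frame lands at
# columns 1-2 in the interpolated middle frame.
prev_row = [100, 100, 0, 0, 0, 0, 0, 0]
next_row = [0, 0, 100, 100, 0, 0, 0, 0]
mid_row = interpolate_row(prev_row, next_row, 2)
```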
(11-2)
The first reliability generation unit may be configured to generate, using the first motion vector detected by the first motion vector detection unit, a plurality of first image data of a common frame to be interpolated from the first image data of different frames, and to generate the first reliability information based on the generated plurality of first image data of the common frame. Similarly, the second reliability generation unit may be configured to generate, using the second motion vector detected by the second motion vector detection unit, a plurality of depth data of the common frame to be interpolated from the depth data of different frames, and to generate the second reliability information based on the generated plurality of depth data of the common frame.
In this case, first reliability information that accurately represents the reliability of the first motion vector, and second reliability information that accurately represents the reliability of the second motion vector, can be generated.
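One simplified way to obtain such candidates is sketched below, under illustrative assumptions (1-D rows, one vector per row; the function names are not from the embodiment): the same intermediate frame is predicted twice with one motion vector, once forward from the previous frame and once backward from the next.

```python
def shift(row, dx):
    """Shift a 1-D sample row right by dx pixels (left if negative), zero-filled."""
    n = len(row)
    return [row[x - dx] if 0 <= x - dx < n else 0 for x in range(n)]

def candidates_for_common_frame(prev_row, next_row, mv_dx):
    """Predict the same intermediate (common) frame twice with one motion
    vector: forward from the previous frame and backward from the next.
    A correct vector makes the two predictions agree."""
    h = mv_dx // 2
    return shift(prev_row, h), shift(next_row, -h)

# A 2-pixel block moving 2 pixels right: with the correct vector,
# both candidates place the block at positions 2-3.
prev_row = [0, 50, 50, 0, 0, 0]
next_row = [0, 0, 0, 50, 50, 0]
fwd, bwd = candidates_for_common_frame(prev_row, next_row, 2)
```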
(11-3)
The first reliability generation unit may be configured to generate the first reliability information by an operation including a difference or ratio between the generated plurality of first image data of the common frame, and the second reliability generation unit may be configured to generate the second reliability information by an operation including a difference or ratio between the generated plurality of depth data of the common frame.
In this case as well, first reliability information that accurately represents the reliability of the first motion vector, and second reliability information that accurately represents the reliability of the second motion vector, can be generated.
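A difference-based operation of this kind might look as follows; the mapping from the mean absolute difference to a [0, 1] score, and the `scale` parameter, are illustrative choices rather than anything the text prescribes. The same routine serves for image rows and for depth rows.

```python
def reliability_from_candidates(cand_a, cand_b, scale=100.0):
    """Map the mean absolute difference between two candidate predictions
    of the common frame to a reliability score in [0, 1]: identical
    candidates give 1.0; large disagreement drives the score toward 0."""
    mad = sum(abs(a - b) for a, b in zip(cand_a, cand_b)) / len(cand_a)
    return max(0.0, 1.0 - mad / scale)

# Candidates that agree -> fully reliable motion vector.
r_good = reliability_from_candidates([0, 50, 50, 0], [0, 50, 50, 0])
# Candidates that disagree everywhere -> much lower reliability.
r_bad = reliability_from_candidates([0, 50, 50, 0], [50, 0, 0, 50])
```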
(11-4)
The motion vector generation unit may be configured to generate, as the third motion vector, whichever of the first and second motion vectors has the higher reliability, based on the first reliability information generated by the first reliability generation unit and the second reliability information generated by the second reliability generation unit.
In this case, a third motion vector that accurately represents the motion of an object in the three-dimensional video can easily be generated from the first and second motion vectors.
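A minimal sketch of this selection rule, with motion vectors represented as hypothetical (dy, dx) tuples and reliabilities as scores in [0, 1]; the tie-breaking choice is an assumption, not something the text specifies:

```python
def select_motion_vector(mv1, r1, mv2, r2):
    """Adopt whichever motion vector carries the higher reliability;
    a tie falls back to the image-based vector mv1 (illustrative choice)."""
    return mv1 if r1 >= r2 else mv2

mv_image = (0, 4)  # first motion vector, from the 2-D image data
mv_depth = (0, 2)  # second motion vector, from the depth data
third = select_motion_vector(mv_image, 0.9, mv_depth, 0.4)
```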
(11-5)
The motion vector generation unit may be configured to generate the third motion vector by combining the first and second motion vectors based on the first reliability information generated by the first reliability generation unit and the second reliability information generated by the second reliability generation unit.
In this case, a third motion vector that more accurately represents the motion of an object in the three-dimensional video can be generated from the first and second motion vectors.
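One plausible combination, sketched below, is a reliability-weighted average of the two vectors; the text does not prescribe a specific weighting, so the formula here is an assumption for illustration.

```python
def blend_motion_vectors(mv1, r1, mv2, r2):
    """Combine two (dy, dx) motion vectors, weighting each component by
    the reliability of the vector it comes from."""
    total = r1 + r2
    if total == 0:
        return (0.0, 0.0)  # neither vector is trustworthy
    return tuple((r1 * a + r2 * b) / total for a, b in zip(mv1, mv2))

# Pulled toward the more reliable image-based vector.
third = blend_motion_vectors((0, 4), 0.75, (0, 2), 0.25)
```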
(11-6)
The video processing apparatus may further include a first signal conversion unit configured to generate, from the first image data of the plurality of frames of the first video signal, the first image data of the frame to be interpolated generated by the frame interpolation unit, the depth data of the plurality of frames of the depth signal, and the depth data of the frame to be interpolated generated by the frame interpolation unit, second image data of the plurality of frames and of the frame to be interpolated for displaying the other of the left-eye and right-eye two-dimensional videos; to generate a second video signal including the first image data of the plurality of frames of the first video signal and the first image data of the frame to be interpolated generated by the frame interpolation unit; and to generate a third video signal including the generated second image data of the plurality of frames and of the frame to be interpolated.
In this case, the left-eye and right-eye two-dimensional videos are displayed using the second and third video signals. As a result, the frame-rate-converted three-dimensional video can be presented to the user.
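Generating the other-eye image from one eye's image and its depth data is commonly done by disparity shifting (depth-image-based rendering). The sketch below is a toy 1-D version with a hypothetical linear depth-to-disparity mapping; practical renderers must also handle occlusion ordering in 2-D and fill the disocclusion holes.

```python
def render_other_view(row, depth_row, gain=0.1):
    """Synthesize the other-eye row by shifting each pixel horizontally
    by a disparity proportional to its depth (toy linear model).
    Far pixels are painted first so nearer pixels overwrite them;
    positions left uncovered remain 0 (disocclusion holes)."""
    out = [0] * len(row)
    for x in sorted(range(len(row)), key=lambda i: depth_row[i]):
        nx = x + int(round(gain * depth_row[x]))
        if 0 <= nx < len(row):
            out[nx] = row[x]
    return out

row       = [10, 20, 30, 0, 0]
depth_row = [ 0,  0, 10, 0, 0]  # only the pixel of value 30 is "near"
# The near pixel shifts one position right; its old position becomes a hole.
other = render_other_view(row, depth_row)
```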
(11-7)
The video processing apparatus may further include a second signal conversion unit configured to generate the depth signal from the first video signal and a fourth video signal including a plurality of frames of second image data for displaying the other of the left-eye and right-eye two-dimensional videos.
In this case, the first image data and the depth data of the frame to be interpolated can be generated with high accuracy from the first and fourth video signals.
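Deriving the depth signal from a stereo pair amounts to estimating, at each position, the horizontal disparity between the left and right images, which for a stereo camera pair is inversely related to scene depth. The 1-D block-matching sketch below uses illustrative parameters; practical stereo matchers use 2-D windows, regularization, and subpixel refinement.

```python
def disparity_1d(left, right, block=2, max_d=3):
    """For each position in the left row, find the leftward shift into the
    right row that minimizes the sum of absolute differences (SAD) over a
    small block; the resulting disparity map can be converted to depth."""
    disps = []
    for x in range(len(left) - block + 1):
        best_d, best_sad = 0, float("inf")
        for d in range(max_d + 1):
            if x - d < 0:
                continue  # shifted block would fall outside the right row
            sad = sum(abs(left[x + i] - right[x - d + i]) for i in range(block))
            if sad < best_sad:
                best_d, best_sad = d, sad
        disps.append(best_d)
    return disps

left  = [0, 0, 80, 80, 0, 0]
right = [80, 80, 0, 0, 0, 0]
# The block at left columns 2-3 matches right columns 0-1 (disparity 2).
d = disparity_1d(left, right)
```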
(11-8)
A video processing method according to an embodiment of the present invention performs processing for frame rate conversion of a three-dimensional video composed of a plurality of frames. The method includes the steps of: detecting a first motion vector from a first video signal including a plurality of frames of first image data for displaying one of the left-eye and right-eye two-dimensional videos; generating first reliability information representing the reliability of the detected first motion vector; detecting a second motion vector from a depth signal including a plurality of frames of depth data representing the presentation position of the three-dimensional video in the depth direction; generating second reliability information representing the reliability of the detected second motion vector; generating a third motion vector from the detected first and second motion vectors based on the generated first and second reliability information; and generating, using the generated third motion vector, first image data and depth data of a frame to be interpolated from the first image data and depth data of at least one frame of the three-dimensional video.
In this video processing method, a first motion vector is detected from a first video signal including a plurality of frames of first image data for displaying one of the left-eye and right-eye two-dimensional videos, and first reliability information representing the reliability of the detected first motion vector is generated. A second motion vector is detected from a depth signal including a plurality of frames of depth data representing the presentation position of the three-dimensional video in the depth direction, and second reliability information representing the reliability of the detected second motion vector is generated.
Based on the generated first and second reliability information, a third motion vector is generated from the first and second motion vectors. Using the generated third motion vector, the first image data and depth data of the frame to be interpolated are generated from the first image data and depth data of at least one frame of the three-dimensional video.
As with the apparatus described above, the third motion vector is generated using the first motion vector, detected from the first image data for displaying the two-dimensional video, and the second motion vector, detected from the depth data representing the presentation position of the three-dimensional video in the depth direction. The third motion vector therefore accurately represents the motion of an object in the three-dimensional video, and the first image data and depth data of the frame to be interpolated can be generated with high accuracy. As a result, frame rate conversion of the three-dimensional video can be performed without degrading its quality.
The present invention can be effectively used in a video processing apparatus that performs processing for frame rate conversion of three-dimensional video.

Claims (8)

  1. A video processing apparatus that performs processing for frame rate conversion of a three-dimensional video composed of a plurality of frames, comprising:
     a first motion vector detection unit configured to detect a first motion vector from a first video signal including the plurality of frames of first image data for displaying one of left-eye and right-eye two-dimensional videos;
     a first reliability generation unit configured to generate first reliability information representing the reliability of the first motion vector detected by the first motion vector detection unit;
     a second motion vector detection unit configured to detect a second motion vector from a depth signal including the plurality of frames of depth data representing a presentation position of the three-dimensional video in a depth direction;
     a second reliability generation unit configured to generate second reliability information representing the reliability of the second motion vector detected by the second motion vector detection unit;
     a motion vector generation unit configured to generate a third motion vector from the first motion vector detected by the first motion vector detection unit and the second motion vector detected by the second motion vector detection unit, based on the first reliability information generated by the first reliability generation unit and the second reliability information generated by the second reliability generation unit; and
     a frame interpolation unit configured to generate, using the third motion vector generated by the motion vector generation unit, first image data and depth data of a frame to be interpolated from first image data and depth data of at least one frame of the three-dimensional video.
  2. The video processing apparatus according to claim 1, wherein
     the first reliability generation unit is configured to generate, using the first motion vector detected by the first motion vector detection unit, a plurality of first image data of a common frame to be interpolated from first image data of different frames, and to generate the first reliability information based on the generated plurality of first image data of the common frame, and
     the second reliability generation unit is configured to generate, using the second motion vector detected by the second motion vector detection unit, a plurality of depth data of the common frame to be interpolated from depth data of different frames, and to generate the second reliability information based on the generated plurality of depth data of the common frame.
  3. The video processing apparatus according to claim 1, wherein
     the first reliability generation unit is configured to generate the first reliability information by an operation including a difference or ratio between the generated plurality of first image data of the common frame, and
     the second reliability generation unit is configured to generate the second reliability information by an operation including a difference or ratio between the generated plurality of depth data of the common frame.
  4. The video processing apparatus according to claim 1, wherein
     the motion vector generation unit is configured to generate, as the third motion vector, whichever of the first and second motion vectors has the higher reliability, based on the first reliability information generated by the first reliability generation unit and the second reliability information generated by the second reliability generation unit.
  5. The video processing apparatus according to claim 1, wherein
     the motion vector generation unit is configured to generate the third motion vector by combining the first and second motion vectors based on the first reliability information generated by the first reliability generation unit and the second reliability information generated by the second reliability generation unit.
  6. The video processing apparatus according to claim 1, further comprising a first signal conversion unit configured to: generate, from the first image data of the plurality of frames of the first video signal, the first image data of the frame to be interpolated generated by the frame interpolation unit, the depth data of the plurality of frames of the depth signal, and the depth data of the frame to be interpolated generated by the frame interpolation unit, second image data of the plurality of frames and of the frame to be interpolated for displaying the other of the left-eye and right-eye two-dimensional videos; generate a second video signal including the first image data of the plurality of frames of the first video signal and the first image data of the frame to be interpolated generated by the frame interpolation unit; and generate a third video signal including the generated second image data of the plurality of frames and of the frame to be interpolated.
  7. The video processing apparatus according to claim 1, further comprising a second signal conversion unit configured to generate the depth signal from the first video signal and a fourth video signal including the plurality of frames of second image data for displaying the other of the left-eye and right-eye two-dimensional videos.
  8. A video processing method for performing processing for frame rate conversion of a three-dimensional video composed of a plurality of frames, comprising the steps of:
     detecting a first motion vector from a first video signal including the plurality of frames of first image data for displaying one of left-eye and right-eye two-dimensional videos;
     generating first reliability information representing the reliability of the detected first motion vector;
     detecting a second motion vector from a depth signal including the plurality of frames of depth data representing a presentation position of the three-dimensional video in a depth direction;
     generating second reliability information representing the reliability of the detected second motion vector;
     generating a third motion vector from the detected first and second motion vectors based on the generated first and second reliability information; and
     generating, using the generated third motion vector, first image data and depth data of a frame to be interpolated from first image data and depth data of at least one frame of the three-dimensional video.
PCT/JP2011/006403 2011-11-17 2011-11-17 Video processing device and video processing method WO2013072965A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2011/006403 WO2013072965A1 (en) 2011-11-17 2011-11-17 Video processing device and video processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2011/006403 WO2013072965A1 (en) 2011-11-17 2011-11-17 Video processing device and video processing method

Publications (1)

Publication Number Publication Date
WO2013072965A1 (en) 2013-05-23

Family

ID=48429082

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2011/006403 WO2013072965A1 (en) 2011-11-17 2011-11-17 Video processing device and video processing method

Country Status (1)

Country Link
WO (1) WO2013072965A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2021533646A * 2018-08-02 2021-12-02 Facebook Technologies, LLC Systems and methods for extrapolating 2D images using depth information

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07325924A (en) * 1994-06-02 1995-12-12 Canon Inc Compound eye image pickup device
JPH0843055A (en) * 1994-07-29 1996-02-16 Canon Inc Method and apparatus for recognizing shape of three dimensional object
JP2001175863A (en) * 1999-12-21 2001-06-29 Nippon Hoso Kyokai <Nhk> Method and device for multi-viewpoint image interpolation
JP2001359119A (en) * 2000-06-15 2001-12-26 Toshiba Corp Stereoscopic video image generating method
JP2008244686A (en) * 2007-03-27 2008-10-09 Hitachi Ltd Video processing device and video processing method
JP2009003507A (en) * 2007-06-19 2009-01-08 Victor Co Of Japan Ltd Image processing method, image processor, and image processing program
JP2011223493A (en) * 2010-04-14 2011-11-04 Canon Inc Image processing apparatus and image processing method


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHINYA SHIMIZU: "Efficient Multi-view Video Coding using Multi-view Depth Map", THE JOURNAL OF THE INSTITUTE OF IMAGE INFORMATION AND TELEVISION ENGINEERS, vol. 63, no. 4, 1 April 2009 (2009-04-01), pages 524 - 532 *


Similar Documents

Publication Publication Date Title
JP7294757B2 (en) Method and Apparatus for Synchronizing Viewing Angles in Virtual Reality Live Streaming
US8694922B2 (en) Method for displaying a setting menu and corresponding device
US8441521B2 (en) Method and apparatus for determining view of stereoscopic image for stereo synchronization
EP2866201B1 (en) Information processing apparatus and method for controlling the same
JP2010062695A (en) Image processing apparatus, image processing method, and program
JP5227993B2 (en) Parallax image generation apparatus and method thereof
KR20060133764A (en) Intermediate vector interpolation method and 3d display apparatus
JP4892113B2 (en) Image processing method and apparatus
JP2013073598A (en) Image processing device, image processing method, and program
JP5692051B2 (en) Depth estimation data generation apparatus, generation method and generation program, and pseudo stereoscopic image generation apparatus, generation method and generation program
KR100263936B1 (en) Apparatus and method for converting 2-d video sequence to 3-d video using virtual stereoscopic video
JP2012089986A (en) Image processing device and method, and image display device and method
TWI491244B (en) Method and apparatus for adjusting 3d depth of an object, and method and apparatus for detecting 3d depth of an object
JP2011155431A (en) Frame rate conversion device, and video display device
JP2012095048A (en) Image processor, image processing method, image display device, and image display method
WO2013072965A1 (en) Video processing device and video processing method
US9277202B2 (en) Image processing device, image processing method, image display apparatus, and image display method
US20120098942A1 (en) Frame Rate Conversion For Stereoscopic Video
KR101632514B1 (en) Method and apparatus for upsampling depth image
JP2014072809A (en) Image generation apparatus, image generation method, and program for the image generation apparatus
US20120008855A1 (en) Stereoscopic image generation apparatus and method
JP2013098961A (en) Image processing apparatus and method, and picture display apparatus and method
JP5087684B2 (en) Image processing apparatus, image processing method, and image display apparatus
US9269177B2 (en) Method for processing image and apparatus for processing image
JP5574830B2 (en) Image processing apparatus and method, and image display apparatus and method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11875887

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11875887

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP