EP2761874A1 - Frame-compatible full resolution stereoscopic 3d video delivery with symmetric picture resolution and quality - Google Patents

Frame-compatible full resolution stereoscopic 3d video delivery with symmetric picture resolution and quality

Info

Publication number
EP2761874A1
Authority
EP
European Patent Office
Prior art keywords
image frame
multiplexed
image
input
spatial frequency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
EP12769865.2A
Other languages
German (de)
French (fr)
Other versions
EP2761874B1 (en)
Inventor
Tao Chen
Hariharan Ganapathy
Samir N. Hulyalkar
Gopi Lakshminarayanan
Peng Yin
Taoran Lu
Walter J. Husak
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dolby Laboratories Licensing Corp
Original Assignee
Dolby Laboratories Licensing Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dolby Laboratories Licensing Corp
Publication of EP2761874A1
Application granted
Publication of EP2761874B1
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106 Processing image signals
    • H04N13/161 Encoding, multiplexing or demultiplexing different image signal components
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10 Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/194 Transmission of image signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/119 Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding

Definitions

  • the present invention relates generally to image data. More particularly, an example embodiment of the present invention relates to image data for stereoscopic 3D images.
  • Frame-compatible half resolution (FCHR) solutions for 3D content delivery suffer from degraded spatial resolution because the half resolution 3D content contains only half resolution image frames subsampled from full resolution 3D image frames.
  • FCFR: frame-compatible full resolution
  • FIG. 1A illustrates a multi-layer video encoder that maintains high spatial frequency content present in an input video sequence, in accordance with an embodiment of the invention
  • FIG. 1B illustrates a multi-layer video decoder that receives input video signals with high spatial frequency content, in accordance with an embodiment
  • FIG. 1C illustrates a base-layer video decoder, in accordance with an embodiment
  • FIG. 2, FIG. 3, and FIG. 4 illustrate different configurations of demultiplexers, according to some example embodiments
  • FIG. 5 illustrates multiplexing formats, in some example embodiments
  • FIG. 6A and FIG. 6B illustrate interlaced content of a perspective forming image portions in a top-and-bottom and a side-by-side format, in some example embodiments;
  • FIG. 7 illustrates multiplexing formats for carrying interlaced content, in some example embodiments
  • FIG. 8A illustrates a multi-layer video encoder, in accordance with an embodiment of the invention
  • FIG. 8B shows a multi-layer video decoder, in accordance with an embodiment
  • FIG. 9 illustrates a demultiplexer, according to some example embodiments.
  • FIG. 10A and FIG. 10B illustrate process flows, in some example embodiments
  • FIG. 11 illustrates an example hardware platform on which a computer or a computing device as described herein may be implemented, according to an example embodiment of the present invention
  • FIG. 12A and FIG. 12B illustrate an FCFR multi-layer video encoder and an FCFR multi-layer video decoder, in accordance with an embodiment of the invention
  • FIG. 13A and FIG. 13B illustrate example embodiments for reconstructing the full resolution signals
  • FIG. 14 illustrates an encoding process flow for generating an enhancement layer according to an embodiment of the invention.
  • FIG. 15 illustrates a filtering process flow in the decoder RPU to generate a carrier image signal according to an embodiment of the invention.
  • Example embodiments, which relate to 3D video coding, are described herein.
  • numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are not described in exhaustive detail, in order to avoid unnecessarily occluding, obscuring, or obfuscating the present invention.
  • Video data is currently received mostly through network connections, for example, from internet-based content providers.
  • bitrate allocated to a display application such as a 3D display application on a computing device is limited.
  • 3D image content may be delivered as frame compatible 3D image frames (or pictures) with reduced resolutions.
  • 3D image frames may be subsampled from full resolution 3D image frames to reduced resolution 3D image frames; high spatial frequency content in the full resolution 3D image frames may be removed by low-pass filters to prevent aliasing in the subsampled image frames.
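The role of the low-pass filter can be illustrated with a minimal one-dimensional sketch. The 3-tap [1, 2, 1]/4 kernel, the edge-replication padding, and the function names `low_pass` and `subsample` are assumptions chosen for illustration; the text does not specify a particular filter.

```python
def low_pass(signal):
    """Apply an assumed 3-tap [1, 2, 1]/4 filter with edge replication."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        out.append((left + 2 * signal[i] + right) / 4)
    return out


def subsample(signal):
    """Keep every other sample (factor-of-two decimation)."""
    return signal[::2]


# A signal alternating at the highest frequency the sampling grid can carry.
fine_detail = [0, 4, 0, 4, 0, 4, 0, 4]

aliased = subsample(fine_detail)             # decimation without filtering
filtered = subsample(low_pass(fine_detail))  # filtering first, then decimation
```

Without filtering, the alternating pattern collapses to a constant 0, which misrepresents the average level of 2. With filtering, the subsampled values sit at that average away from the edge, at the cost of losing the fine detail.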
  • Embodiments include encoding and providing symmetric high resolution 3D image data to downstream devices.
  • a first multiplexed 3D image frame with reduced resolution in a horizontal direction and full resolution in a vertical direction is provided in one of a base layer and an enhancement layer to a recipient device
  • a second multiplexed 3D image frame with reduced resolution in a vertical direction and full resolution in a horizontal direction is provided in the other of the base layer and the enhancement layer to the recipient device.
  • Left eye (LE) and right eye (RE) image data in the enhancement layer may be combined by the recipient device with LE and RE image data in the base layer to reconstruct symmetric full resolution LE and RE image frames.
  • One or both of the first multiplexed 3D image frame and the second multiplexed 3D image frame may be frame compatible to support reduced resolution (a less than full resolution, e.g., half resolution) 3D video applications.
  • Codecs implementing techniques as described herein may be configured to include inter-layer prediction capabilities to fully exploit statistical redundancy between a multiplexed 3D image frame in the base layer and input image frames.
  • a multiplexed 3D image frame in the enhancement layer may (possibly only) carry residual or differential image data, instead of carrying a large amount of LE and RE image data without exploiting the statistical redundancy in image data of different layers.
  • the residual or differential image data as provided in the enhancement layers enables downstream devices to construct symmetric full resolution LE and RE image frames by adding the residual or differential image data on top of the frame-compatible multiplexed 3D image frame in the base layer.
  • the codecs may be configured to include inter-view prediction capability used as described in ITU-T Recommendation H.264 and ISO/IEC 14496-10.
  • RPU: reference processing unit
  • a RPU may be used to improve efficiency in inter-layer prediction for enhancement layer compression.
  • the multiplexed 3D image frames in both the base layer and the enhancement layer that comprise complementary high spatial frequency content may be transmitted to and/or rendered for viewing on high-end 3D displays.
  • one (e.g., a frame compatible multiplexed 3D image frame) of the multiplexed 3D image frames may be transmitted to and/or rendered for viewing on relatively lower-end 3D displays.
  • data needed for other applications may also be included in one or more enhancement layers.
  • a wide variety of features, as provided by FCFR technologies commercially available from Dolby Laboratories in San Francisco, California, may be supported by the base and enhancement layers as described herein.
  • Techniques as described herein provide solutions to achieving symmetric high resolution and high picture quality while maintaining backwards compatibility to a variety of relatively low-end video players.
  • a display system implementing techniques as described herein is able to achieve better picture quality of reconstructed 3D pictures than other display systems implementing other FCFR schemes.
  • a display system as described herein is able to retain more high frequencies and reproduce sharper pictures with more details than the other display systems.
  • Techniques as described herein may be used to reduce bandwidth or bitrate usage and preserve frame-compatible 3D image data with reduced resolution, which supports various televisions, displays and other image rendering devices.
  • an option for interlaced 3D content may also be implemented under techniques as described herein.
  • This option, for example, may be used to carry 3D broadcast applications such as sports programs.
  • mechanisms as described herein form a part of a media processing system, including but not limited to: a handheld device, game machine, television, laptop computer, netbook computer, tablet computer, cellular radiotelephone, electronic book reader, point of sale terminal, desktop computer, computer workstation, computer kiosk, or various other kinds of terminals and media processing units.
  • FIG. 1A illustrates a multi-layer video encoder (100) that maintains high spatial frequency content present in an input video sequence, in accordance with an embodiment of the invention.
  • FIG. 1B illustrates a multi-layer video decoder (150) that corresponds to the multi-layer video encoder (100) shown in FIG. 1A, in accordance with the example embodiment.
  • the multiple-layer video encoder (100) is configured to encode an input 3D video sequence.
  • the input 3D video sequence consists of a sequence of 3D input images.
  • a 3D input image in the sequence of 3D images comprises full resolution 3D image data that contains high spatial frequency content.
  • full resolution may refer to a spatial resolution maximally supported by the total number of independently settable pixels in an image frame.
  • the full resolution 3D image data in a 3D input image may be initially decoded by the multiple-layer video encoder (100) into an input LE image frame (102-L) and an input RE image frame (102-R) both of which contain high spatial frequency content.
  • one or more filtering and subsampling mechanisms in the multi-layer video encoder (100) generate LE and RE image data filtered in one of the vertical and horizontal directions but unfiltered in the other of the vertical and horizontal directions based on the input LE and RE image frames (102-L and 102-R).
  • a filtering and subsampling mechanism may be configured to filter high spatial frequency content in the horizontal direction from the input LE and RE image frames (102-L and 102-R) and horizontally subsample the LE and RE image frames (102-L and 102-R) as filtered in the horizontal direction into corresponding LE and RE portions.
  • a multiplexer may be configured to combine the LE and RE portions in a 3D multiplexed image frame (108-H) in a side-by-side format.
  • a filtering and subsampling mechanism may be configured to filter high spatial frequency content in the vertical direction from the input LE and RE image frames (102-L and 102-R) and vertically subsample the LE and RE image frames (102-L and 102-R) as filtered in the vertical direction into corresponding LE and RE portions.
  • a multiplexer may be configured to combine the LE and RE portions in a 3D multiplexed image frame (108-V) in a top-and-bottom format.
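The two packing arrangements described above can be sketched as follows. This is a minimal illustration, not the multiplexer of the embodiment: the function names are hypothetical, frames are plain lists of rows, and placing the LE portion in the left or top half is an assumption, since the text does not state which eye occupies which half.

```python
def pack_side_by_side(le_portion, re_portion):
    """Place the LE portion in the left half and the RE portion in the right half."""
    return [list(le_row) + list(re_row) for le_row, re_row in zip(le_portion, re_portion)]


def pack_top_and_bottom(le_portion, re_portion):
    """Place the LE portion in the top half and the RE portion in the bottom half."""
    return [list(row) for row in le_portion] + [list(row) for row in re_portion]


# Horizontally subsampled portions of a 4x4 frame: 4 rows x 2 columns each.
le_h = [[1, 2], [3, 4], [5, 6], [7, 8]]
re_h = [[11, 12], [13, 14], [15, 16], [17, 18]]
frame_108_h = pack_side_by_side(le_h, re_h)

# Vertically subsampled portions of a 4x4 frame: 2 rows x 4 columns each.
le_v = [[1, 2, 3, 4], [5, 6, 7, 8]]
re_v = [[11, 12, 13, 14], [15, 16, 17, 18]]
frame_108_v = pack_top_and_bottom(le_v, re_v)
```

In this example both multiplexed frames have the same 4x4 dimensions as a single full resolution frame, so either can travel through a pipeline built for ordinary frames of that size.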
  • the filtering of the LE and RE image frames (102-L and 102-R) may remove all, or a substantial part, of the high spatial frequency content from the input LE and RE image frames (102-L and 102-R) in one of the horizontal and vertical directions. Filtering may be performed with one or more low-pass filters (LPFs) in the filtering and subsampling mechanisms (e.g., 104-H and 104-V).
  • LPFs: low-pass filters
  • filtering as described herein removes or substantially dampens any spatial frequency content in the input images above a threshold frequency that corresponds to a fraction (e.g., one half or another fraction) of a spatial resolution supported by a multi-layer video decoder (e.g., 150) in one of the horizontal and vertical directions.
  • high spatial frequency content in a spatial direction (horizontal or vertical) may refer to high spatial frequency image details that exist in an input 3D video sequence along the spatial direction. If the high spatial frequency content in the spatial direction were removed, downstream devices would not be able to reproduce high resolution image details in that direction from the filtered image data.
  • a subsampler in a filtering and subsampling mechanism (104-H or 104-V) may be configured to preserve the high spatial frequency content in the direction perpendicular to the direction in which the high spatial frequency content has been filtered/removed.
  • a subsampler in the filtering and subsampling mechanism (104-H) may be configured to subsample (e.g., keep every other column) along the same horizontal direction in which the high spatial frequency content has been removed and to avoid subsampling along the vertical direction.
  • a subsampler in the filtering and subsampling mechanism (104-V) may be configured to subsample (e.g., keep every other row) along the same vertical direction in which the high spatial frequency content has been removed and to avoid subsampling along the horizontal direction.
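A small two-dimensional sketch may help show how each mechanism discards detail in one direction while preserving it in the perpendicular one. The 3-tap [1, 2, 1]/4 kernel, the edge replication, and all function names are assumptions for illustration only; the reference numerals in the docstrings merely indicate which mechanism each function imitates.

```python
def low_pass_rows(frame):
    """Filter each row with an assumed 3-tap [1, 2, 1]/4 kernel (edge replication)."""
    out = []
    for row in frame:
        n = len(row)
        out.append([
            (row[max(i - 1, 0)] + 2 * row[i] + row[min(i + 1, n - 1)]) / 4
            for i in range(n)
        ])
    return out


def transpose(frame):
    """Swap rows and columns."""
    return [list(col) for col in zip(*frame)]


def filter_and_subsample_h(frame):
    """Imitates 104-H: low-pass horizontally, then keep every other column."""
    return [row[::2] for row in low_pass_rows(frame)]


def filter_and_subsample_v(frame):
    """Imitates 104-V: low-pass vertically, then keep every other row."""
    return transpose(filter_and_subsample_h(transpose(frame)))


# Vertical stripes: all of the detail lies along the horizontal direction.
stripes = [[0, 4, 0, 4] for _ in range(4)]

portion_h = filter_and_subsample_h(stripes)  # 4 rows x 2 columns
portion_v = filter_and_subsample_v(stripes)  # 2 rows x 4 columns
```

For this input, `portion_h` has lost the stripe pattern, while `portion_v` still carries it in every remaining row, which is the complementary behavior the two mechanisms are meant to provide.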
  • a multiplexed 3D image frame (one of 108-H and 108-V) comprises both a (e.g., down-sampled) image data portion for the left eye and a (e.g., down-sampled) image data portion for the right eye.
  • the multiplexed 3D image frame may be decoded by a downstream device into a LE image frame and a RE image frame of reduced resolutions (e.g., half resolutions) in one of the horizontal and vertical directions.
  • Such decoded LE and RE image frames of the reduced resolution may be up-sampled to comprise the same number of pixels as a full resolution image frame, but with a fuzzier look than a full resolution image not obtained by an up-sampling operation.
  • a multiplexed 3D image frame (108-H) comprises LE and RE image data portions, each of which comprises a reduced number (e.g., one half, less than one half, or another number fewer than the total) of the total number of pixels in a full resolution image frame, where the LE and RE image data portions comprise high spatial frequency content in the vertical direction
  • the other multiplexed 3D image frame (108-V) comprises complementary LE and RE image data portions, each of which comprises a reduced number (e.g., one half, less than one half, or another number fewer than the total) of the total number of pixels in a full resolution image frame, where the complementary LE and RE image data portions comprise high spatial frequency content in the horizontal direction.
  • LE and RE image data portions may be multiplexed within a multiplexed 3D image frame (e.g., one of 108-H and 108-V) in a side-by-side format, an over-under format, a quincunx format, a checkerboard format, an interleaved format, a combination of the foregoing formats, or another multiplex format.
  • One or more enhancement layers may be used to carry a first multiplexed 3D image frame (e.g., one of 108-H and 108-V) that may be combined with a second multiplexed 3D image frame (e.g., the other of 108-H and 108-V) in a base layer.
  • a multi-layer video decoder (e.g., 150) as described herein may be configured to produce image frames with high spatial resolution content in both vertical and horizontal directions based on the first and second multiplexed 3D image frames (e.g., 108-H and 108-V).
  • the BL encoder (110) generates, based at least in part on the first multiplexed 3D image frame (e.g., 108-H), a base layer video signal to be carried in a base layer frame compatible video stream (BL FC video stream 112-1), while the EL encoder (116) generates, based at least in part on the second multiplexed 3D image frame (e.g., 108-V), an enhancement layer video signal to be carried in an enhancement layer frame compatible video stream (EL FC video stream 112-3).
  • One or both of the BL encoder (110) and the EL encoder (116) may be implemented using one or more of a plurality of codecs, such as H.264/AVC, VP8, VC-1, and/or others.
  • An enhancement layer video signal as described herein may be generated using a hybrid video coding method (e.g., implemented by video codecs, such as VC-1, H.264/AVC, and/or others).
  • the image data in the multiplexed 3D image frame 108-V may be predicted either from neighboring samples in the same image frame (using intra prediction) or from samples from past decoded image frames (inter prediction) that belong to the same layer and are buffered as motion-compensated prediction references within a prediction reference image frame buffer.
  • Inter-layer prediction may also be at least in part based on decoded information from other layers (e.g., the base layer, etc.).
  • the multi-layer video encoder (100) may comprise a reference processing unit (RPU, 114) to perform operations relating to prediction. Prediction as implemented by the reference processing unit (114) may be used to reduce the redundant data and overhead in constructing multiplexed 3D image frames in the multi-layer video decoder (150).
  • the RPU (114) may receive and make use of BL image data and other prediction-related information from the BL Encoder 110, and generate a prediction reference image frame through intra or inter prediction.
  • the EL encoder (116) generates, based at least in part on the second multiplexed 3D image frame (108-V) and the prediction reference image frame, multiplexed 3D image residuals or differences between the prediction reference image frame and the second multiplexed 3D image frame 108-V and stores the image residuals in the enhancement layer video signal to be carried in the EL FC video stream (112-3). Further, based on the prediction and coding process, the RPU (114) may generate coding information which can be transmitted to a decoder as metadata using an RPU stream (112-2).
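The residual mechanism described above can be sketched as a simple round trip. This is a lossless illustration under stated assumptions: a real codec would transform, quantize, and entropy-code the residual, and the prediction values here are hypothetical rather than the output of an actual RPU. The function names are also illustrative.

```python
def compute_residual(target_frame, prediction_frame):
    """Encoder side: difference between the frame to code and its prediction."""
    return [
        [t - p for t, p in zip(t_row, p_row)]
        for t_row, p_row in zip(target_frame, prediction_frame)
    ]


def reconstruct(prediction_frame, residual):
    """Decoder side: add the residual back onto the same prediction."""
    return [
        [p + r for p, r in zip(p_row, r_row)]
        for p_row, r_row in zip(prediction_frame, residual)
    ]


frame_108_v = [[10, 12, 14], [20, 22, 24]]  # frame to be carried in the enhancement layer
prediction = [[9, 12, 15], [20, 21, 24]]    # hypothetical prediction reference frame

residual = compute_residual(frame_108_v, prediction)
decoded = reconstruct(prediction, residual)
```

When the prediction is close to the target, the residual consists mostly of small values, which is what makes it cheaper to carry than the frame itself. The decoder recovers the frame only if it forms the same prediction as the encoder.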
  • FIG. 1B illustrates a multi-layer video decoder (150) that receives input video signals in which high spatial frequency content from an original video sequence (which may be the input video sequence as discussed in connection with FIG. 1A) in two orthogonal directions has been preserved in complementary image data carried in the enhancement layer and in the base layer, respectively, in accordance with an embodiment.
  • the input video signals are received in multiple layers (or multiple bitstreams).
  • “multi-layer” or “multiple layers” may refer to two or more bitstreams that carry input video signals having one or more logical dependency relationships between one another.
  • the multi-layer video decoder (150) is configured to decode one or more input video signals in the BL FC video stream (112-1 of FIG. 1B), EL RPU stream (112-2 of FIG. 1B), and EL FC video stream (112-3 of FIG. 1B) into a sequence of (full resolution) 3D output images.
  • a 3D output image in the sequence of 3D output images as decoded by the multi-layer video decoder (150) comprises high spatial resolution content for both eyes, as high spatial frequency content in the original video sequence that gives rise to the input video signals has been preserved in both horizontal and vertical directions.
  • a BL decoder (152) generates, based at least in part on a BL video signal received from BL FC video stream (112-1 of FIG. 1B), a first multiplexed 3D image frame (158-H), while an EL decoder (156) generates, based at least in part on an EL video signal received from EL FC video stream (112-3 of FIG. 1B), a second multiplexed 3D image frame (158-V).
  • One or both of the BL decoder (152) and the EL decoder (156) may be implemented using one or more of a plurality of codecs, such as H.264/AVC, VP8, VC-1, and/or others.
  • a decoder-side RPU (154) generates, based at least in part on a reference video signal received from EL RPU stream (112-2 of FIG. 1B) and/or BL image data from the BL decoder (152), a prediction reference image frame.
  • EL decoder (156) generates, based at least in part on the EL video signal in EL FC video stream (112-3 of FIG. 1B) and the prediction reference image frame from the RPU (154), the second multiplexed 3D image frame (158-V).
  • the multi-layer video decoder (150) may combine complementary image data received in one or more enhancement layers (e.g., EL RPU stream 112-2 and EL FC video stream 112-3) with image data received in a base layer (e.g., BL FC video stream 112-1) to produce full resolution LE and RE output image frames (e.g., 162-L and 162-R) that comprise high spatial frequency content in both vertical and horizontal directions.
  • a demultiplexer (DeMux, 160) may be configured to de-multiplex the multiplexed 3D image frames (158-H and 158-V) into the LE and RE output image frames (162-L and 162-R) with high spatial frequency content.
  • each of the LE and RE output image frames (162-L and 162-R) is only for one of left and right eyes.
  • a first LE image data portion in the first multiplexed 3D image frame (158-H) may be combined with a second LE image data portion in the second multiplexed 3D image frame (158-V) to form the LE output image (162-L) that comprises high spatial frequency content in both vertical and horizontal directions.
  • a first RE image data portion in the first multiplexed 3D image frame (158-H) may be combined with a second RE image data portion in the second multiplexed 3D image frame (158-V) to form the RE output image (162-R) that comprises high spatial frequency content in both vertical and horizontal directions.
  • the full resolution LE and RE output image frames (162-L and 162-R), both of which comprise high spatial frequency content in both vertical and horizontal directions, may be rendered by a display device (which, for example, may comprise the multi-layer video decoder 150) to present a full resolution output 3D image. Rendering the full resolution LE and RE output image frames may be, but is not limited to being, in a frame-sequential manner. Because high spatial frequency content has been preserved in the video signals as received by the multi-layer video decoder (150), the full resolution output 3D image contains high spatial frequency image details that may exist in an original 3D image (which may be one of the 3D input images of FIG. 1A).
  • FIG. 1C illustrates a base-layer video decoder (150-1) that receives one or more input video signals generated from an original video sequence (which may be the input video sequence as discussed in connection with FIG. 1A), in accordance with an embodiment.
  • the base-layer video decoder (150-1) is configured to decode a BL input video signal as received from a base layer (BL FC video stream 112-1 of FIG. 1C) into a sequence of 3D output images, regardless of whether video signals in other layers may be present or not in physical signals received by the decoder.
  • the base-layer video decoder (150-1) is configured to ignore any video signals present in streams other than the BL FC video stream (112-1).
  • a 3D output image in the sequence of 3D output images as produced by the base layer video decoder (150-1) does not comprise full resolution 3D image data, as high spatial frequency content along one of the vertical and horizontal directions in the original video sequence that gives rise to the input video signals has been filtered/removed in the base layer video signal and cannot be recovered by the base-layer video decoder (150-1).
  • a BL decoder (152 of FIG. 1C) generates, based at least in part on the BL input video signal in BL FC video stream (112-1 of FIG. 1C), a multiplexed 3D image frame (e.g., 158-H of FIG. 1C).
  • the BL decoder (152 of FIG. 1C) may be implemented using one or more of a plurality of codecs, such as H.264/AVC, VP8, VC-1, and/or others.
  • an up-sampling unit (170) de-multiplexes and/or separates the multiplexed 3D image frame (158-H) into two image data portions. While the multiplexed 3D image frame (158-H) comprises multiplexed filtered image data for both left and right eyes, the image data portions comprise a filtered LE image data portion and a filtered RE image data portion, each of which is at a reduced resolution below the full resolution.
  • the up-sampling unit (170) up-samples (e.g., expanding along the horizontal direction) the filtered LE image data portion to form an up-sampled LE filtered output image frame (172-L).
  • the up-sampling unit (170) up-samples (e.g., expanding along the horizontal direction) the filtered RE image data portion to form an up-sampled RE filtered output image frame (172-R).
  • each of the up-sampled LE and RE filtered image frames (172-L and -R) may comprise the same number of pixels as a full resolution image frame
  • the rendered 3D image with the up-sampled LE and RE filtered image frames (172-L and -R) has a fuzzier look than a 3D image made up of full resolution LE and RE image frames (162-L and -R of FIG. 1B) not obtained by an up-sampling operation.
  • the up-sampled LE and RE filtered image frames (172-L and -R) do not contain the high spatial frequency image details that were removed in the encoding process of the BL video signals (which may be derived from, for example, 112-1 of FIG. 1A).
  • the up-sampled LE and RE filtered image frames (172-L and -R) below the full resolution may be rendered by a display device (which, for example, may comprise the base-layer video decoder 150-1) to present an output 3D image. Rendering the up-sampled LE and RE filtered image frames (172-L and -R) may be, but is not limited to being, in a frame-sequential manner.
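The base-layer path of the up-sampling unit (170) can be sketched as a split followed by a horizontal expansion. Two assumptions are made here for illustration: the LE portion occupies the left half of the side-by-side frame, and up-sampling is done by pixel repetition, whereas a practical decoder would more likely use an interpolation filter. The function names are hypothetical.

```python
def split_side_by_side(frame):
    """Separate a side-by-side frame into its left (LE) and right (RE) halves."""
    half = len(frame[0]) // 2
    le_portion = [row[:half] for row in frame]
    re_portion = [row[half:] for row in frame]
    return le_portion, re_portion


def upsample_horizontal(portion):
    """Double the width by repeating each sample (assumed nearest-neighbor method)."""
    return [[value for value in row for _ in range(2)] for row in portion]


frame_158_h = [[1, 2, 11, 12],
               [3, 4, 13, 14]]

le_portion, re_portion = split_side_by_side(frame_158_h)
frame_172_l = upsample_horizontal(le_portion)
frame_172_r = upsample_horizontal(re_portion)
```

Each output frame regains the full pixel count, but no new horizontal detail is created, which is why the result looks softer than a frame reconstructed with the enhancement layer.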
  • FIG. 2, FIG. 3, and FIG. 4 illustrate different configurations of demultiplexer 160 of FIG. 1B, according to some example embodiments.
  • Each of the demultiplexers (160-1 through 160-5) may be configured to accept a first LE portion (202-L) decoded/derived from a first multiplexed 3D image frame (e.g., 158-H) and a second LE portion (204-L) decoded/derived from a second multiplexed 3D image frame (e.g., 158-V).
  • the first LE portion (202-L) may comprise high spatial frequency content in one (for example, vertical direction) of the vertical and horizontal directions, while the second LE portion (204-L) may comprise high spatial frequency content in the other (horizontal direction in the same example) of the vertical and horizontal directions.
  • Each of the demultiplexers (160-1 through 160-5) may be configured to process and combine the first LE portion (202-L) and the second LE portion (204-L) to generate a full resolution LE image frame (162-L) that comprises high spatial frequency content in both vertical and horizontal directions.
  • a demultiplexer similar to each of the demultiplexers (160-1 through 160-5) may be configured to accept a first RE portion decoded/derived from the first multiplexed 3D image frame (158-H) and a second RE portion decoded/derived from the second multiplexed 3D image frame (158-V).
  • the first RE portion may comprise high spatial frequency content in the vertical direction
  • the second RE portion may comprise high spatial frequency content in the horizontal direction.
  • the demultiplexer may be configured to process and combine the first RE portion and the second RE portion to generate a full resolution RE image frame (e.g., 162-R of FIG. 1B) that comprises high spatial frequency content in both vertical and horizontal directions.
  • an up-sampler (206-H) in the demultiplexer (160-1) may be configured to up-sample the first LE portion (202-L) in the horizontal direction to create a first LE image frame (208-H).
  • an up-sampler (206-V) in the demultiplexer (160-1) may be configured to up-sample the second LE portion (204-L) in the vertical direction to create a second LE image frame (208-V).
  • a first low pass filter (210-1) and a first high pass filter (210-2) may be applied to the first LE image frame (208-H) to yield a first low pass LE image frame (212-1) and a first high pass LE image frame (212-2).
  • a second low pass filter (214-1) and a second high pass filter (214-2) may be applied to the second LE image frame (208-V) to yield a second low pass LE image frame (216-1) and a second high pass LE image frame (216-2).
  • An averaging unit (218) in the demultiplexer (160-1) may be configured to accept the first low pass LE image frame (212-1) and the second low pass LE image frame (216-1) as input and to apply an averaging operation on the first low pass LE image frame (212-1) and the second low pass LE image frame (216-1) to generate a low pass averaged LE image frame.
  • An adder (220) in the demultiplexer (160-1) may be configured to accept the first high pass LE image frame (212-2), the second high pass LE image frame (216-2) and the low pass averaged LE image frame as input and to apply an adding operation on the first high pass LE image frame (212-2), the second high pass LE image frame (216-2) and the low pass averaged LE image frame to generate the full resolution LE image frame (162-L) that comprises high spatial frequency content in both vertical and horizontal directions.
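The FIG. 2 data flow described above (up-sample each portion, split each result into low pass and high pass parts, average the low pass parts, then add both high pass parts) may be sketched for a single eye as follows. This is a minimal illustration under stated assumptions: the nearest-neighbour up-samplers, the 3x3 box low pass filter, the complementary high pass filter, and all function names are hypothetical choices, since the source does not specify filter kernels.

```python
# Illustrative sketch only: nearest-neighbour up-sampling and a 3x3 box
# filter are assumptions, not details taken from the source.

def upsample_h(img):
    """Double the width by repeating each sample (stand-in for 206-H)."""
    return [[v for v in row for _ in range(2)] for row in img]

def upsample_v(img):
    """Double the height by repeating each row (stand-in for 206-V)."""
    return [list(row) for row in img for _ in range(2)]

def low_pass(img):
    """3x3 box filter with edge clamping (stand-in for 210-1 and 214-1)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            row.append(total / 9.0)
        out.append(row)
    return out

def high_pass(img):
    """Complementary high pass: input minus its low pass (210-2 and 214-2)."""
    lp = low_pass(img)
    return [[a - b for a, b in zip(r, l)] for r, l in zip(img, lp)]

def demux_fig2(first_portion, second_portion):
    frame_h = upsample_h(first_portion)               # 208-H
    frame_v = upsample_v(second_portion)              # 208-V
    lp1, hp1 = low_pass(frame_h), high_pass(frame_h)  # 212-1, 212-2
    lp2, hp2 = low_pass(frame_v), high_pass(frame_v)  # 216-1, 216-2
    height, width = len(frame_h), len(frame_h[0])
    # Averaging unit (218) on the low pass frames, then adder (220).
    return [
        [(lp1[y][x] + lp2[y][x]) / 2.0 + hp1[y][x] + hp2[y][x]
         for x in range(width)]
        for y in range(height)
    ]

first = [[10.0, 10.0] for _ in range(4)]   # 4 rows x 2 columns (side-by-side half)
second = [[10.0] * 4 for _ in range(2)]    # 2 rows x 4 columns (top-and-bottom half)
full = demux_fig2(first, second)
```

With flat input portions the high pass frames are zero, so the sketch returns a flat 4x4 frame; this only checks the plumbing, not reconstruction quality.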
  • low pass filtering has been applied by a multi-layer video encoder (e.g., 100 of FIG. 1A), for example, for anti-aliasing purposes.
  • the demultiplexer (160-1) of FIG. 2 may be simplified according to some embodiments as illustrated in FIG. 3A and FIG. 3B. In these embodiments, decoding complexity may be reduced by eliminating low pass filters (e.g., 210-1 and 214-1) and an averaging unit (218) that may be used in the demultiplexer (160-1) of FIG. 2.
  • an up-sampler (206-H) in the demultiplexer (160-2) may be configured to up-sample the first LE portion (202-L) in the horizontal direction to create an LE image frame (208-H) that comprises high spatial frequency content in the vertical direction.
  • a high pass filter (214-2) that preserves high spatial frequency content in the horizontal direction may be applied to the second LE portion (204-L) to yield a high pass LE portion.
  • An up-sampler 206-V in the demultiplexer (160-2) may be configured to up-sample the high pass LE portion in the vertical direction to create a high pass LE image frame (216-2) that comprises high spatial frequency content in the horizontal direction.
  • An adder (222) in the demultiplexer (160-2) may be configured to accept the high pass LE image frame (216-2) and the LE image frame (208-H) as input and to apply an adding operation on the high pass LE image frame (216-2) and the LE image frame (208-H) to generate the full resolution LE image frame (162-L) that comprises high spatial frequency content in both vertical and horizontal directions.
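The simplified FIG. 3A path above may be sketched as follows for one eye. The row-wise 3-tap high pass filter and the nearest-neighbour up-samplers are assumptions made for illustration, and the function names are hypothetical; the source only states which direction each stage operates in.

```python
# Illustrative sketch of the FIG. 3A data flow; kernels are assumptions.

def upsample_h(img):
    """Double the width by repeating each sample (stand-in for 206-H)."""
    return [[v for v in row for _ in range(2)] for row in img]

def upsample_v(img):
    """Double the height by repeating each row (stand-in for 206-V)."""
    return [list(row) for row in img for _ in range(2)]

def high_pass_h(img):
    """Row-wise high pass: sample minus its 3-tap horizontal mean (214-2)."""
    out = []
    for row in img:
        w = len(row)
        out.append([
            row[x] - (row[max(x - 1, 0)] + row[x] + row[min(x + 1, w - 1)]) / 3.0
            for x in range(w)
        ])
    return out

def demux_fig3a(first_portion, second_portion):
    frame_h = upsample_h(first_portion)                  # 208-H
    hp_frame = upsample_v(high_pass_h(second_portion))   # 216-2
    # Adder (222): base frame plus horizontal detail.
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(frame_h, hp_frame)]

first = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]   # 4 rows x 2 columns
second = [[5.0] * 4 for _ in range(2)]                     # 2 rows x 4 columns, flat
frame = demux_fig3a(first, second)
```

Because the second portion is flat here, its high pass part is zero and the result equals the horizontally up-sampled first portion. The FIG. 3B variant swaps the roles of the two portions and the two directions.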
  • a demultiplexer (160-3) as illustrated in FIG. 3B may be used instead of using a demultiplexer (160-2) as illustrated in FIG. 3A to derive the full resolution LE image frame (e.g., 162-L).
  • an up-sampler (206-V) in the demultiplexer (160-3) may be configured to up-sample the second LE portion (204-L) in the vertical direction to create an LE image frame (208-V) that comprises high spatial frequency content in the horizontal direction.
  • a high pass filter (210-2) that preserves high spatial frequency content in the vertical direction may be applied to the first LE portion (202-L) to yield a high pass LE portion.
  • An up-sampler 206-H in the demultiplexer (160-3) may be configured to up-sample the high pass LE portion in the horizontal direction to create a high pass LE image frame (212-2) that comprises high spatial frequency content in the vertical direction.
  • An adder (222) in the demultiplexer (160-3) may be configured to accept the high pass LE image frame (212-2) and the LE image frame (208-V) as input and to apply an adding operation on the high pass LE image frame (212-2) and the LE image frame (208-V) to generate the full resolution LE image frame (162-L) that comprises high spatial frequency content in both vertical and horizontal directions.
  • a subtraction operation as illustrated in FIG. 4A may be used instead of using a high pass filter (e.g., 214-2 of FIG. 3A or 210-2 of FIG. 3B) in a demultiplexer (160-2 or 160-3) as illustrated in FIG. 3A and FIG. 3B to derive full resolution LE image frame (e.g., 162-L).
  • a subtraction operation such as 402 may require less computational complexity.
  • the first LE portion (202-L) may be decoded from a BL video signal, while the second LE portion (204-L) may be decoded from an EL video signal.
  • a reference LE portion (404), which may be of the same spatial dimensions as the second LE portion (204-L), may be generated based on the first LE portion (202-L) by an RPU (406) (which may be, for example, 154 of FIG. 1B) during an inter-layer prediction process.
  • a processing path or sub-path comprising a low pass filter (408) in the vertical direction (which removes high spatial frequency content in the vertical direction), a vertical subsampler (410; which may keep, for example, every other row), and a horizontal up-sampler (412) may be used to generate the reference LE portion 404.
  • the subtraction operation (402) may be configured to accept, as input, the second LE portion (204-L) and the reference LE portion (404), from which high spatial frequency content in the vertical direction has been removed in addition to the high spatial frequency content in the horizontal direction already removed by an upstream multi-layer video encoder (e.g., 100 of FIG. 1A), and to subtract the reference LE portion (404) from the second LE portion (204-L) to generate a high pass LE portion that comprises high spatial frequency content in the horizontal direction only.
  • the high pass LE portion as described herein may be equivalent to the high pass LE portion generated by the high pass filter (214-2) of FIG. 3A.
  • the demultiplexer (160-4) of FIG. 4A may comprise an up-sampler (206-V) that may be configured to up-sample the high pass LE portion in the vertical direction to create a high pass LE image frame (216-2) that comprises high spatial frequency content in the horizontal direction.
  • An adder (222) in the demultiplexer (160-4) may be configured to accept the high pass LE image frame (216-2) and a LE image frame (208-H) (generated from the first LE portion 202-L by a horizontal up-sampler 206-H) as input and to apply an adding operation on the high pass LE image frame (216-2) and the LE image frame (208-H) to generate the full resolution LE image frame (162-L) that comprises high spatial frequency content in both vertical and horizontal directions.
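The subtraction-based FIG. 4A path above may be sketched as follows for one eye. The 3-tap vertical mean used for the low pass filter (408), the keep-every-other-row subsampler (410), the nearest-neighbour up-samplers, and all function names are assumptions for illustration; the source names the stages but not their kernels.

```python
# Illustrative sketch of the FIG. 4A data flow; kernels are assumptions.

def upsample_h(img):
    """Double the width by repeating each sample (412 and 206-H stand-in)."""
    return [[v for v in row for _ in range(2)] for row in img]

def upsample_v(img):
    """Double the height by repeating each row (206-V stand-in)."""
    return [list(row) for row in img for _ in range(2)]

def low_pass_v(img):
    """Column-wise 3-tap mean with edge clamping (stand-in for 408)."""
    h = len(img)
    return [
        [(img[max(y - 1, 0)][x] + img[y][x] + img[min(y + 1, h - 1)][x]) / 3.0
         for x in range(len(img[0]))]
        for y in range(h)
    ]

def make_reference(first_portion):
    """RPU path: vertical low pass, keep every other row, widen (404)."""
    filtered = low_pass_v(first_portion)
    subsampled = filtered[0::2]          # 410
    return upsample_h(subsampled)        # 412

def demux_fig4a(first_portion, second_portion):
    reference = make_reference(first_portion)
    # Subtraction (402): second LE portion minus reference LE portion.
    high_pass_portion = [
        [s - r for s, r in zip(srow, rrow)]
        for srow, rrow in zip(second_portion, reference)
    ]
    hp_frame = upsample_v(high_pass_portion)     # 216-2
    frame_h = upsample_h(first_portion)          # 208-H
    # Adder (222).
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(frame_h, hp_frame)]

first = [[4.0, 4.0] for _ in range(4)]                 # 4 rows x 2 columns
second = [[4.0, 6.0, 4.0, 6.0] for _ in range(2)]      # 2 rows x 4 columns
out = demux_fig4a(first, second)
```

In this toy input, the horizontal detail carried only by the second portion (the alternating 4, 6 pattern) survives the subtraction and reappears in every row of the 4x4 output.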
  • a reference LE portion 424 which may be of the same spatial dimensions of the first LE portion (202-L), may be generated based on the second LE portion
  • RPU 426 which may be, for example, 154 of FIG. IB
  • a processing path or sub-path comprising a low pass filter (428) in the horizontal direction (which removes high spatial frequency content in the horizontal direction), a horizontal subsampler (430; which may keep, for example, every other column), and a vertical up-sampler (432) may be used to generate the reference LE portion 424.
  • a subtraction operation (422) in the demultiplexer (160-5) may be configured to accept, as input, the first LE portion (202-L) and the reference LE portion (424), from which high spatial frequency content in the horizontal direction has been removed in addition to the high spatial frequency content in the vertical direction already removed by an upstream multi-layer video encoder (e.g., 100 of FIG. 1A), and to subtract the reference LE portion (424) from the first LE portion (202-L) to generate a high pass LE portion that comprises high spatial frequency content in the vertical direction only.
  • the high pass LE portion as described herein may be equivalent to the high pass LE portion generated by the high pass filter (210-2) of FIG. 3B.
  • the demultiplexer (160-5) of FIG. 4B may comprise an up-sampler 206-H that may be configured to up-sample the high pass LE portion in the horizontal direction to create a high pass LE image frame (212-2) that comprises high spatial frequency content in the vertical direction.
  • An adder (222) in the demultiplexer (160-5) may be configured to accept the high pass LE image frame (212-2) and an LE image frame (208-V) (generated from the second LE portion 204-L by a vertical up-sampler 206-V) as input and to apply an adding operation on the high pass LE image frame (212-2) and the LE image frame (208-V) to generate the full resolution LE image frame (162-L) that comprises high spatial frequency content in both vertical and horizontal directions.
  • FIG. 5 illustrates multiplexing formats, in some example embodiments.
  • an LE image frame (e.g., 102-L or an LE image derived therefrom)
  • an RE image frame (e.g., 102-R or an RE image derived therefrom)
  • LE and RE pixel values for a plurality of pixels (a_i, b_i, c_i, d_i, etc., wherein i may be a positive integer) in a 3D image frame.
  • a side-by-side multiplexed image frame such as the multiplexed image frame 108-H of FIG.
  • a top-and-bottom multiplexed image frame such as the multiplexed image frame 108-V of FIG. 1A may use any of a plurality of top-and-bottom multiplexing formats (108-V-1, 108-V-2, 108-V-3, or other top-and-bottom multiplexing formats) to host image data vertically subsampled from the LE and RE image frames (102-L and 102-R).
  • a multiplexer such as 106-H or 106-V of FIG. 1 may make one or more selections of multiplexing formats from the pluralities of multiplexing formats based on one or more factors (e.g., related to subsampling methods adopted by the multiplexer) and may signal the selections of multiplexing formats to a multi-layer video decoder (e.g., 150 of FIG. 1B) as metadata using, for example, an RPU stream 112-2.
  • An RPU unit (e.g., 406 of FIG. 4A or 426 of FIG. 4B) in the multi-layer video decoder (150) may construct an inter-layer reference frame or image portion (e.g., 404 of FIG. 4A or 424 of FIG.
  • FIG. 6A illustrates interlaced content (602) of the same perspective (either left eye or right eye) forming an image portion (606) in a top-and-bottom format, in some example embodiments.
  • the interlaced content (602) may be first demultiplexed into a (e.g., 1080i) top field (604-T) for a first time equal to t and a (e.g., 1080i) bottom field (604-B) for a second time equal to t + 1.
  • each of the top field (604-T) and the bottom field (604-B) may be vertically filtered and vertically subsampled to a first half field (608-1) (e.g., with one half, or with less than one half of the full spatial resolution or with another lower than the full spatial resolution, in the vertical direction) and a second half field (608-2) (e.g., with one half, or with less than one half of the full spatial resolution or with another lower than the full spatial resolution, in the vertical direction).
  • the first half field (608-1) and the second half field (608-2) may be interleaved into a top or bottom field (606) in a first interlaced image frame.
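The FIG. 6A steps above (split the interlaced content into fields, vertically filter and subsample each field, then interleave the two half fields) may be sketched as follows. The 2-tap row averaging used as the vertical filter, the exact factor of two, and the function names are assumptions for illustration; the source allows other reduced resolutions.

```python
# Illustrative sketch of the FIG. 6A field handling; the filter is an assumption.

def split_fields(frame):
    """Even rows form the top field (604-T), odd rows the bottom field (604-B)."""
    return frame[0::2], frame[1::2]

def filter_and_subsample_v(field):
    """Average each pair of adjacent rows, halving the field height."""
    return [
        [(a + b) / 2.0 for a, b in zip(field[i], field[i + 1])]
        for i in range(0, len(field) - 1, 2)
    ]

def interleave_rows(first, second):
    """Alternate rows from the two half fields (608-1 and 608-2)."""
    out = []
    for a, b in zip(first, second):
        out.append(a)
        out.append(b)
    return out

# Toy interlaced content: 8 rows x 2 columns, each row filled with its index.
content = [[float(r)] * 2 for r in range(8)]
top, bottom = split_fields(content)
half_1 = filter_and_subsample_v(top)       # 608-1
half_2 = filter_and_subsample_v(bottom)    # 608-2
portion = interleave_rows(half_1, half_2)  # 606
```

The resulting portion has half the height of the original content, and its rows alternate between samples taken at time t and at time t + 1, so field parity is preserved.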
  • FIG. 6B illustrates interlaced content (602) of the same perspective (either left eye or right eye) forming an image portion (626) in a side-by-side format, in some example embodiments.
  • the interlaced content (602) is not required to be demultiplexed into two separate fields before horizontal filtering and horizontal subsampling and then interlaced into the image portion (626). Instead, the interlaced content (602) may be directly horizontally filtered and horizontally subsampled into the image portion (626).
  • the operations illustrated in FIG. 6B may be performed for each of the LE and RE perspectives, and may constitute one half of the filtering and sampling mechanism (104-H).
  • Operations described in FIG. 6A and FIG. 6B may be applied to each of the left eye and right eye perspectives, thereby forming the first interleaved image frame in a top-and-bottom format (as illustrated in part in FIG. 6A) and the second interleaved image frame in a side-by-side format (as illustrated in part in FIG. 6B).
  • One of the first interleaved frame and the second interleaved frame may be carried in the BL video signal, while the other (either differentially or non-differentially encoded) interleaved frame may be carried in the EL video signal.
  • one (e.g., the first interleaved image frame in the present example) of the first interleaved image frame and the second interleaved image frame carries high spatial frequency content in the horizontal direction
  • the other (e.g., the second interleaved image frame in the present example) of the first interleaved image frame and the second interleaved image frame carries high spatial frequency content in the vertical direction.
  • FIG. 7 illustrates multiplexing formats for carrying interlaced content, in some example embodiments.
  • low pass filtering and subsampling in the horizontal or vertical direction may be field based for interlaced content.
  • a left image field (708-L) and a right image field (708-R) may correspond to LE and RE perspectives of the interlaced content for a time equal to t, respectively.
  • Pixel values of one of the left image field (708-L) and the right image field (708-R) may be used to populate a left or right side of any of a plurality of multiplexing (or subsampling) formats (708-H-1, 708-H-2, 708-H-3, etc.) as shown in FIG.
  • pixel values of one of the left image field (708-L) and the right image field (708-R) may be used to populate a top or bottom of any of a plurality of multiplexing (or subsampling) formats (708-V-1, 708-V-2, 708-V-3, etc.) as shown in FIG. 7 when vertical filtering and vertical subsampling are applied as illustrated in FIG. 6A.
  • subsampled image data from all image fields, of both LE and RE perspectives and corresponding to both the first time equal to t and the second time equal to t+1, is present in a top-and-bottom format applied to interlaced content.
  • FIG. 8A illustrates a multi-layer video encoder (100-1) that maintains high spatial frequency content present in an input video sequence, in accordance with an embodiment of the invention.
  • FIG. 8B shows a multi-layer video decoder (150-2) corresponding to the multi-layer video encoder (100-1) shown in FIG. 8A, in accordance with the example embodiment.
  • the multiple-layer video encoder (100-1) is configured to encode an input 3D video sequence that consists of a sequence of 3D input images.
  • a 3D input image in the sequence of 3D images comprises full resolution 3D image data that contains high spatial frequency content.
  • the full resolution 3D image data in a 3D input image may be initially decoded by the multiple-layer video encoder (100) into an input LE image frame (102-L) and an input RE image frame (102-R) both of which contain high spatial frequency content.
  • a first filtering and subsampling mechanism in the multi-layer video encoder (100) generates LE and RE image data filtered in one of the vertical or horizontal directions but unfiltered in the other of the vertical or horizontal directions based on the input LE and RE image frames (102-L and 102-R).
  • the first filtering and subsampling mechanism may be 104-H of FIG. 8A configured to filter high spatial frequency content in the horizontal direction from the input LE and RE image frames (102-L and 102-R) and horizontally subsample the LE and RE image frames (102-L and 102-R) as filtered in the horizontal direction into corresponding LE and RE portions.
  • a multiplexer (106-H) may be configured to combine the LE and RE portions in a 3D multiplexed image frame (108-H) in a side-by-side format.
  • LE and RE residual image frames (806-L and 806-R) may be processed by the EL processing sub-path.
  • an RPU (114) may be configured to generate LE and RE reference image portions based on the multiplexed 3D image frame (108-H) as provided by the BL encoder (110).
  • the LE and RE reference image portions from the RPU (114) may be up-sampled in the same direction as the subsampling direction of the BL processing sub-path.
  • each of the LE and RE reference image portions generated by the RPU (114) may be up-sampled in the horizontal direction to form an up-sampled LE image frame (804-L) and an up-sampled RE image frame (804-R).
  • An addition operation (810-L) may be configured to accept, as input, the input LE image frame (102-L) and the complement of the up-sampled LE image frame (804-L), from which high spatial frequency content in the horizontal direction has been removed by the BL processing sub-path, and to add the input LE image frame (102-L) and the complement of the up-sampled LE image frame (804-L) to generate an LE residual image frame (806-L) that comprises high spatial frequency content in the horizontal direction only.
  • An addition operation (810-R) may be configured to accept, as input, the input RE image frame (102-R) and the complement of the up-sampled RE image frame (804-R), from which high spatial frequency content in the horizontal direction has been removed by the BL processing sub-path, and to add the input RE image frame (102-R) and the complement of the up-sampled RE image frame (804-R) to generate an RE residual image frame (806-R) that comprises high spatial frequency content in the horizontal direction only.
  • a second filtering and subsampling mechanism may be 104-V of FIG. 8A configured to vertically subsample the LE and RE residual image frames (806-L and 806-R) into corresponding LE and RE residual portions. Additionally and optionally, the second filtering and subsampling mechanism may comprise a vertical filter configured to vertically filter the LE and RE residual image frames (806-L and 806-R), for example, before the above-mentioned subsampling operation on the LE and RE residual image frames (806-L and 806-R).
  • a multiplexer (106-V) may be configured to combine the LE and RE residual portions in a 3D multiplexed image frame (808-V) in a top-and-bottom format.
  • the BL encoder (110) generates, based at least in part on the first multiplexed 3D image frame (e.g., 108-H), a base layer video signal to be carried in a base layer frame compatible video stream (BL FC video stream 112-1), while the EL encoder (116) generates, based at least in part on the second multiplexed 3D image frame (e.g., 808-V), an enhancement layer video signal to be carried in an enhancement layer frame compatible video stream (EL FC video stream 112-3).
  • One or both of the BL encoder (110) and the EL encoder (116) may be implemented using one or more of a plurality of codecs, such as H.264/AVC, VP8, VC-1, and/or others.
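The FIG. 8A encoder-side data flow above may be sketched as follows, before any actual video coding takes place. The sketch assumes a lossless base layer (so the RPU reference equals the BL portion), a 2-tap horizontal average as the filter in 104-H, nearest-neighbour up-sampling, and keep-every-other-row vertical subsampling without the optional vertical filter; all function names are hypothetical.

```python
# Illustrative sketch of the FIG. 8A residual path; kernels are assumptions
# and the base layer is treated as lossless.

def filter_subsample_h(frame):
    """Average adjacent column pairs, halving the width (stand-in for 104-H)."""
    return [
        [(row[x] + row[x + 1]) / 2.0 for x in range(0, len(row) - 1, 2)]
        for row in frame
    ]

def upsample_h(img):
    """Double the width by repeating each sample (yields 804-L / 804-R)."""
    return [[v for v in row for _ in range(2)] for row in img]

def residual(frame, bl_portion):
    """Input frame minus the up-sampled reference (810-L / 810-R)."""
    up = upsample_h(bl_portion)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(frame, up)]

def subsample_v(frame):
    """Keep every other row (stand-in for 104-V without the optional filter)."""
    return frame[0::2]

def encode_fig8a(le, re):
    bl_le, bl_re = filter_subsample_h(le), filter_subsample_h(re)
    bl_frame = [l + r for l, r in zip(bl_le, bl_re)]    # side-by-side, 108-H
    res_le = subsample_v(residual(le, bl_le))           # from 806-L
    res_re = subsample_v(residual(re, bl_re))           # from 806-R
    el_frame = res_le + res_re                          # top-and-bottom, 808-V
    return bl_frame, el_frame

le = [[1.0, 3.0, 5.0, 7.0] for _ in range(4)]   # LE frame with horizontal detail
re = [[2.0] * 4 for _ in range(4)]              # flat RE frame
bl_frame, el_frame = encode_fig8a(le, re)
```

In this toy case the LE residual holds exactly the horizontal detail removed by the base layer path, while the flat RE frame yields an all-zero residual.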
  • FIG. 8B shows a multi-layer video decoder (150-2) that receives input video signals in which high spatial frequency content from an original video sequence (which may be the input video sequence as discussed in connection with FIG. 8A) in two orthogonal directions has been preserved in complementary image data carried in the enhancement layer and in the base layer, respectively, in accordance with an embodiment.
  • the multi-layer video decoder (150-2) is configured to decode one or more input video signals in the BL FC video stream (112-1 of FIG. 8B), EL RPU stream (112-2 of FIG. 8B), and EL FC video stream (112-3 of FIG. 8B) into a sequence of 3D output images.
  • a 3D output image in the sequence of 3D output images as decoded by the multi-layer video decoder (150-2) comprises high spatial frequency content for both eyes, as high spatial frequency content in the original video sequence that gives rise to the input video signals has been preserved in both the horizontal and vertical directions.
  • a BL decoder (152) generates, based at least in part on a BL video signal received from BL FC video stream (112-1 of FIG. 8B), a first multiplexed 3D image frame (158-H), while an EL decoder (156) generates, based at least in part on an EL video signal received from EL FC video stream (112-3 of FIG. 8B), a second multiplexed 3D image frame (858-V).
  • One or both of the BL decoder (152) and the EL decoder (156) may be implemented using one or more of a plurality of codecs, such as H.264/AVC, VP8, VC-1, and/or others.
  • the EL decoder (156) generates, based at least in part on the EL video signal in EL FC video stream (112-3 of FIG. 8B) without a prediction reference image frame from the RPU (154), the second multiplexed 3D image frame (858-V).
  • the multi-layer video decoder (150-2) may combine residual image data received in one or more enhancement layers (e.g., EL FC video stream 112-3) with image data received in a base layer (e.g., BL FC video stream 112-1) to produce full resolution LE and RE output image frames (e.g., 162-L and 162-R) that comprise high spatial frequency content in both vertical and horizontal directions.
  • a demultiplexer (DeMux, 160) may be configured to de-multiplex the multiplexed 3D image frames (158-H and 858-V) into LE and RE output image frames (162-L and 162-R) with high spatial frequency content.
  • each of the LE and RE output image frames (162-L and 162-R) is only for one of left and right eyes.
  • a first LE image data portion in the first multiplexed 3D image frame (158-H) may be combined with a second LE image data portion in the second multiplexed 3D image frame (858-V) to form the LE output image (162-L) that comprises high spatial frequency content in both vertical and horizontal directions.
  • a first RE image data portion in the first multiplexed 3D image frame (158-H) may be combined with a second RE image data portion in the second multiplexed 3D image frame (858-V) to form the RE output image (162-R) that comprises high spatial frequency content in both vertical and horizontal directions.
  • the full resolution LE and RE output image frames (162-L and 162-R), both of which comprise high spatial frequency content in both vertical and horizontal directions, may be rendered by a display device (which, for example, may comprise the multi-layer video decoder 150) to present a full resolution output 3D image. Rendering the full resolution LE and RE output image frames may be, but is not limited to being, in a frame-sequential manner. Because high spatial frequency content has been preserved in the video signals as received by the multi-layer video decoder (150), the full resolution output 3D image contains high spatial frequency image details that may exist in an original 3D image (which may be one of the 3D input images of FIG. 8A).
  • inter-layer prediction is not required.
  • Decoding of BL and EL image data by a multi-layer video decoder in these embodiments may be independently performed to a greater extent than that in other embodiments (e.g., as illustrated in FIG. 1A and FIG.
  • a simpler demultiplexer e.g., 160-6 of FIG. 9 than those illustrated in FIG. 2 through FIG. 4 may be used in a multi-layer video decoder such as 150-2 of FIG. 8B.
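One plausible reading of how such a decoder combines base layer image data with enhancement layer residual data is sketched below. This is an assumption-laden illustration rather than the structure of FIG. 9, which is not detailed here: it assumes the BL frame is side-by-side, the EL frame is top-and-bottom and carries residuals, and that both are restored with nearest-neighbour up-sampling before being added. All function names are hypothetical.

```python
# Illustrative sketch only; not the demultiplexer of FIG. 9.

def upsample_h(img):
    """Double the width by repeating each sample."""
    return [[v for v in row for _ in range(2)] for row in img]

def upsample_v(img):
    """Double the height by repeating each row."""
    return [list(row) for row in img for _ in range(2)]

def split_side_by_side(frame):
    """Left half is assumed to hold the LE portion, right half the RE portion."""
    half = len(frame[0]) // 2
    return [row[:half] for row in frame], [row[half:] for row in frame]

def split_top_and_bottom(frame):
    """Top half is assumed to hold the LE portion, bottom half the RE portion."""
    half = len(frame) // 2
    return frame[:half], frame[half:]

def reconstruct(bl_portion, residual_portion):
    """Add the up-sampled residual to the up-sampled base layer portion."""
    base = upsample_h(bl_portion)
    detail = upsample_v(residual_portion)
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(base, detail)]

def demux_residual(bl_frame, el_frame):
    bl_le, bl_re = split_side_by_side(bl_frame)     # from 158-H
    el_le, el_re = split_top_and_bottom(el_frame)   # from 858-V
    return reconstruct(bl_le, el_le), reconstruct(bl_re, el_re)

bl_frame = [[2.0, 6.0, 2.0, 2.0] for _ in range(4)]
el_frame = [[-1.0, 1.0, -1.0, 1.0], [-1.0, 1.0, -1.0, 1.0],
            [0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
le_out, re_out = demux_residual(bl_frame, el_frame)
```

No inter-layer prediction is involved in this sketch: each layer is split and up-sampled independently, and the two results are only combined by the final addition.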
  • residual image data may be subsampled using any of subsampling formats as illustrated in FIG. 5 in progressive video applications, and any of subsampling formats as illustrated in FIG. 7 in interlaced video applications.
  • An RPU signal as described herein may be used to signal, to a downstream video decoder, selected subsampling formats of BL and EL image data as encoded by an upstream multi-layer video encoder.
  • FIG. 10A illustrates an example process flow according to an embodiment of the present invention.
  • one or more computing devices or hardware components may perform this process flow.
  • a multi-layer video encoder e.g., 100 receives an input 3D image, the input 3D image comprising a left eye (LE) input image frame and a right eye (RE) input image frame.
  • the multi-layer video encoder (100) generates, based on the LE input image frame and the RE input image frame, a first multiplexed image frame comprising first high spatial frequency content in a horizontal direction and first reduced resolution content in a vertical direction.
  • the multi-layer video encoder (100) generates, based on the LE input image frame and the RE input image frame, a second multiplexed image frame comprising second high spatial frequency content in the vertical direction and second reduced resolution content in the horizontal direction.
  • the multi-layer video encoder (100) encodes and outputs the first multiplexed image frame and the second multiplexed image frame to represent the input 3D image.
  • the 3D input image is a first 3D input image in a sequence of 3D input images comprising a second different 3D input image having a second LE input image frame and a second RE input image frame.
  • the multi-layer video encoder (100) is further configured to perform: generating, based on the second LE input image frame and the second RE input image frame, a third multiplexed image frame comprising third high spatial frequency content in the horizontal direction and third reduced resolution content in the vertical direction; generating, based on the second LE input image frame and the second RE input image frame, a fourth multiplexed image frame comprising fourth high spatial frequency content in the vertical direction and fourth reduced resolution content in the horizontal direction; and encoding and outputting the third multiplexed image frame and the fourth multiplexed image frame to represent the second input 3D image.
  • the first multiplexed image frame comprises a first LE image data portion and a first RE image data portion.
  • the first LE image data portion and the first RE image data portion are of a same spatial resolution along both horizontal and vertical directions.
  • the second multiplexed image frame comprises a second LE image data portion and a second RE image data portion.
  • the second LE image data portion and the second RE image data portion are of a same spatial resolution along both horizontal and vertical directions.
  • each of the first LE image data portion and the first RE image data portion represents a subsampled version (e.g., one half, less than one half, or another reduced number, of the full resolution) of a whole image frame; the first multiplexed image frame adopts a side-by-side format to carry the first LE image data portion and the first RE image data portion.
  • Each of the second LE image data portion and the second RE image data portion represents a subsampled version (e.g., one half, less than one half, or another reduced number, of the full resolution) of a whole image frame; the second multiplexed image frame adopts a top-and-bottom format to carry the second LE image data portion and the second RE image data portion.
  • the first multiplexed image frame adopts a first multiplexing format that preserves the high spatial frequency content in the horizontal direction.
  • the second multiplexed image frame adopts a second multiplexing format that preserves the high spatial frequency content in the vertical direction.
  • one of the first multiplexed image frame or the second multiplexed image frame is outputted in a base layer bitstream in a plurality of bit streams, while the other of the first multiplexed image frame or the second multiplexed image frame is outputted in an enhancement layer bitstream in the plurality of bit streams.
  • the multi-layer video encoder (100) is further configured to perform: generating, based at least in part on the first multiplexed image frame, prediction reference image data; and encoding an enhancement layer video signal based on differences between the prediction reference image data and the second input image frame.
  • the multi-layer video encoder (100) is further configured to perform: applying one or more first operations comprising at least one of (a) spatial frequency filtering operations or (b) spatial subsampling operations in the second direction to the first input image frame and the second input image frame in generating the first multiplexed image frame, wherein the one or more first operations remove high spatial frequency content in the second direction and preserve high spatial frequency content in the first direction; and applying one or more second operations comprising at least one of (a) spatial frequency filtering operations or (b) spatial subsampling operations in the first direction to the first input image frame and the second input image frame in generating the second multiplexed image frame, wherein the one or more second operations remove high spatial frequency content in the first direction and preserve high spatial frequency content in the second direction.
  • one of the first multiplexed image frame or the second multiplexed image frame comprises residual image data generated by subtracting reference image data generated based on the other of the first multiplexed image frame or the second multiplexed image frame from input image data derived from the LE input image frame and the RE input image frame.
  • the multi-layer video encoder (100) is further configured to convert one or more 3D input images represented, received, transmitted, or stored with one or more input video signals into one or more 3D output images represented, received, transmitted, or stored with one or more output video signals.
  • the input 3D image comprises image data encoded in one of a high dynamic range (HDR) image format, a RGB color space associated with the Academy Color Encoding Specification (ACES) standard of the Academy of Motion Picture Arts and Sciences (AMPAS), a P3 color space standard of the Digital Cinema Initiative, a Reference Input Medium Metric/Reference Output Medium Metric (RIMM/ROMM) standard, an sRGB color space, a RGB color space associated with the BT.709 Recommendation standard of the International Telecommunications Union (ITU), etc.
  • FIG. 10B illustrates another example process flow according to an example embodiment of the present invention.
  • one or more computing devices may perform this process flow.
  • a multi-layer video decoder e.g., 150 receives a 3D image represented by a first multiplexed image frame and second multiplexed image frame, the first multiplexed image frame comprising first high spatial frequency content in a horizontal direction and first reduced resolution content in a vertical direction, and the second multiplexed image frame comprising second high spatial frequency content in the vertical direction and second reduced resolution content in the horizontal direction.
  • the multi-layer video decoder (150) generates, based on the first multiplexed image frame and the second multiplexed image frame, a left eye (LE) image frame and a right eye (RE) image frame, the LE image frame comprising LE high spatial frequency content in both horizontal and vertical directions, and the RE image frame comprising RE high spatial frequency content in both horizontal and vertical directions.
  • the multi-layer video decoder (150) renders the 3D image by rendering the LE image frame and the RE image frame.
  • the 3D image is a first 3D image in a sequence of 3D images comprising a second different 3D image having a third multiplexed image frame and a fourth multiplexed image frame, the third multiplexed image frame comprising third high spatial frequency content in the horizontal direction and third reduced resolution content in the vertical direction, and the fourth multiplexed image frame comprising fourth high spatial frequency content in the vertical direction and fourth reduced resolution content in the horizontal direction.
  • the multi-layer video decoder (150) is further configured to perform:
  • At least one of the first multiplexed image frame or the second multiplexed image frame comprises an LE image data portion and an RE image data portion.
  • the LE image data portion and the RE image data portion are of a same spatial resolution.
  • each of the LE image data portion and the RE image data portion represents a subsampled version (e.g., one half, less than one half, or another reduced number, of the full resolution) of a whole image frame.
  • the LE image data portion and the RE image data portion form a single image frame in one of a side-by-side format or a top-and-bottom format.
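As a rough illustration of the two packing formats named above, the following Python sketch packs subsampled LE and RE data portions into a single side-by-side or top-and-bottom frame. The helper names are hypothetical, frames are plain lists of rows, and the decimation omits the anti-aliasing filter that an encoder would normally apply before subsampling.

```python
def decimate_columns(frame):
    """Keep every second column (illustration only, no anti-alias filter)."""
    return [row[0::2] for row in frame]


def decimate_rows(frame):
    """Keep every second row (illustration only, no anti-alias filter)."""
    return [list(row) for row in frame[0::2]]


def pack_side_by_side(le, re):
    """Place the half-width LE portion on the left and RE on the right."""
    return [l + r for l, r in zip(decimate_columns(le), decimate_columns(re))]


def pack_top_and_bottom(le, re):
    """Place the half-height LE portion on top and RE underneath."""
    return decimate_rows(le) + decimate_rows(re)


# Two small 4x4 test views with recognizable pixel values.
le = [[10 * r + c for c in range(4)] for r in range(4)]
re = [[100 + 10 * r + c for c in range(4)] for r in range(4)]

sbs = pack_side_by_side(le, re)    # keeps vertical detail of each view
tab = pack_top_and_bottom(le, re)  # keeps horizontal detail of each view
```

Both packed frames have the same dimensions as one input view, so the LE and RE portions in each frame share the same spatial resolution.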
  • one of the first multiplexed image frame or the second multiplexed image frame is decoded from a base layer bitstream in a plurality of bit streams, while the other of the first multiplexed image frame or the second multiplexed image frame is decoded from an enhancement layer bitstream in the plurality of bit streams.
  • the multi-layer video decoder (150) is further configured to perform: generating, based at least in part on one of the first multiplexed image frame or the second multiplexed image frame, prediction reference image data; and generating, based on enhancement layer (EL) data decoded from an EL video signal and the prediction reference image data, one of the LE image frame or the RE image frame.
  • EL enhancement layer
  • the multi-layer video decoder (150) is further configured to perform: applying one or more first operations comprising at least one of (a) spatial frequency filtering operations or (b) demultiplexing operations in generating the LE image frame, wherein the one or more first operations combine LE high spatial frequency content, as derived from the first multiplexed image frame and the second multiplexed image frame, of both horizontal and vertical directions into the LE image frame; and applying one or more second operations comprising at least one of (a) spatial frequency filtering operations or (b) demultiplexing operations in generating the RE image frame, wherein the one or more second operations combine RE high spatial frequency content, as derived from the first multiplexed image frame and the second multiplexed image frame, of both horizontal and vertical directions into the RE image frame.
  • the one or more first operations and the one or more second operations comprise at least a high pass filtering operation.
  • the one or more first operations and the one or more second operations comprise a processing sub-path that replaces at least one high pass filtering operation.
  • the processing sub-path comprises at least one subtraction operation and no high pass filtering operation.
  • one of the first multiplexed image frame or the second multiplexed image frame comprises residual image data.
  • the multi-layer video decoder (150) is further configured to perform: decoding and processing enhancement layer image data without generating prediction reference data from the other of the first multiplexed image frame or the second multiplexed image frame.
  • the multi-layer video decoder (150) is further configured to process one or more 3D images represented, received, transmitted, or stored with one or more input video signals.
  • the 3D image comprises image data encoded in one of a high dynamic range (HDR) image format, a RGB color space associated with the Academy Color Encoding Specification (ACES) standard of the Academy of Motion Picture Arts and Sciences (AMPAS), a P3 color space standard of the Digital Cinema Initiative, a Reference Input Medium Metric/Reference Output Medium Metric (RIMM/ROMM) standard, an sRGB color space, a RGB color space associated with the BT.709 Recommendation standard of the International Telecommunication Union (ITU), etc.
  • HDR high dynamic range
  • ACES Academy Color Encoding Specification
  • AMPAS Academy of Motion Picture Arts and Sciences
  • P3 color space standard of the Digital Cinema Initiative
  • an encoder performs any or a part of the foregoing methods as described.
  • the enhancement layer (EL) stream 112-3 comprises image residuals (e.g., 806-L and 806-R) multiplexed as top-and-bottom (TaB) frames (808-V).
  • TaB top-and-bottom
  • improved compression may be achieved by combining the residual signal with a "carrier" image signal to form a new EL signal.
  • FIG. 12A illustrates an example FCFR encoder according to an embodiment that utilizes a carrier image signal in the enhancement layer.
  • the processing of the base layer follows the processing steps discussed earlier, e.g., as depicted in FIG. 8A.
  • Step 1205 represents a simplified representation of steps 104-H and 106-H depicted in FIG. 8A.
  • BL signal 1207 is then compressed with a base layer encoder 1210 (e.g., an H.264/AVC encoder) to generate compressed BL stream 1240-1.
  • BL signal 1207 may be used to regenerate full resolution (FR) versions of the left and right views (Left FR and Right FR) using horizontal up-sampling 804.
  • the original (102-L and 102-R) and the reconstructed views are then subtracted (e.g., in 810) to generate residuals 806-L and 806-R.
  • Multiplexer 1215 multiplexes these residuals in a frame format (e.g., TaB) that is orthogonal to the frame format being used in the base layer (e.g., SbS) to generate residual signal 808-V.
  • residual 808-V is added to a carrier signal 1222 to generate an EL signal 1237.
  • Carrier signal 1222 may be generated using a Carrier RPU 1220 in response to the SbS BL 1207 signal.
  • Carrier RPU may perform both horizontal up-sampling and vertical down-sampling to generate a carrier TaB signal 1222 that matches the format and resolution of the residual signal (e.g., 808-V).
  • vertical down-sampling is performed before the horizontal up-sampling.
  • the carrier signal may be generated in response to a decoded version of the BL stream, e.g., decoded BL signal 1212.
  • similar processing may be applied when the base layer is in the TaB format and the enhancement layer is in the SbS format (see FIG. 14). Processing related to the Carrier RPU 1220 and the Codec RPU 1225 may be performed by the same processor or different processors.
  • the range of the carrier signal may be reduced, e.g., by dividing each pixel by a constant, e.g., 2. Such a division process may be combined with the filtering process in the carrier RPU.
  • the carrier signal may have a fixed pixel value for all of its pixels, e.g., 128.
  • the value of the carrier signal may be defined adaptively based on the properties of the residual signal. Such decisions may be signaled to the decoder using metadata (e.g., as part of an RPU data stream).
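The carrier options listed above (an RPU-derived carrier, a range-reduced carrier, or a fixed pixel value) can be sketched as follows. This is a minimal illustration and not the encoder's actual implementation: the function name `make_el_signal`, the 8-bit clipping, and the integer division are assumptions made for the example.

```python
def make_el_signal(residual, carrier=None, carrier_divisor=1,
                   constant=128, bit_depth=8):
    """Add a residual frame to a carrier frame to form an EL frame.

    carrier=None selects the fixed-value carrier (e.g., 128 everywhere).
    carrier_divisor > 1 reduces the carrier range (e.g., divide by 2).
    Clipping to the sample range is an assumption of this sketch.
    """
    max_val = (1 << bit_depth) - 1
    el = []
    for y, res_row in enumerate(residual):
        out_row = []
        for x, res in enumerate(res_row):
            if carrier is None:
                c = constant
            else:
                c = carrier[y][x] // carrier_divisor
            out_row.append(min(max(c + res, 0), max_val))
        el.append(out_row)
    return el


residual = [[-3, 5], [0, -200]]
carrier = [[200, 100], [50, 30]]

el_constant = make_el_signal(residual)                      # carrier = 128
el_scaled = make_el_signal(residual, carrier, carrier_divisor=2)
```

Whichever option is chosen, the decoder has to form the same carrier, which is why the source notes that such decisions may be signaled as metadata.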
  • EL signal 1237 is compressed using multi-view coding (MVC) as specified in the H.264 (AVC) specification to generate a coded or compressed EL stream 1240-2.
  • MVC multi-view coding
  • AVC H.264
  • a Reference Processing Unit (RPU) 1225 may be employed to convert the decoded SbS BL signal 1212 into a TaB signal (1227) that can be used as a reference by the MVC encoder 1230.
  • This TaB picture (1227) is a newly generated inter-view prediction reference picture and is inserted into the reference picture list for encoding an EL picture.
  • Codec RPU 1225 may apply additional partitioning, processing and filtering to match an inter-view reference picture from the BL signal to the EL input signal, as described in PCT application PCT/US2010/040545, filed June 30, 2010, by A. Tourapis, et al., and incorporated herein by reference in its entirety.
  • the choices of filters and partitions being used in the RPUs can be adapted at multiple levels of resolution, e.g., at the slice, picture, Group of Picture (GOP), scene, or sequence level.
  • the coded EL and BL streams (1240) may be multiplexed with RPU data (e.g., 1240-3) and other auxiliary data (not shown) to be transmitted to a decoder.
  • FIG. 12B illustrates a decoding example process flow according to an example embodiment of the present invention.
  • the incoming stream is
  • BL Decoder 1250 corresponds to the BL encoder 1210.
  • BL decoder 1250 is an AVC decoder.
  • BL decoder 1250 will generate a decoded (e.g., SbS) BL image 1252.
  • the codec RPU 1255 may generate signal 1257 to be used by MVC decoder 1260.
  • Signal 1257 comprises a predicted enhancement layer signal which may be used as an additional sequence of reference frames for the MVC decoder 1260 to generate a decoded EL signal 1262.
  • After decompressing the BL and EL streams, the decoder needs to reconstruct the left and right views at full resolution.
  • One example embodiment of such a method is depicted in FIG. 13A.
  • the vertical or horizontal frequencies that are missing in the base layer can be constructed as a pixel-wise difference of the enhancement layer (1262) and the carrier signal 1302.
  • Carrier signal 1302 can be reconstructed using decoded BL signal 1252 and the decoder RPU (e.g., 1255) using processing that matches the processing of the encoder Carrier RPU (1220).
  • An example of such a process is depicted in FIG. 15.
  • Residue signal 1317 is then up-sampled (e.g., in 1320) and merged by pixel-wise addition with the up-sampled frame-compatible (FC) reconstructed base layer (FC-L and FC-R) to reconstruct the full resolution (FR) left and right views (FR-LE and FR-RE).
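The reconstruction steps just described (subtract the carrier, up-sample the residue, add it to the up-sampled base layer) can be sketched in Python for the case of an SbS base layer and a TaB enhancement layer. All function names are hypothetical, and sample repetition stands in for the actual up-sampling filters, which this excerpt does not specify.

```python
def subtract(a, b):
    """Pixel-wise difference of two equally sized frames."""
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def add(a, b):
    """Pixel-wise sum of two equally sized frames."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def upsample_rows(frame):
    """Double the height by line repetition (placeholder filter)."""
    return [list(row) for row in frame for _ in range(2)]


def upsample_columns(frame):
    """Double the width by pixel repetition (placeholder filter)."""
    return [[p for p in row for _ in range(2)] for row in frame]


def reconstruct_views(bl_sbs, el_tab, carrier_tab):
    """Rebuild full-resolution LE/RE frames from BL, EL, and carrier."""
    residue = subtract(el_tab, carrier_tab)        # carrier-free residue
    half_h = len(residue) // 2
    half_w = len(bl_sbs[0]) // 2
    res_l, res_r = residue[:half_h], residue[half_h:]   # TaB halves
    fc_l = upsample_columns([row[:half_w] for row in bl_sbs])
    fc_r = upsample_columns([row[half_w:] for row in bl_sbs])
    return add(fc_l, upsample_rows(res_l)), add(fc_r, upsample_rows(res_r))


bl_sbs = [[10, 20, 30, 40] for _ in range(4)]
carrier_tab = [[128] * 4 for _ in range(4)]
el_tab = [[129] * 4, [128] * 4, [127] * 4, [128] * 4]

fr_le, fr_re = reconstruct_views(bl_sbs, el_tab, carrier_tab)
```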
  • FC frame-compatible
  • FC-L and FC-R up-sampled frame-compatible reconstructed base layer
  • FR-LE and FR-RE full resolution left and right views
  • legacy receivers may still decode a pair of
  • FIG. 13B depicts another example embodiment for reconstructing the full resolution signal, the method to be referred to as "the high pass method.”
  • the decoded TaB EL signal 1262 is first processed by a horizontal high-pass filter 1330.
  • Such a filter removes the low-frequency components of the carrier signal in the EL signal, thus generating a carrier-free residual signal (e.g., 808-V).
  • the output of the high-pass filter is up-sampled vertically (1320) to generate residual signals 1322-L and 1322-R, which are added to the horizontally up-sampled, reconstructed frame-compatible layer (FC-L and FC-R), to generate full-resolution estimates of the original views (e.g., FR-RE and FR-LE).
  • the top field and bottom field are processed independently during the vertical filtering (e.g., down-sampling or up-sampling). Then the fields are merged (e.g., by line interleaving) to create a frame.
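Field-based vertical processing of an interlaced frame can be sketched as below. The helper names are hypothetical, and plain row decimation stands in for whatever vertical filter is actually applied to each field.

```python
def split_fields(frame):
    """Top field = even lines, bottom field = odd lines."""
    return frame[0::2], frame[1::2]


def interleave_fields(top, bottom):
    """Merge two fields back into one frame by line interleaving."""
    frame = []
    for top_line, bottom_line in zip(top, bottom):
        frame.append(top_line)
        frame.append(bottom_line)
    return frame


def process_interlaced(frame, vertical_op):
    """Apply a vertical operation to each field independently."""
    top, bottom = split_fields(frame)
    return interleave_fields(vertical_op(top), vertical_op(bottom))


def decimate_rows(field):
    """Placeholder vertical down-sampling: keep every second line."""
    return field[0::2]


# Eight one-pixel lines, labeled by their line number.
frame = [[n] for n in range(8)]
result = process_interlaced(frame, decimate_rows)
```

Lines from the two fields never mix inside `vertical_op`, which is the point of field processing: each field was captured at a different time instant.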
  • the interlaced signal can be coded either in frame coding mode or field coding mode.
  • the codec RPU should also be instructed whether to process an inter-view reference picture from BL as a frame or fields. In AVC coding, there is no indication of the scan type of a coded sequence in the mandated bitstream, since it is out of the scope of decoding.
  • SEI Supplemental Enhancement Information
  • a SEI message is not required for decoding.
  • a high level syntax is proposed to indicate if the RPU should apply frame or field processing.
  • the RPU may always process the picture as separate fields.
  • the RPU may follow how the BL signal is coded. Hence, if the BL signal is coded as fields, the RPU applies field processing, otherwise it applies frame processing.
  • Embodiments of this invention comprise a variety of filters, which can be categorized as: multiplexing (or muxing) filters, RPU filters, and de-multiplexing (or de-muxing) filters.
  • When designing muxing filters, the goal is to maintain as much information as possible from the original filter, but without causing aliasing.
  • For down-sampling, a muxing filter may be designed to have very flat passband response and strong attenuation at the midpoint of the spectrum, where the signal is folded during down-sampling, to avoid aliasing. In an embodiment, (in Matlab notation) an example of such a filter has coefficients:
  • the down-sampling and up-sampling filters should have a very low cutoff frequency, as the high frequencies in the carrier image are not used for reconstruction, and a low-pass-filtered carrier helps to increase coding efficiency for the EL signal.
  • the RPU down-sampling and up-sampling filters should also be of as low order as possible since these exact filters are used in decoders for real time decoding. Examples of such filters are depicted in Table 7.
  • the high-pass filter (1330) should be complementary to the combined frequency responses of the muxing down-sampling filter and de-muxing up-sampling filter.
  • the order of such a filter will be high, which may not be suitable for certain real-time decoder applications.
  • High-pass filters with similar pass band characteristics, but lower stop band attenuation, can also be derived with a much lower filter order, making them better suited for real-time decoder applications. Examples of such filters are depicted in Table 8.
  • Some implementations may have a very low bit rate requirement for the EL stream.
  • one may remove all chroma information from the EL stream.
  • one may set chroma values in the EL signal to be a constant value, for example, 128.
  • the color components of an inter-view reference picture processed by the Codec RPU need to be set in the same way.
  • one may select and transmit only those regions of the input signal with the most high frequencies in the EL signal and gray out the remaining areas, for example, by setting them to a constant value (e.g., 128). The location and size of such regions may be signaled from an encoder to the decoder using metadata, e.g., through the
  • the residue signal 808-V is added to the carrier signal 1222 directly to generate EL 1237.
  • a linear or non-linear quantization method may be applied to the residue signal before adding it to the carrier.
  • additional enhancement layers may be employed to restore additional frequencies, such as diagonal.
  • the input signal (102-L and 102-R) may be down-sampled across a diagonal direction. That information (or another residual signal based on that diagonal information) may be transmitted to a decoder as a second enhancement layer (EL2).
  • EL2 second enhancement layer
  • a decoder could merge BL, EL, and EL2 signals to generate an FCFR signal.
  • luma will be coded in full resolution, but chroma will be coded in half resolution.
  • a coding standard, such as H.264, typically defines only the syntax of the coded bitstream and the decoding process. This section presents examples of a proposed syntax for a new FCFR profile in H.264 or other compression standard that supports the methods of this invention.
  • the first part is called the RPU header, rpu_header(), and includes the static information that most likely is not going to change during the transmission of the signal.
  • the second part is called the RPU data payload, rpu_data_payload(), and includes the dynamic information which might be updated more frequently.
  • the RPU data payload signals to the decoder the filters that will be used to update the inter-view reference pictures prior to their use for prediction.
  • the syntax can be sent at the slice level, the picture level, the GOP level, the scene level, or at the sequence level. It can be included at the NAL unit header, the Sequence Parameter Set (SPS) and its extension, the SubSPS, the Picture Parameter Set (PPS), the slice header, the SEI message, or a new NAL unit, and the like.
  • the RPU syntax is only updated at the sequence level.
  • a new NAL unit for Coded slice extension for MFC, slice_layer_extension_rbsp( ) is also defined.
  • a new profile denoted in Table 2 as profile 134, is assigned for an embodiment of a FCFR 3D system using orthogonal multiplexing (OM).
  • RPU Data Header Semantics rpu_type specifies the prediction type purpose for the RPU signal. If not present, then its value is assumed to be 0.
  • rpu_format specifies the prediction process format, given the rpu_type, that will be used when processing the video data for prediction and/or final reconstruction. If not present, then its value is assumed to be 0. Table 5 depicts examples of rpu_type and rpu_format values.
  • default_grid_position signals whether view0 and view1 grid position information should be explicitly signaled. If default_grid_position is set to 1, or not present, then default values are obtained as follows:
  • view0_grid_position_x 4 ;
  • view0_grid_position_y 8 ;
  • view0_grid_position_y is the same as the frame0_grid_position_y as defined in the Frame packing arrangement SEI message semantics section in the H.264 specification.
  • view1_grid_position_x is the same as the frame1_grid_position_x as defined in the Frame packing arrangement SEI message semantics section in the H.264 specification.
  • view1_grid_position_y is the same as the frame1_grid_position_y as defined in the Frame packing arrangement SEI message semantics section in the H.264 specification.
  • interlace_processing_flag signals whether reference processing will be applied on a frame or a field basis. If it is set to zero, processing will take place in the frame domain. If this flag is set to 1 then processing shall be performed separately for each field.
  • disable_part_symmetry_flag (when present) signals whether filter selection for spatially collocated partitions belonging to different views is constrained or unconstrained.
  • When this flag is not set, both collocated partitions in either view are processed with the same RPU filter to derive the enhancement layer prediction. Hence half as many filters are signaled.
  • When this flag is set, a filter is signaled for each partition in the processed picture. If not present, then all partitions use the same filtering method (NULL).
  • num_x_partitions_minus1 signals the number of partitions that are used to subdivide the processed picture in the horizontal dimension during filtering. It can take any non-negative integer value. If not present, then the value of num_x_partitions_minus1 is set equal to 0. The value of num_x_partitions_minus1 is between 0 and Clip3(0, 15, (PicWidthInMbs >> 1) - 1), where PicWidthInMbs is specified in the H.264 specification.
  • num_y_partitions_minus1 signals the number of partitions that are used to subdivide the processed picture in the vertical dimension during filtering. It can take any non-negative integer value. If not present, then the value of num_y_partitions_minus1 is set equal to 0. The value of num_y_partitions_minus1 is between 0 and Clip3(0, 7, (PicHeightInMapUnits >> 1) - 1), where PicHeightInMapUnits is specified in the H.264 specification.
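The upper bounds on the two partition-count syntax elements can be evaluated with a few lines of Python. `clip3` follows the usual H.264 definition of Clip3(x, y, z), which limits z to the range [x, y]; the other function names and the example picture sizes are assumptions made for illustration.

```python
def clip3(lower, upper, value):
    """H.264-style Clip3: limit value to the range [lower, upper]."""
    if value < lower:
        return lower
    if value > upper:
        return upper
    return value


def max_num_x_partitions_minus1(pic_width_in_mbs):
    """Largest allowed num_x_partitions_minus1 for a given picture width."""
    return clip3(0, 15, (pic_width_in_mbs >> 1) - 1)


def max_num_y_partitions_minus1(pic_height_in_map_units):
    """Largest allowed num_y_partitions_minus1 for a given picture height."""
    return clip3(0, 7, (pic_height_in_map_units >> 1) - 1)


# Example: a picture 120 macroblocks wide and 68 map units high.
x_bound = max_num_x_partitions_minus1(120)
y_bound = max_num_y_partitions_minus1(68)

# Very small pictures: the lower clip keeps the bound at zero.
tiny_bound = max_num_x_partitions_minus1(1)
```

For the example picture, the bounds saturate at 15 and 7, that is, at most 16 horizontal and 8 vertical partitions.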
  • filter_idx_down[ y ][ x ][ cmp ] contains an index that corresponds to the down-sampling processing filter that is to be used for the partition with vertical coordinate y and horizontal coordinate x, corresponding to color component cmp. This index may take any non-negative value, each corresponding to a unique processing filter or scheme.
  • filter_idx_up[ y ][ x ][ cmp ] contains an index that corresponds to the up-sampling processing filter that is to be used for the partition with vertical coordinate y and horizontal coordinate x, corresponding to color component cmp. This index may take any non-negative value, each corresponding to a unique processing filter or scheme.
  • FIG. 14 depicts an example process flow for generating the EL signal (1237).
  • the right-half of FIG. 14 depicts the case where the base layer is coded in side-by-side (SbS) format, hence the EL layer is coded in top-and-bottom (TaB) format.
  • the process flow in this half matches the process flow as depicted in FIG. 12A.
  • the left-half of FIG. 14 depicts the case when the base layer is coded in top-and-bottom (TaB) format (1405) and the enhancement layer is coded in SbS format (1415).
  • TaB multiplexing step 1405 may follow the processing depicted in steps 104-V and 106-V in FIG.1, while SbS multiplexing 1415 may follow the processing depicted in steps 104-H and 106-H in FIG. 1.
  • FIG. 15 depicts an embodiment of an example process in the RPU during the decoding of an FCFR stream to generate carrier signal 1302.
  • a similar process may also be applied to generate the carrier signal 1222 in the encoder, e.g., using Carrier RPU 1220.
  • the process operates on all partitions and all color components of an input sequence.
  • the data flow may be the same regardless whether the BL signal is multiplexed in SbS format or in TaB format; however, the filtering orientations depend on the format of the base layer.
  • the process also assumes that down-sampling precedes up-sampling; however, in another embodiment up-sampling may precede down-sampling.
  • Using the RPU data stream (e.g., 1240-3), and a filter identification look-up table (e.g., Table 6), in step 1510 the RPU identifies the down-sampling filter to be used to down-sample the decoded BL signal. If it is an F0 or F1 filter (1515-1), then it proceeds to perform down-sampling (1520-1). If it is an F2 filter, then it simply creates a carrier signal with all pixel values set to a constant (e.g., 128) (1520-2). If the BL layer is coded in the SbS format, then down-sampling (1520) is performed in the vertical direction.
  • If the BL layer is coded in the TaB format, then down-sampling (1520) is performed in the horizontal direction.
  • the two original halves (or views) are de-multiplexed and then multiplexed again in an orthogonal orientation, e.g., from SbS to TaB or from TaB to SbS, to form an intermediate result that matches the multiplexing format of the residual signal.
  • This intermediate signal is then up-sampled so that the final carrier signal matches the resolution of the residual signal.
  • If the up-sampling filter is F0 or F1 (1525-1), then the intermediate result is up-sampled to generate the final carrier signal 1237.
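The carrier generation flow described for FIG. 15 can be sketched for an SbS base layer as follows. The function name is hypothetical, the filter labels F0, F1, and F2 are taken from the text, and plain decimation and pixel repetition stand in for the actual F0/F1 filter taps, which this excerpt does not list.

```python
def make_carrier_from_sbs(bl_sbs, down_filter="F0", constant=128):
    """Derive a TaB carrier frame from a decoded SbS base layer frame."""
    height, width = len(bl_sbs), len(bl_sbs[0])

    # F2: no filtering, every carrier pixel is set to a constant.
    if down_filter == "F2":
        return [[constant] * width for _ in range(height)]

    # SbS base layer: down-sample in the vertical direction.
    down = bl_sbs[0::2]

    # De-multiplex the two views and re-multiplex them as top-and-bottom.
    half_w = width // 2
    left = [row[:half_w] for row in down]
    right = [row[half_w:] for row in down]
    tab = left + right

    # Up-sample horizontally so the carrier matches the residual's size.
    return [[p for p in row for _ in range(2)] for row in tab]


bl_sbs = [[1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 10, 11, 12],
          [13, 14, 15, 16]]

carrier = make_carrier_from_sbs(bl_sbs)
flat_carrier = make_carrier_from_sbs(bl_sbs, down_filter="F2")
```

For a TaB base layer the same data flow would apply with the two filtering orientations swapped.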
  • Table 1 NAL Unit type, syntax element categories, and NAL unit type classes
  • the techniques described herein are implemented by one or more special-purpose computing devices.
  • the special-purpose computing devices may be hard-wired to perform the techniques, or may include digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the techniques, or may include one or more general purpose hardware processors programmed to perform the techniques pursuant to program instructions in firmware, memory, other storage, or a combination.
  • ASICs application-specific integrated circuits
  • FPGAs field programmable gate arrays
  • Such special-purpose computing devices may also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the techniques.
  • the special-purpose computing devices may be desktop computer systems, portable computer systems, handheld devices, networking devices or any other device that incorporates hard-wired and/or program logic to implement the techniques.
  • FIG. 11 is a block diagram that illustrates a computer system 1100 upon which an example embodiment of the invention may be implemented.
  • Computer system 1100 includes a bus 1102 or other communication mechanism for communicating information, and a hardware processor 1104 coupled with bus 1102 for processing information.
  • Hardware processor 1104 may be, for example, a general purpose microprocessor.
  • Computer system 1100 also includes a main memory 1106, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 1102 for storing information and instructions to be executed by processor 1104.
  • Main memory 1106 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 1104.
  • Such instructions when stored in non-transitory storage media accessible to processor 1104, render computer system 1100 into a special-purpose machine that is customized to perform the operations specified in the instructions.
  • Computer system 1100 further includes a read only memory (ROM) 1108 or other static storage device coupled to bus 1102 for storing static information and instructions for processor 1104.
  • ROM read only memory
  • a storage device 1110 such as a magnetic disk or optical disk, is provided and coupled to bus 1102 for storing information and instructions.
  • Computer system 1100 may be coupled via bus 1102 to a display 1112, such as a liquid crystal display, for displaying information to a computer user.
  • An input device 1114 is coupled to bus 1102 for communicating information and command selections to processor 1104.
  • Another type of user input device is cursor control 1116, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 1104 and for controlling cursor movement on display 1112.
  • This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.
  • Computer system 1100 may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system 1100 to be a special-purpose machine. According to one embodiment, the techniques herein are performed by computer system 1100 in response to processor 1104 executing one or more sequences of one or more instructions contained in main memory 1106. Such instructions may be read into main memory 1106 from another storage medium, such as storage device 1110. Execution of the sequences of instructions contained in main memory 1106 causes processor 1104 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.
  • Non-volatile media includes, for example, optical or magnetic disks, such as storage device 1110.
  • Volatile media includes dynamic memory, such as main memory 1106.
  • Common forms of storage media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, and EPROM, a FLASH-EPROM, NVRAM, any other memory chip or cartridge.
  • Storage media is distinct from but may be used in conjunction with transmission media.
  • Transmission media participates in transferring information between storage media.
  • transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 1102.
  • transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.
  • Various forms of media may be involved in carrying one or more sequences of one or more instructions to processor 1104 for execution.
  • the instructions may initially be carried on a magnetic disk or solid state drive of a remote computer.
  • the remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem.
  • a modem local to computer system 1100 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal.
  • An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 1102.
  • Bus 1102 carries the data to main memory 1106, from which processor 1104 retrieves and executes the instructions.
  • the instructions received by main memory 1106 may optionally be stored on storage device 1110 either before or after execution by processor 1104.
  • Computer system 1100 also includes a communication interface 1118 coupled to bus 1102.
  • Communication interface 1118 provides a two-way data communication coupling to a network link 1120 that is connected to a local network 1122.
  • communication interface 1118 may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line.
  • ISDN integrated services digital network
  • communication interface 1118 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN.
  • LAN local area network
  • Wireless links may also be implemented.
  • communication interface 1118 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
  • Network link 1120 typically provides data communication through one or more networks to other data devices.
  • network link 1120 may provide a connection through local network 1122 to a host computer 1124 or to data equipment operated by an Internet Service Provider (ISP) 1126.
  • ISP 1126 provides data communication services through the world wide packet data communication network now commonly referred to as the "Internet” 1128.
  • Internet 1128 uses electrical, electromagnetic or optical signals that carry digital data streams.
  • the signals through the various networks and the signals on network link 1120 and through communication interface 1118, which carry the digital data to and from computer system 1100, are example forms of transmission media.
  • Computer system 1100 can send messages and receive data, including program code, through the network(s), network link 1120 and communication interface 1118.
  • a server 1130 might transmit a requested code for an application program through Internet 1128, ISP 1126, local network 1122 and communication interface 1118.
  • the received code may be executed by processor 1104 as it is received, and/or stored in storage device 1110, or other non-volatile storage for later execution.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A high resolution 3D image may be encoded into a first multiplexed image frame and a second multiplexed image frame in a base layer (BL) video signal and an enhancement layer (EL) video signal. The first multiplexed image frame may comprise horizontal high resolution image data for both eyes, while the second multiplexed image frame may comprise vertical high resolution image data for both eyes. Encoded symmetric-resolution image data for the 3D image may be distributed to a wide variety of devices for 3D image processing and rendering. A recipient device may reconstruct reduced resolution 3D image from one of the first multiplexed image frame or the second multiplexed image frame. A recipient device may also reconstruct high resolution 3D image by combining high resolution image data from both of the first multiplexed image frame and the second multiplexed image frame.

Description

FRAME-COMPATIBLE FULL-RESOLUTION STEREOSCOPIC 3D VIDEO DELIVERY WITH SYMMETRIC PICTURE RESOLUTION AND QUALITY
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to United States Provisional patent application Ser. No. 61/541,005 filed on Sept. 29, 2011 and United States Provisional patent application Ser. No. 61/583,081 filed on Jan. 04, 2012, which are incorporated by reference in their entirety.
TECHNOLOGY
The present invention relates generally to image data. More particularly, an example embodiment of the present invention relates to image data for stereoscopic 3D images.
BACKGROUND
Frame-compatible half resolution (FCHR) solutions for 3D content delivery suffer from degraded spatial resolution because the half resolution 3D content only contains half resolution image frames subsampled from full resolution 3D image frames.
Under some techniques, frame-compatible full resolution (FCFR) solutions may be used to produce full resolution 3D image frames by sending half resolution 3D image frames through a base layer and sending complementary half resolution 3D image frames through an enhancement layer. The half resolution 3D image frames and the complementary half resolution 3D image frames may be combined by a recipient device into 3D image frames at full resolution.
However, these techniques implement low-pass filtering to reduce/remove aliasing in the half resolution image frames. As high frequency content in the image frames is removed by low-pass filtering, it is not possible for a downstream device to recover all the fine details and textures that were in the high spatial frequency content. While full resolution 3D image frames might still be constructed, the pixels in the 3D image frames would have been irreversibly altered by low-pass filtering and could not be used to reproduce the original resolution and sharpness in original 3D content that gives rise to the 3D image frames.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, issues identified with respect to one or more approaches should not be assumed to have been recognized in any prior art on the basis of this section, unless otherwise indicated.
BRIEF DESCRIPTION OF DRAWINGS
The present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:
FIG. 1A illustrates a multi-layer video encoder that maintains high spatial frequency content present in input video sequence, in accordance with an embodiment of the invention;
FIG. 1B illustrates a multi-layer video decoder that receives input video signals with high spatial frequency content, in accordance with an embodiment;
FIG. 1C illustrates a base-layer video decoder, in accordance with an embodiment;
FIG. 2, FIG. 3, and FIG. 4 illustrate different configurations of demultiplexers, according to some example embodiments;
FIG. 5 illustrates multiplexing formats, in some example embodiments;
FIG. 6A and FIG. 6B illustrate interlaced content of a perspective forming image portions in a top-and-bottom and a side-by-side format, in some example embodiments;
FIG. 7 illustrates multiplexing formats for carrying interlaced content, in some example embodiments;
FIG. 8A illustrates a multi-layer video encoder, in accordance with an embodiment of the invention;
FIG. 8B shows a multi-layer video decoder, in accordance with an embodiment;
FIG. 9 illustrates a demultiplexer, according to some example embodiments;
FIG. 10A and FIG. 10B illustrate process flows, in some example embodiments;
FIG. 11 illustrates an example hardware platform on which a computer or a computing device as described herein may be implemented, according to an example embodiment of the present invention;
FIG. 12A and FIG. 12B illustrate an FCFR multi-layer video encoder and an FCFR multi-layer video decoder, in accordance with an embodiment of the invention;
FIG. 13A and FIG. 13B illustrate example embodiments for reconstructing the full resolution signals;
FIG. 14 illustrates an encoding process flow for generating an enhancement layer according to an embodiment of the invention; and
FIG. 15 illustrates a filtering process flow in the decoder RPU to generate a carrier image signal according to an embodiment of the invention.
DESCRIPTION OF EXAMPLE EMBODIMENTS
Example embodiments, which relate to 3D video coding, are described herein. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are not described in exhaustive detail, in order to avoid unnecessarily occluding, obscuring, or obfuscating the present invention.
Example embodiments are described herein according to the following outline:
1. GENERAL OVERVIEW
2. MULTI-LAYERED VIDEO DELIVERY
3. DEMULTIPLEXERS
4. SAMPLING FORMATS
5. INTERLACED VIDEO APPLICATIONS
6. RESIDUAL IMAGE CODING
7. EXAMPLE PROCESS FLOWS
8. RESIDUAL IMAGE CODING WITH CARRIER SIGNAL
9. IMPLEMENTATION MECHANISMS - HARDWARE OVERVIEW
10. EQUIVALENTS, EXTENSIONS, ALTERNATIVES AND MISCELLANEOUS
1. GENERAL OVERVIEW
This overview presents a basic description of some aspects of an example embodiment of the present invention. It should be noted that this overview is not an extensive or exhaustive summary of aspects of the example embodiment. Moreover, it should be noted that this overview is not intended to be understood as identifying any particularly significant aspects or elements of the example embodiment, nor as delineating any scope of the example embodiment in particular, nor the invention in general. This overview merely presents some concepts that relate to the example embodiment in a condensed and simplified format, and should be understood as merely a conceptual prelude to a more detailed description of example embodiments that follows below.
Video data is currently received mostly through network connections, for example, from internet-based content providers. However, the bitrate allocated to a display application such as a 3D display application on a computing device is limited.
To support a widest possible variety of 3D image rendering devices, 3D image content may be delivered as frame compatible 3D image frames (or pictures) with reduced resolutions. As discussed, 3D image frames may be subsampled from full resolution 3D image frames to reduced resolution 3D image frames; high spatial frequency content in the full resolution 3D image frames may be removed by low-pass filters to prevent aliasing in the subsampled image frames. Embodiments include encoding and providing symmetric high resolution 3D image data to downstream devices. In some example embodiments, a first multiplexed 3D image frame with reduced resolution in a horizontal direction and full resolution in a vertical direction is provided in one of a base layer and an enhancement layer to a recipient device, while a second multiplexed 3D image frame with reduced resolution in a vertical direction and full resolution in a horizontal direction is provided in the other of the base layer and the enhancement layer to the recipient device. Left eye (LE) and right eye (RE) image data in the enhancement layer may be combined by the recipient device with LE and RE image data in the base layer to reconstruct symmetric full resolution LE and RE image frames. One or both of the first multiplexed 3D image frame and the second multiplexed 3D image frame may be frame compatible to support reduced resolution (a less than full resolution, e.g., half resolution) 3D video applications.
Codecs implementing techniques as described herein may be configured to include inter-layer prediction capabilities to fully exploit statistical redundancy between a multiplexed 3D image frame in the base layer and input image frames. A multiplexed 3D image frame in the enhancement layer may (possibly only) carry residual or differential image data, instead of carrying a large amount of LE and RE image data without exploiting the statistical redundancy in image data of different layers. The residual or differential image data as provided in the enhancement layers enables downstream devices to construct symmetric full resolution LE and RE image frames by adding the residual or differential image data on top of the frame-compatible multiplexed 3D image frame in the base layer.
In some example embodiments, the codecs may be configured to include inter-view prediction capability used as described in ITU-T Recommendation H.264 and ISO/IEC 14496-10. In some example embodiments, an RPU (reference processing unit) may be used to improve efficiency in inter-layer prediction for enhancement layer compression.
In some embodiments, the multiplexed 3D image frames in both the base layer and the enhancement layer that comprise complementary high spatial frequency content may be transmitted to and/or rendered for viewing on high-end 3D displays. In addition, one (e.g., a frame compatible multiplexed 3D image frame) of the multiplexed 3D image frames may be transmitted to and/or rendered for viewing on relatively lower-end 3D displays.
[0010] In some example embodiments, data needed for other applications may also be included in one or more enhancement layers. In some example embodiments, a wide variety of features, as provided by FCFR technologies commercially available from Dolby Laboratories in San Francisco, California, may be supported by the base and enhancement layers as described herein.
[0011] Techniques as described herein provide solutions to achieving symmetric high resolution and high picture quality while maintaining backwards compatibility to a variety of relatively low-end video players. A display system implementing techniques as described herein is able to achieve better picture quality of reconstructed 3D pictures than other display systems implementing other FCFR schemes. Particularly, a display system as described herein is able to retain more high frequencies and reproduce sharper pictures with more details than the other display systems.
[0012] Techniques as described herein may be used to reduce bandwidth or bitrate usage and preserve frame-compatible 3D image data with reduced resolution, which supports various televisions, displays and other image rendering devices.
[0013] In addition, an option for interlaced 3D content may also be implemented under techniques as described herein. This option, for example, may be used to carry 3D broadcast applications such as sports programs.
[0014] In some embodiments, reuse, adaptation and improvement of some available system components allows relatively low cost implementation as compared with other approaches without using techniques as described herein.
[0015] In some example embodiments, mechanisms as described herein form a part of a media processing system, including but not limited to: a handheld device, game machine, television, laptop computer, netbook computer, tablet computer, cellular radiotelephone, electronic book reader, point of sale terminal, desktop computer, computer workstation, computer kiosk, or various other kinds of terminals and media processing units.
[0016] Various modifications to the preferred embodiments and the generic principles and features described herein will be readily apparent to those skilled in the art. Thus, the disclosure is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features described herein.
2. MULTI-LAYERED VIDEO DELIVERY
[0017] FIG. 1A illustrates a multi-layer video encoder (100) that maintains high spatial frequency content present in an input video sequence, in accordance with an embodiment of the invention. FIG. 1B illustrates a multi-layer video decoder (150) that corresponds to the multi-layer video encoder (100) shown in FIG. 1A, in accordance with the example embodiment.
[0018] In an example embodiment, the multiple-layer video encoder (100) is configured to encode an input 3D video sequence. The input 3D video sequence consists of a sequence of 3D input images. A 3D input image in the sequence of 3D images comprises full resolution 3D image data that contains high spatial frequency content. As used herein, the term "full resolution" may refer to a spatial resolution maximally supported by the total number of independently settable pixels in an image frame. The full resolution 3D image data in a 3D input image may be initially decoded by the multiple-layer video encoder (100) into an input LE image frame (102-L) and an input RE image frame (102-R) both of which contain high spatial frequency content.
[0019] In an example embodiment, one or more filtering and subsampling mechanisms (e.g., 104-H and 104-V) in the multi-layer video encoder (100) generate, based on the input LE and RE image frames (102-L and 102-R), LE and RE image data that is filtered in one of the vertical and horizontal directions but unfiltered in the other of the vertical and horizontal directions.
[0020] For example, a filtering and subsampling mechanism (104-H) may be configured to filter high spatial frequency content in the horizontal direction from the input LE and RE image frames (102-L and 102-R) and horizontally subsample the LE and RE image frames (102-L and 102-R) as filtered in the horizontal direction into corresponding LE and RE portions. A multiplexer (106-H) may be configured to combine the LE and RE portions in a 3D multiplexed image frame (108-H) in a side-by-side format.
[0021] Similarly, a filtering and subsampling mechanism (104-V) may be configured to filter high spatial frequency content in the vertical direction from the input LE and RE image frames (102-L and 102-R) and vertically subsample the LE and RE image frames (102-L and 102-R) as filtered in the vertical direction into corresponding LE and RE portions. A multiplexer (106-V) may be configured to combine the LE and RE portions in a 3D multiplexed image frame (108-V) in a top-and-bottom format.
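The two filter-subsample-multiplex paths described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the [1, 2, 1]/4 low-pass kernel, the edge replication, the placement of the LE portion on the left or on top, and all function names are hypothetical and are not specified in this description.

```python
from typing import List

Frame = List[List[float]]  # a frame as rows of pixel values


def lowpass_rows(frame: Frame) -> Frame:
    """Smooth each row with an assumed [1, 2, 1]/4 kernel, replicating edge pixels."""
    out = []
    for row in frame:
        n = len(row)
        out.append([
            (row[max(x - 1, 0)] + 2 * row[x] + row[min(x + 1, n - 1)]) / 4
            for x in range(n)
        ])
    return out


def transpose(frame: Frame) -> Frame:
    """Swap rows and columns so that the row filter can act on columns."""
    return [list(col) for col in zip(*frame)]


def filter_subsample_h(frame: Frame) -> Frame:
    """Stand-in for 104-H: low-pass horizontally, then keep every other column."""
    return [row[::2] for row in lowpass_rows(frame)]


def filter_subsample_v(frame: Frame) -> Frame:
    """Stand-in for 104-V: low-pass vertically, then keep every other row."""
    return transpose(lowpass_rows(transpose(frame)))[::2]


def mux_side_by_side(le: Frame, re: Frame) -> Frame:
    """Stand-in for 106-H: LE portion on the left, RE portion on the right (assumed order)."""
    return [l_row + r_row for l_row, r_row in zip(le, re)]


def mux_top_and_bottom(le: Frame, re: Frame) -> Frame:
    """Stand-in for 106-V: LE portion on top, RE portion below (assumed order)."""
    return le + re


le_in = [[float(4 * y + x) for x in range(4)] for y in range(4)]  # plays the role of 102-L
re_in = [[1.0] * 4 for _ in range(4)]                              # plays the role of 102-R
frame_108_h = mux_side_by_side(filter_subsample_h(le_in), filter_subsample_h(re_in))
frame_108_v = mux_top_and_bottom(filter_subsample_v(le_in), filter_subsample_v(re_in))
```

In this sketch both multiplexed frames have the same pixel dimensions as one input frame: each half of frame_108_h keeps all rows but half the columns, while each half of frame_108_v keeps all columns but half the rows.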
[0022] The filtering of the LE and RE image frames (102-L and 102-R) may remove all, or a substantial part, of the high spatial frequency content from the input LE and RE image frames (102-L and 102-R) in one of the horizontal and vertical directions. Filtering may be performed with one or more low-pass filters (LPFs) in the filtering and subsampling mechanisms (e.g., 104-H and 104-V). In an example embodiment, filtering as described herein removes or substantially dampens any spatial frequency content in the input images above a threshold frequency that corresponds to a fraction (e.g., one half or another fraction) of a spatial resolution supported by a multi-layer video decoder (e.g., 150) in one of the horizontal and vertical directions.
As used herein, the term "high spatial frequency content" in a spatial direction (horizontal or vertical) may refer to high spatial frequency image details that exist in an input 3D video sequence along the spatial direction. If the high spatial frequency content in the spatial direction has been removed, downstream devices are not able to reproduce high resolution image details in that spatial direction from the filtered image data. As used herein, a subsampler in a filtering and subsampling mechanism (104-H or 104-V) may be configured to preserve the high spatial frequency content in the direction perpendicular to the direction in which the high spatial frequency content has been filtered/removed. For example, a subsampler in the filtering and subsampling mechanism (104-H) may be configured to subsample (e.g., keep every other column) along the same horizontal direction in which the high spatial frequency content has been removed and to avoid subsampling along the vertical direction. Similarly, a subsampler in the filtering and subsampling mechanism (104-V) may be configured to subsample (e.g., keep every other row) along the same vertical direction in which the high spatial frequency content has been removed and to avoid subsampling along the horizontal direction.
A multiplexed 3D image frame (one of 108-H and 108-V) comprises both a (e.g., down-sampled) image data portion for the left eye and a (e.g., down-sampled) image data portion for the right eye. The multiplexed 3D image frame may be decoded by a downstream device into a LE image frame and a RE image frame of reduced resolutions (e.g., half resolutions) in one of the horizontal and vertical directions. Such decoded LE and RE image frames of the reduced resolution may be up-sampled to comprise the same number of pixels as a full resolution image frame with a fuzzier look than a full resolution image not obtained by an up-sampling operation.
In an example embodiment, a multiplexed 3D image frame (108-H) comprises LE and RE image data portions, each of which comprises a reduced number (e.g., one half, less than one half, or another fewer than the total number) of the total number of pixels in a full resolution image frame, where the LE and RE image data portions comprise high spatial frequency content in the vertical direction, while the other multiplexed 3D image frame (108-V) comprises complementary LE and RE image data portions, each of which comprises a reduced number (e.g., one half, less than one half, or another fewer than the total number) of the total number of pixels in a full resolution image frame, where the complementary LE and RE image data portions comprise high spatial frequency content in the horizontal direction. LE and RE image data portions may be multiplexed within a multiplexed 3D image frame (e.g., one of 108-H and 108-V) in a side-by-side format, an over-under format, a quincunx format, a checkerboard format, an interleaved format, a combination of the foregoing formats, or another multiplex format.
One or more enhancement layers may be used to carry a first multiplexed 3D image frame (e.g., one of 108-H and 108-V) that may be combined with a second multiplexed 3D image frame (e.g., the other of 108-H and 108-V) in a base layer. A multi-layer video decoder (e.g., 150) as described herein may be configured to produce image frames with high spatial resolution content in both vertical and horizontal directions based on the first and second multiplexed 3D image frame (e.g., 108-H and 108-V). In an example embodiment, the BL encoder (110) generates, based at least in part on the first multiplexed 3D image frame (e.g., 108-H), a base layer video signal to be carried in a base layer frame compatible video stream (BL FC video stream 112-1), while the EL encoder (116) generates, based at least in part on the second multiplexed 3D image frame (e.g., 108-V), an enhancement layer video signal to be carried in an enhancement layer frame compatible video stream (EL FC video stream 112-3). One or both of the BL encoder (110) and the EL encoder (116) may be implemented using one or more of a plurality of codecs, such as H.264/AVC, VP8, VC-1, and/or others.
An enhancement layer video signal as described herein may be generated using a hybrid video coding method (e.g., implemented by video codecs, such as VC-1, H.264/AVC, and/or others). The image data in the multiplexed 3D image frame 108-V may be predicted either from neighboring samples in the same image frame (using intra prediction) or from samples from past decoded image frames (inter prediction) that belong to the same layer and are buffered as motion-compensated prediction references within a prediction reference image frame buffer. Inter-layer prediction may also be at least in part based on decoded information from other layers (e.g., the base layer, etc.).
Additionally and/or optionally, the multi-layer video encoder (100) may comprise a reference processing unit (RPU, 114) to perform operations relating to prediction. Prediction as implemented by the reference processing unit (114) may be used to reduce the redundant data and overhead in constructing multiplexed 3D image frames in the multi-layer video decoder (150). The RPU (114) may receive and make use of BL image data and other prediction-related information from the BL Encoder 110, and generate a prediction reference image frame through intra or inter prediction.
In those example embodiments that make use of such predictions, the EL encoder (116) generates, based at least in part on the second multiplexed 3D image frame (108-V) and the prediction reference image frame, multiplexed 3D image residuals or differences between the prediction reference image frame and the second multiplexed 3D image frame 108-V and stores the image residuals in the enhancement layer video signal to be carried in the EL FC video stream (112-3). Further, based on the prediction and coding process, the RPU (114) may generate coding information which can be transmitted to a decoder as metadata using an RPU stream (112-2).
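The residual relationship described above can be illustrated with a minimal sketch. It assumes a lossless, per-pixel difference and omits the transform, quantization, and entropy coding that an actual EL encoder would apply; the function names and sample values are hypothetical.

```python
from typing import List

Frame = List[List[int]]  # a frame as rows of integer pixel values


def el_residual(frame_108_v: Frame, prediction: Frame) -> Frame:
    """Encoder side: per-pixel difference between the EL frame and the prediction reference."""
    return [[a - p for a, p in zip(f_row, p_row)]
            for f_row, p_row in zip(frame_108_v, prediction)]


def el_reconstruct(prediction: Frame, residual: Frame) -> Frame:
    """Decoder side: add the residual back onto the same prediction reference."""
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, residual)]


frame_108_v = [[10, 12], [14, 16]]   # second multiplexed 3D image frame (toy values)
prediction = [[9, 12], [15, 13]]     # hypothetical prediction reference from the RPU
residual = el_residual(frame_108_v, prediction)
```

The closer the prediction reference is to the second multiplexed 3D image frame, the smaller the residual values that the enhancement layer has to carry.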
FIG. 1B illustrates a multi-layer video decoder (150) that receives input video signals in which high spatial frequency content from an original video sequence (which may be the input video sequence as discussed in connection with FIG. 1A) in two orthogonal directions has been preserved in complementary image data carried in the enhancement layer and in the base layer, respectively, in accordance with an embodiment. In an example embodiment, the input video signals are received in multiple layers (or multiple bitstreams). As used herein, the term "multi-layer" or "multiple layers" may refer to two or more bitstreams that carry input video signals having one or more logical dependency relationships between one another (of the input video signals).
In an example embodiment, the multi-layer video decoder (150) is configured to decode one or more input video signals in the BL FC video stream (112-1 of FIG. 1B), EL RPU stream (112-2 of FIG. 1B), and EL FC video stream (112-3 of FIG. 1B) into a sequence of (full resolution) 3D output images. A 3D output image in the sequence of 3D output images as decoded by the multi-layer video decoder (150) comprises high spatial resolution content for both eyes, as high spatial frequency content in the original video sequence that gives rise to the input video signals has been preserved in both horizontal and vertical directions.
In an example embodiment, a BL decoder (152) generates, based at least in part on a BL video signal received from BL FC video stream (112-1 of FIG. 1B), a first multiplexed 3D image frame (158-H), while an EL decoder (156) generates, based at least in part on an EL video signal received from EL FC video stream (112-3 of FIG. 1B), a second multiplexed 3D image frame (158-V). One or both of the BL decoder (152) and the EL decoder (156) may be implemented using one or more of a plurality of codecs, such as H.264/AVC, VP8, VC-1, and/or others.
In the embodiments that make use of prediction, a decoder-side RPU (154) generates, based at least in part on a reference video signal received from EL RPU stream (112-2 of FIG. 1B) and/or BL image data from the BL decoder (152), a prediction reference image frame. Further, EL decoder (156) generates, based at least in part on the EL video signal in EL FC video stream (112-3 of FIG. 1B) and the prediction reference image frame from the RPU (154), the second multiplexed 3D image frame (158-V).
The multi-layer video decoder (150) may combine complementary image data received in one or more enhancement layers (e.g., EL RPU stream 112-2 and EL FC video stream 112-3) with image data received in a base layer (e.g., BL FC video stream 112-1) to produce full resolution LE and RE output image frames (e.g., 162-L and 162-R) that comprise high spatial frequency content in both vertical and horizontal directions. For example, a demultiplexer (DeMux, 160) may be configured to de-multiplex the multiplexed 3D image frames (158-H and 158-V) into the LE and RE output image frames (162-L and 162-R) with high spatial frequency content. While the multiplexed 3D image frames (158-H and 158-V) each comprise image data for both left and right eyes, each of the LE and RE output image frames (162-L and 162-R) is only for one of left and right eyes. A first LE image data portion in the first multiplexed 3D image frame (158-H) may be combined with a second LE image data portion in the second multiplexed 3D image frame (158-V) to form the LE output image (162-L) that comprises high spatial frequency content in both vertical and horizontal directions. Similarly, a first RE image data portion in the first multiplexed 3D image frame (158-H) may be combined with a second RE image data portion in the second multiplexed 3D image frame (158-V) to form the RE output image (162-R) that comprises high spatial frequency content in both vertical and horizontal directions.
The full resolution LE and RE output image frames (162-L and 162-R), both of which comprise high spatial frequency content in both vertical and horizontal directions, may be rendered by a display device (which, for example, may comprise the multi-layer video decoder 150) to present a full resolution output 3D image. Rendering the full resolution LE and RE output image frames may be, but is not limited to being, in a frame-sequential manner. Because high spatial frequency content has been preserved in the video signals as received by the multi-layer video decoder (150), the full resolution output 3D image contains high spatial frequency image details that may exist in an original 3D image (which may be one of the 3D input images of FIG. 1A).
FIG. 1C illustrates a base-layer video decoder (150-1) that receives one or more input video signals generated from an original video sequence (which may be the input video sequence as discussed in connection with FIG. 1A), in accordance with an embodiment. In an example embodiment, the base-layer video decoder (150-1) is configured to decode a BL input video signal as received from a base layer (BL FC video stream 112-1 of FIG. 1C) into a sequence of 3D output images, regardless of whether video signals in other layers may be present or not in physical signals received by the decoder. In an example embodiment, the base-layer video decoder (150-1) is configured to ignore any presence of video signals in other streams other than the BL FC video stream (112-1).
A 3D output image in the sequence of 3D output images as produced by the base layer video decoder (150-1) does not comprise full resolution 3D image data, as high spatial frequency content along one of the vertical and horizontal directions in the original video sequence that gives rise to the input video signals has been filtered/removed in the base layer video signal and cannot be recovered by the base-layer video decoder (150-1).
In an example embodiment, a BL decoder (152 of FIG. 1C) generates, based at least in part on the BL input video signal in BL FC video stream (112-1 of FIG. 1C), a multiplexed 3D image frame (e.g., 158-H of FIG. 1C). The BL decoder (152 of FIG. 1C) may be implemented using one or more of a plurality of codecs, such as H.264/AVC, VP8, VC-1, and/or others.
In an example embodiment, an up-sampling unit (170) de-multiplexes and/or separates the multiplexed 3D image frame (158-H) into two image data portions. While the multiplexed 3D image frame (158-H) comprises multiplexed filtered image data for both left and right eyes, the image data portions comprise a filtered LE image data portion and a filtered RE image data portion, each of which is at a reduced resolution below the full resolution. In an example embodiment, the up-sampling unit (170) up-samples (e.g., expanding along the horizontal direction) the filtered LE image data portion to form an up-sampled LE filtered output image frame (172-L). Similarly, the up-sampling unit (170) up-samples (e.g., expanding along the horizontal direction) the filtered RE image data portion to form an up-sampled RE filtered output image frame (172-R). Even though each of the up-sampled LE and RE filtered image frames (172-L and -R) may comprise the same number of pixels as a full resolution image frame, the rendered 3D image with the up-sampled LE and RE filtered image frames (172-L and -R) has a fuzzier look than a 3D image made up of full resolution LE and RE image frames (162-L and -R of FIG. 1B) not obtained by an up-sampling operation. In addition, the up-sampled LE and RE filtered image frames (172-L and -R) lack the high spatial frequency image details that were removed in the encoding process of the BL video signals (which may be derived from, for example, 112-1 of FIG. 1A).
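The base-layer-only path described above can be sketched as follows. This is a minimal sketch that assumes a side-by-side frame with the LE portion on the left and uses simple pixel repetition for the horizontal up-sampling; the interpolation method used by the up-sampling unit (170) is not specified here, and the function names are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]  # a frame as rows of pixel values


def demux_side_by_side(frame: Frame) -> Tuple[Frame, Frame]:
    """Split a side-by-side frame into LE (left half) and RE (right half) portions (assumed order)."""
    half = len(frame[0]) // 2
    le = [row[:half] for row in frame]
    re = [row[half:] for row in frame]
    return le, re


def upsample_h(portion: Frame) -> Frame:
    """Double the width by repeating each pixel (an assumed nearest-neighbor up-sampler)."""
    return [[v for v in row for _ in range(2)] for row in portion]


frame_158_h = [[1, 2, 3, 4],
               [5, 6, 7, 8]]          # toy multiplexed frame
le_portion, re_portion = demux_side_by_side(frame_158_h)
frame_172_l = upsample_h(le_portion)  # plays the role of 172-L
frame_172_r = upsample_h(re_portion)  # plays the role of 172-R
```

The up-sampled frames regain the full pixel count, but no new detail is created: every added column only repeats data already present in the half resolution portion.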
The up-sampled LE and RE filtered image frames (172-L and -R) below the full resolution may be rendered by a display device (which for example may comprise the base-layer video decoder 150-1) to present an output 3D image. Rendering the up-sampled LE and RE filtered image frames (172-L and -R) may be, but is not limited to being, in a frame-sequential manner.
3. DEMULTIPLEXERS
FIG. 2, FIG. 3, and FIG. 4 illustrate different configurations of demultiplexer 160 of FIG. 1B, according to some example embodiments. Each of the demultiplexers (160-1 through 160-5) may be configured to accept a first LE portion (202-L) decoded/derived from a first multiplexed 3D image frame (e.g., 158-H) and a second LE portion (204-L) decoded/derived from a second multiplexed 3D image frame (e.g., 158-V). The first LE portion (202-L) may comprise high spatial frequency content in one (for example, vertical direction) of the vertical and horizontal directions, while the second LE portion (204-L) may comprise high spatial frequency content in the other (horizontal direction in the same example) of the vertical and horizontal directions.
Each of the demultiplexers (160-1 through 160-5) may be configured to process and combine the first LE portion (202-L) and the second LE portion (204-L) to generate a full resolution LE image frame (162-L) that comprises high spatial frequency content in both vertical and horizontal directions.
A demultiplexer similar to each of the demultiplexers (160-1 through 160-5) may be configured to accept a first RE portion decoded/derived from the first multiplexed 3D image frame (158-H) and a second RE portion decoded/derived from the second multiplexed 3D image frame (158-V). The first RE portion may comprise high spatial frequency content in the vertical direction, while the second RE portion may comprise high spatial frequency content in the horizontal direction. The demultiplexer may be configured to process and combine the first RE portion and the second RE portion to generate a full resolution RE image frame (e.g., 162-R of FIG. 1B) that comprises high spatial frequency content in both vertical and horizontal directions.
As illustrated in FIG. 2, an up-sampler (206-H) in the demultiplexer (160-1) may be configured to up-sample the first LE portion (202-L) in the horizontal direction to create a first LE image frame (208-H). Similarly, an up-sampler (206-V) in the demultiplexer (160-1) may be configured to up-sample the second LE portion (204-L) in the vertical direction to create a second LE image frame (208-V).
A first low pass filter (210-1) and a first high pass filter (210-2) may be applied to the first LE image frame (208-H) to yield a first low pass LE image frame (212-1) and a first high pass LE image frame (212-2). Similarly, a second low pass filter (214-1) and a second high pass filter (214-2) may be applied to the second LE image frame (208-V) to yield a second low pass LE image frame (216-1) and a second high pass LE image frame (216-2).
An averaging unit (218) in the demultiplexer (160-1) may be configured to accept the first low pass LE image frame (212-1) and the second low pass LE image frame (216-1) as input and to apply an averaging operation on the first low pass LE image frame (212-1) and the second low pass LE image frame (216-1) to generate a low pass averaged LE image frame.
An adder (220) in the demultiplexer (160-1) may be configured to accept the first high pass LE image frame (212-2), the second high pass LE image frame (216-2) and the low pass averaged LE image frame as input and to apply an adding operation on the first high pass LE image frame (212-2), the second high pass LE image frame (216-2) and the low pass averaged LE image frame to generate the full resolution LE image frame (162-L) that comprises high spatial frequency content in both vertical and horizontal directions.
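The data flow of the demultiplexer (160-1) of FIG. 2 can be sketched as follows. The filter kernels and their directions are not specified in this passage, so the sketch makes several assumptions: each low pass filter is a [1, 2, 1]/4 kernel applied along the direction in which the frame still carries high spatial frequency content (vertical for 208-H, horizontal for 208-V), each high pass output is the frame minus its low pass output, and up-sampling is done by pixel repetition. All function names are hypothetical.

```python
from typing import List

Frame = List[List[float]]  # a frame as rows of pixel values


def transpose(frame: Frame) -> Frame:
    return [list(col) for col in zip(*frame)]


def lowpass_h(frame: Frame) -> Frame:
    """Assumed [1, 2, 1]/4 low-pass along rows, replicating edge pixels."""
    out = []
    for row in frame:
        n = len(row)
        out.append([(row[max(x - 1, 0)] + 2 * row[x] + row[min(x + 1, n - 1)]) / 4
                    for x in range(n)])
    return out


def lowpass_v(frame: Frame) -> Frame:
    """The same kernel applied along columns."""
    return transpose(lowpass_h(transpose(frame)))


def subtract(a: Frame, b: Frame) -> Frame:
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def upsample_h(portion: Frame) -> Frame:
    """Double the width by pixel repetition (assumed up-sampler 206-H)."""
    return [[v for v in row for _ in range(2)] for row in portion]


def upsample_v(portion: Frame) -> Frame:
    """Double the height by row repetition (assumed up-sampler 206-V)."""
    return [list(row) for row in portion for _ in range(2)]


def demux_fig2(le_202: Frame, le_204: Frame) -> Frame:
    frame_208_h = upsample_h(le_202)
    frame_208_v = upsample_v(le_204)
    low_212_1 = lowpass_v(frame_208_h)              # assumed behavior of 210-1
    high_212_2 = subtract(frame_208_h, low_212_1)   # assumed behavior of 210-2
    low_216_1 = lowpass_h(frame_208_v)              # assumed behavior of 214-1
    high_216_2 = subtract(frame_208_v, low_216_1)   # assumed behavior of 214-2
    # Averaging unit 218 followed by adder 220.
    return [[(l1 + l2) / 2 + h1 + h2
             for l1, l2, h1, h2 in zip(r1, r2, r3, r4)]
            for r1, r2, r3, r4 in zip(low_212_1, low_216_1, high_212_2, high_216_2)]


le_202 = [[8.0] * 2 for _ in range(4)]   # full height, half width
le_204 = [[8.0] * 4 for _ in range(2)]   # half height, full width
frame_162_l = demux_fig2(le_202, le_204)
```

With flat (constant) input portions, both high pass frames are zero and the output equals the averaged low pass frame, which is a quick sanity check that the low pass content is counted once rather than twice.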
In some embodiments, low pass filtering has been applied by a multi-layer video encoder (e.g., 100 of FIG. 1A), for example, for anti-aliasing purposes. The demultiplexer (160-1) of FIG. 2 may be simplified according to some embodiments as illustrated in FIG. 3A and FIG. 3B. In these embodiments, decoding complexity may be reduced by eliminating low pass filters (e.g., 210-1 and 214-1) and an averaging unit (218) that may be used in the demultiplexer 160-1 of FIG. 2.
As illustrated in FIG. 3A, an up-sampler (206-H) in the demultiplexer (160-2) may be configured to up-sample the first LE portion (202-L) in the horizontal direction to create a LE image frame (208-H) that comprises high spatial frequency content in the vertical direction. A high pass filter (214-2) that preserves high spatial frequency content in the horizontal direction may be applied to the second LE portion (204-L) to yield a high pass LE portion. An up-sampler (206-V) in the demultiplexer (160-2) may be configured to up-sample the high pass LE portion in the vertical direction to create a high pass LE image frame (216-2) that comprises high spatial frequency content in the horizontal direction.
An adder (222) in the demultiplexer (160-2) may be configured to accept the high pass LE image frame (216-2) and the LE image frame (208-H) as input and to apply an adding operation on the high pass LE image frame (216-2) and the LE image frame (208-H) to generate the full resolution LE image frame (162-L) that comprises high spatial frequency content in both vertical and horizontal directions.
In some embodiments, instead of using a demultiplexer (160-2) as illustrated in FIG. 3A to derive a full resolution LE image frame (e.g., 162-L), a demultiplexer (160-3) as illustrated in FIG. 3B may be used. In FIG. 3B, an up-sampler (206-V) in the demultiplexer (160-3) may be configured to up-sample the second LE portion (204-L) in the vertical direction to create a LE image frame (208-V) that comprises high spatial frequency content in the horizontal direction.
A high pass filter (210-2) that preserves high spatial frequency content in the vertical direction may be applied to the first LE portion (202-L) to yield a high pass LE portion. An up-sampler 206-H in the demultiplexer (160-3) may be configured to up-sample the high pass LE portion in the horizontal direction to create a high pass LE image frame (212-2) that comprises high spatial frequency content in the vertical direction.
An adder (222) in the demultiplexer (160-3) may be configured to accept the high pass LE image frame (212-2) and the LE image frame (208-V) as input and to apply an adding operation on the high pass LE image frame (212-2) and the LE image frame (208-V) to generate the full resolution LE image frame (162-L) that comprises high spatial frequency content in both vertical and horizontal directions.
In some embodiments, instead of using a high pass filter (e.g., 214-2 of FIG. 3A or 210-2 of FIG. 3B) in a demultiplexer (160-2 or 160-3) as illustrated in FIG. 3A and FIG. 3B to derive a full resolution LE image frame (e.g., 162-L), a subtraction operation as illustrated in FIG. 4A may be used. Compared with high pass filtering operations, a subtraction operation such as 402 may be computationally less complex.
For the purpose of illustration only, the first LE portion (202-L) may be decoded from a BL video signal, while the second LE portion (204-L) may be decoded from an EL video signal. As illustrated in FIG. 4A, a reference LE portion 404, which may be of the same spatial dimensions as the second LE portion (204-L), may be generated based on the first LE portion (202-L) by RPU 406 (which may be, for example, 154 of FIG. 1B) during an inter-layer prediction process. In some embodiments, a processing path or sub-path comprising a low pass filter (408) in the vertical direction (which removes high spatial frequency content in the vertical direction), a vertical subsampler (410; which may keep, for example, every other row), and a horizontal up-sampler (412) may be used to generate the reference LE portion 404.
The subtraction operation (402) may be configured to accept the reference LE portion 404 and the second LE portion (204-L) as input. High spatial frequency content in the vertical direction has been removed from the reference LE portion 404, in addition to the high spatial frequency content in the horizontal direction already removed by an upstream multi-layer video encoder (e.g., 100 of FIG. 1A). The subtraction operation (402) may be configured to subtract the reference LE portion 404 from the second LE portion (204-L) to generate a high pass LE portion that comprises high spatial frequency content in the horizontal direction only. In some embodiments, the high pass LE portion as described herein may be equivalent to the high pass LE portion generated by the high pass filter (214-2) of FIG. 3A.
Similar to the demultiplexer (160-2) of FIG. 3A, the demultiplexer (160-4) of FIG. 4A may comprise an up-sampler 206-V that may be configured to up-sample the high pass LE portion in the vertical direction to create a high pass LE image frame (216-2) that comprises high spatial frequency content in the horizontal direction.
An adder (222) in the demultiplexer (160-4) may be configured to accept the high pass LE image frame (216-2) and a LE image frame (208-H) (generated from the first LE portion 202-L by a horizontal up-sampler 206-H) as input and to apply an adding operation on the high pass LE image frame (216-2) and the LE image frame (208-H) to generate the full resolution LE image frame (162-L) that comprises high spatial frequency content in both vertical and horizontal directions.
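The subtraction-based path of FIG. 4A may be sketched as follows. This is a sketch under assumptions rather than the actual RPU processing: the vertical low pass filter (408) is modeled as a [1, 2, 1]/4 kernel, the subsampler (410) keeps every other row, and all up-samplers repeat samples. The function names are hypothetical.

```python
# Illustrative sketch of the FIG. 4A demultiplexer (assumed filters).

def lowpass_v(img):
    # Assumed low pass filter 408: [1, 2, 1]/4 vertically, edges replicated.
    n = len(img)
    return [[(img[max(r - 1, 0)][c] + 2 * img[r][c] + img[min(r + 1, n - 1)][c]) / 4
             for c in range(len(img[0]))] for r in range(n)]

def subsample_v(img):
    # Vertical subsampler 410: keep every other row.
    return [list(row) for row in img[0::2]]

def upsample_h(img):
    # Horizontal up-sampler (412 or 206-H): repeat every column once.
    return [[v for v in row for _ in range(2)] for row in img]

def upsample_v(img):
    # Vertical up-sampler 206-V: repeat every row once.
    return [list(row) for row in img for _ in range(2)]

def combine(a, b, op):
    # Element-wise combination of two equally sized frames.
    return [[op(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def reference_portion(first_portion):
    # Reference LE portion 404, same dimensions as the second LE portion.
    return upsample_h(subsample_v(lowpass_v(first_portion)))

def demux_fig4a(first_portion, second_portion):
    ref = reference_portion(first_portion)
    high = combine(second_portion, ref, lambda x, y: x - y)   # subtraction 402
    high_frame = upsample_v(high)                             # 216-2
    base_frame = upsample_h(first_portion)                    # 208-H
    return combine(high_frame, base_frame, lambda x, y: x + y)  # adder 222

# A flat first portion (from the BL) and a second portion (from the EL)
# that carries extra horizontal detail.
first = [[8, 8] for _ in range(4)]
second = [[8, 10, 8, 10] for _ in range(2)]
full = demux_fig4a(first, second)
```

With these inputs every row of `full` carries the alternating horizontal pattern of the second portion, which illustrates how the subtraction isolates horizontal detail before it is added back.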
As illustrated in FIG. 4B, a reference LE portion 424, which may be of the same spatial dimensions as the first LE portion (202-L), may be generated based on the second LE portion (204-L) by RPU 426 (which may be, for example, 154 of FIG. 1B) during an inter-layer prediction process. In some embodiments, a processing path or sub-path comprising a low pass filter (428) in the horizontal direction (which removes high spatial frequency content in the horizontal direction), a horizontal subsampler (430; which may keep, for example, every other column), and a vertical up-sampler (432) may be used to generate the reference LE portion 424.
A subtraction operation (422) in the demultiplexer (160-5) may be configured to accept the reference LE portion 424 and the first LE portion (202-L) as input. High spatial frequency content in the horizontal direction has been removed from the reference LE portion 424, in addition to the high spatial frequency content in the vertical direction already removed by an upstream multi-layer video encoder (e.g., 100 of FIG. 1A). The subtraction operation (422) may be configured to subtract the reference LE portion 424 from the first LE portion (202-L) to generate a high pass LE portion that comprises high spatial frequency content in the vertical direction only. In some embodiments, the high pass LE portion as described herein may be equivalent to the high pass LE portion generated by the high pass filter (210-2) of FIG. 3B.
Similar to the demultiplexer (160-3) of FIG. 3B, the demultiplexer (160-5) of FIG. 4B may comprise an up-sampler 206-H that may be configured to up-sample the high pass LE portion in the horizontal direction to create a high pass LE image frame (212-2) that comprises high spatial frequency content in the vertical direction.
An adder (222) in the demultiplexer (160-5) may be configured to accept the high pass LE image frame (212-2) and a LE image frame (208-V) (generated from the second LE portion 204-L by a vertical up-sampler 206-V) as input and to apply an adding operation on the high pass LE image frame (212-2) and the LE image frame (208-V) to generate the full resolution LE image frame (162-L) that comprises high spatial frequency content in both vertical and horizontal directions.
4. SAMPLING FORMATS
FIG. 5 illustrates multiplexing formats, in some example embodiments. As illustrated, an LE image frame (e.g., 102-L or an LE image derived therefrom) and an RE image frame (e.g., 102-R or an RE image derived therefrom) may comprise LE and RE pixel values for a plurality of pixels (a_i, b_i, c_i, d_i, etc., wherein i may be a positive integer) in a 3D image frame. In some embodiments, a side-by-side multiplexed image frame such as the multiplexed image frame 108-H of FIG. 1A may use any of a plurality of side-by-side multiplexing formats (108-H-1, 108-H-2, 108-H-3, or other side-by-side multiplexing formats) to host image data horizontally subsampled from the LE and RE image frames (102-L and 102-R). In some embodiments, a top-and-bottom multiplexed image frame such as the multiplexed image frame 108-V of FIG. 1A may use any of a plurality of top-and-bottom multiplexing formats (108-V-1, 108-V-2, 108-V-3, or other top-and-bottom multiplexing formats) to host image data vertically subsampled from the LE and RE image frames (102-L and 102-R).
A multiplexer such as 106-H or 106-V of FIG. 1A may make one or more selections of multiplexing formats from the pluralities of multiplexing formats based on one or more factors (e.g., related to subsampling methods adopted by the multiplexer) and may signal the selections of multiplexing formats to a multi-layer video decoder (e.g., 150 of FIG. 1B) as metadata using, for example, an RPU stream 112-2. An RPU unit (e.g., 406 of FIG. 4A or 426 of FIG. 4B) in the multi-layer video decoder (150) may construct an inter-layer reference frame or image portion (e.g., 404 of FIG. 4A or 424 of FIG. 4B) based on a multiplexed 3D image frame (e.g., corresponding to 108-H of FIG. 1A) decoded from a BL video signal, taking into consideration the subsampling methods adopted by the upstream multi-layer video encoder in generating the multiplexed 3D image frames (108-H and 108-V) that are (differentially or non-differentially) encoded in the BL and EL layers.
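Side-by-side and top-and-bottom packing may be sketched as follows. The sketch assumes that a multiplexing format can be modeled by a sampling phase (which of every two columns or rows is kept) for each eye; that model, the parameter names `phase_le` and `phase_re`, and the function names are hypothetical and are not taken from FIG. 5. Anti-aliasing filtering before subsampling is omitted for brevity.

```python
# Illustrative sketch of frame-compatible packing (assumed phase-based formats).

def side_by_side(le, re, phase_le=0, phase_re=0):
    # Keep every other column of each eye; LE goes left, RE goes right.
    return [l_row[phase_le::2] + r_row[phase_re::2] for l_row, r_row in zip(le, re)]

def top_and_bottom(le, re, phase_le=0, phase_re=0):
    # Keep every other row of each eye; LE goes on top, RE on the bottom.
    top = [list(row) for row in le[phase_le::2]]
    bottom = [list(row) for row in re[phase_re::2]]
    return top + bottom

le = [[1, 2, 3, 4], [5, 6, 7, 8]]
re = [[11, 12, 13, 14], [15, 16, 17, 18]]

sbs_same_phase = side_by_side(le, re)            # both eyes keep even columns
sbs_offset_phase = side_by_side(le, re, 0, 1)    # RE keeps odd columns instead
tab = top_and_bottom(le, re)
```

Both packed frames have the same dimensions as one input frame, which is what lets a frame-compatible decoder treat them as ordinary pictures; the selected phases would have to be signaled to the decoder, for example as metadata.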
5. INTERLACED VIDEO APPLICATIONS
FIG. 6A illustrates interlaced content (602) of the same perspective (either left eye or right eye) forming an image portion (606) in a top-and-bottom format, in some example embodiments. In FIG. 6A, the interlaced content (602) may be first demultiplexed into a (e.g., 1080i) top field (604-T) for a first time equal to t and a (e.g., 1080i) bottom field (604-B) for a second time equal to t + 1. In FIG. 6A, the top field (604-T) and the bottom field (604-B) may each be vertically filtered and vertically subsampled to a first half field (608-1) (e.g., with one half, or with less than one half of the full spatial resolution, or with another resolution lower than the full spatial resolution, in the vertical direction) and a second half field (608-2) (e.g., with one half, or with less than one half of the full spatial resolution, or with another resolution lower than the full spatial resolution, in the vertical direction), respectively. Further, in FIG. 6A, the first half field (608-1) and the second half field (608-2) may be interleaved into a top or bottom field (606) in a first interlaced image frame.
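The field-based processing of FIG. 6A may be sketched as follows. The sketch assumes that the top field occupies the even rows and the bottom field the odd rows of the interlaced frame, that each field is subsampled by keeping every other row, and that vertical filtering is omitted; these choices and the function names are illustrative assumptions only.

```python
# Illustrative sketch of FIG. 6A: fields -> half fields -> interleaved portion.

def split_fields(frame):
    # Assumed layout: even rows form the top field, odd rows the bottom field.
    return frame[0::2], frame[1::2]

def subsample_v(field):
    # Keep every other row of a field (vertical filtering omitted here).
    return field[0::2]

def interleave_rows(half_a, half_b):
    # Alternate rows from the two half fields.
    out = []
    for row_a, row_b in zip(half_a, half_b):
        out.append(list(row_a))
        out.append(list(row_b))
    return out

def top_and_bottom_portion(frame):
    top_field, bottom_field = split_fields(frame)        # 604-T, 604-B
    half_1 = subsample_v(top_field)                      # 608-1
    half_2 = subsample_v(bottom_field)                   # 608-2
    return interleave_rows(half_1, half_2)               # 606

# Each row is filled with its own row index so the row order is visible.
frame = [[r, r] for r in range(8)]
portion = top_and_bottom_portion(frame)
```

The resulting portion has half the rows of the input frame and alternates rows that originate from the top field and from the bottom field.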
FIG. 6B illustrates interlaced content (602) of the same perspective (either left eye or right eye) forming an image portion (626) in a side-by-side format, in some example embodiments. In some embodiments, the interlaced content (602) is not required to be demultiplexed into two separate fields before horizontal filtering and horizontal subsampling and then interlaced into the image portion (626). Instead, the interlaced content (602) may be directly horizontally filtered and horizontally subsampled into the image portion (626). The operations illustrated in FIG. 6B may be performed for each of the LE and RE perspectives, and may constitute one half of the filtering and sampling mechanism (104-H).
Operations described in FIG. 6A and FIG. 6B may be applied to each of the left eye and right eye perspectives, thereby forming the first interleaved image frame in a top-and-bottom format (as illustrated in part in FIG. 6A) and the second interleaved image frame in a side-by-side format (as illustrated in part in FIG. 6B). One of the first interleaved image frame and the second interleaved image frame may be carried in the BL video signal, while the other (either differentially or non-differentially encoded) interleaved image frame may be carried in the EL video signal. As a result, one (e.g., the first interleaved image frame in the present example) of the first interleaved image frame and the second interleaved image frame carries high spatial frequency content in the horizontal direction, while the other (e.g., the second interleaved image frame in the present example) of the first interleaved image frame and the second interleaved image frame carries high spatial frequency content in the vertical direction.
FIG. 7 illustrates multiplexing formats for carrying interlaced content, in some example embodiments. As illustrated in FIG. 6A and FIG. 6B, low pass filtering and subsampling in the horizontal or vertical direction may be field based for interlaced content. A left image field (708-L) and a right image field (708-R) may correspond to LE and RE perspectives of the interlaced content for a time equal to t, respectively. Pixel values of one of the left image field (708-L) and the right image field (708-R) may be used to populate a left or right side of any of a plurality of multiplexing (or subsampling) formats (708-H-1, 708-H-2, 708-H-3, etc.) as shown in FIG. 7 when horizontal filtering and horizontal subsampling are applied as illustrated in FIG. 6B. Similarly, pixel values of one of the left image field (708-L) and the right image field (708-R) may be used to populate a top or bottom of any of a plurality of multiplexing (or subsampling) formats (708-V-1, 708-V-2, 708-V-3, etc.) as shown in FIG. 7 when vertical filtering and vertical subsampling are applied as illustrated in FIG. 6A. In some embodiments, subsampled image data from all image fields, of both LE and RE perspectives and corresponding to both the first time equal to t and the second time equal to t + 1, is present in a top-and-bottom format applied to interlaced content.
6. RESIDUAL IMAGE CODING
FIG. 8A illustrates a multi-layer video encoder (100-1) that maintains high spatial frequency content present in an input video sequence, in accordance with an embodiment of the invention. FIG. 8B shows a multi-layer video decoder (150-2) corresponding to the multi-layer video encoder (100-1) shown in FIG. 8A, in accordance with the example embodiment.
In an example embodiment, the multi-layer video encoder (100-1) is configured to encode an input 3D video sequence that consists of a sequence of 3D input images. A 3D input image in the sequence of 3D input images comprises full resolution 3D image data that contains high spatial frequency content. The full resolution 3D image data in a 3D input image may be initially decoded by the multi-layer video encoder (100) into an input LE image frame (102-L) and an input RE image frame (102-R), both of which contain high spatial frequency content.
In an example embodiment, a first filtering and subsampling mechanism (e.g., 104-H) in the multi-layer video encoder (100) generates LE and RE image data filtered in one of the vertical or horizontal directions but unfiltered in the other of the vertical or horizontal directions based on the input LE and RE image frames (102-L and 102-R). For the purpose of illustration, the first filtering and subsampling mechanism may be 104-H of FIG. 8A configured to filter high spatial frequency content in the horizontal direction from the input LE and RE image frames (102-L and 102-R) and horizontally subsample the LE and RE image frames (102-L and 102-R) as filtered in the horizontal direction into corresponding LE and RE portions. A multiplexer (106-H) may be configured to combine the LE and RE portions in a 3D multiplexed image frame (108-H) in a side-by-side format.
In some embodiments, instead of processing the LE and RE image frames (102-L and 102-R) through an EL processing sub-path comprising filtering, subsampling, multiplexing and compressing into an EL video signal, LE and RE residual image frames (806-L and 806-R) may be processed by the EL processing sub-path. As illustrated in FIG. 8A, an RPU (114) may be configured to generate LE and RE reference image portions based on the multiplexed 3D image frame (108-H) as provided by the BL encoder (110). The LE and RE reference image portions from the RPU (114) may be up-sampled in the same direction as the subsampling direction of the BL processing sub-path. For example, if the BL processing sub-path comprising horizontal filtering, horizontal subsampling, side-by-side multiplexing and BL encoding performs subsampling of the input LE and RE image frames (102-L and 102-R) in the horizontal direction, each of the LE and RE reference image portions generated by the RPU (114) may be up-sampled in the horizontal direction to form an up-sampled LE image frame (804-L) and an up-sampled RE image frame (804-R).
An addition operation (810-L) may be configured to accept the complement of the up-sampled LE image frame 804-L (from which high spatial frequency content in the horizontal direction has been removed by the BL processing sub-path) and the input LE image frame (102-L) as input and to add the input LE image frame (102-L) and the complement of the up-sampled LE image frame 804-L to generate an LE residual image frame (806-L) that comprises high spatial frequency content in the horizontal direction only. An addition operation (810-R) may be configured to accept the complement of the up-sampled RE image frame 804-R (from which high spatial frequency content in the horizontal direction has been removed by the BL processing sub-path) and the input RE image frame (102-R) as input and to add the input RE image frame (102-R) and the complement of the up-sampled RE image frame 804-R to generate an RE residual image frame (806-R) that comprises high spatial frequency content in the horizontal direction only.
A second filtering and subsampling mechanism may be 104-V of FIG. 8A configured to vertically subsample the LE and RE residual image frames (806-L and 806-R) into corresponding LE and RE residual portions. Additionally and optionally, the second filtering and subsampling mechanism may comprise a vertical filter configured to vertically filter the LE and RE residual image frames (806-L and 806-R), for example, before the above-mentioned subsampling operation on the LE and RE residual image frames (806-L and 806-R). A multiplexer (106-V) may be configured to combine the LE and RE residual portions in a 3D multiplexed image frame (808-V) in a top-and-bottom format.
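The residual path for one eye may be sketched as follows. Adding the complement of the up-sampled frame is modeled here as a plain element-wise subtraction. The sketch assumes column decimation without a preceding filter for the BL path, sample repetition for the RPU up-sampling, and row decimation for the EL path; these choices and the function names are hypothetical.

```python
# Illustrative sketch of the residual path of FIG. 8A for one eye.

def bl_portion(frame):
    # Assumed BL sub-path: keep every other column (filtering omitted).
    return [row[0::2] for row in frame]

def upsample_h(portion):
    # Assumed RPU up-sampling: repeat every column once (804-L or 804-R).
    return [[v for v in row for _ in range(2)] for row in portion]

def residual(frame, upsampled):
    # Adding the complement of the up-sampled frame equals subtracting it.
    return [[x - y for x, y in zip(rf, ru)] for rf, ru in zip(frame, upsampled)]

def subsample_v(img):
    # Assumed EL sub-path: keep every other row of the residual frame.
    return [list(row) for row in img[0::2]]

frame = [[1, 2, 3, 4], [5, 6, 7, 8]]          # input frame (e.g., 102-L)
up = upsample_h(bl_portion(frame))            # up-sampled frame (e.g., 804-L)
res = residual(frame, up)                     # residual frame (e.g., 806-L)
res_portion = subsample_v(res)                # residual portion for 808-V
```

In this example the residual is zero wherever the up-sampled BL sample already matches the input and non-zero only where horizontal detail was lost by the BL sub-path.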
In an example embodiment, the BL encoder (110) generates, based at least in part on the first multiplexed 3D image frame (e.g., 108-H), a base layer video signal to be carried in a base layer frame compatible video stream (BL FC video stream 112-1), while the EL encoder (116) generates, based at least in part on the second multiplexed 3D image frame (e.g., 808-V), an enhancement layer video signal to be carried in an enhancement layer frame compatible video stream (EL FC video stream 112-3). One or both of the BL encoder (110) and the EL encoder (116) may be implemented using one or more of a plurality of codecs, such as H.264/AVC, VP8, VC-1, and/or others.
FIG. 8B shows a multi-layer video decoder (150-2) that receives input video signals in which high spatial frequency content from an original video sequence (which may be the input video sequence as discussed in connection with FIG. 8A) in two orthogonal directions has been preserved in complementary image data carried in the enhancement layer and in the base layer, respectively, in accordance with an embodiment.
In an example embodiment, the multi-layer video decoder (150-2) is configured to decode one or more input video signals in the BL FC video stream (112-1 of FIG. 8B), EL RPU stream (112-2 of FIG. 8B), and EL FC video stream (112-3 of FIG. 8B) into a sequence of 3D output images. A 3D output image in the sequence of 3D output images as decoded by the multi-layer video decoder (150-2) comprises high spatial frequency content for both eyes, as high spatial frequency content in the original video sequence that gives rise to the input video signals has been preserved in both the horizontal and vertical directions.
In an example embodiment, a BL decoder (152) generates, based at least in part on a BL video signal received from BL FC video stream (112-1 of FIG. 8B), a first multiplexed 3D image frame (158-H), while an EL decoder (156) generates, based at least in part on an EL video signal received from EL FC video stream (112-3 of FIG. 8B), a second multiplexed 3D image frame (858-V). One or both of the BL decoder (152) and the EL decoder (156) may be implemented using one or more of a plurality of codecs, such as H.264/AVC, VP8, VC-1, and/or others.
In some embodiments, the EL decoder (156) generates, based at least in part on the EL video signal in EL FC video stream (112-3 of FIG. 8B) without a prediction reference image frame from the RPU (154), the second multiplexed 3D image frame (858-V).
The multi-layer video decoder (150-2) may combine residual image data received in one or more enhancement layers (e.g., EL FC video stream 112-3) with image data received in a base layer (e.g., BL FC video stream 112-1) to produce full resolution LE and RE output image frames (e.g., 162-L and 162-R) that comprise high spatial frequency content in both vertical and horizontal directions. For example, a demultiplexer (DeMux, 160) may be configured to de-multiplex the multiplexed 3D image frames (158-H and 858-V) into LE and RE output image frames (162-L and 162-R) with high spatial frequency content. While the multiplexed 3D image frames (158-H and 858-V) each comprise image data for both left and right eyes, each of the LE and RE output image frames (162-L and 162-R) is only for one of left and right eyes. A first LE image data portion in the first multiplexed 3D image frame (158-H) may be combined with a second LE image data portion in the second multiplexed 3D image frame (858-V) to form the LE output image (162-L) that comprises high spatial frequency content in both vertical and horizontal directions. Similarly, a first RE image data portion in the first multiplexed 3D image frame (158-H) may be combined with a second RE image data portion in the second multiplexed 3D image frame (858-V) to form the RE output image (162-R) that comprises high spatial frequency content in both vertical and horizontal directions.
The full resolution LE and RE output image frames (162-L and 162-R), both of which comprise high spatial frequency content in both vertical and horizontal directions, may be rendered by a display device (which, for example, may comprise the multi-layer video decoder 150) to present a full resolution output 3D image. The full resolution LE and RE output image frames may be rendered in, but are not limited to, a frame-sequential manner. Because high spatial frequency content has been preserved in the video signals as received by the multi-layer video decoder (150), the full resolution output 3D image contains high spatial frequency image details that may exist in an original 3D image (which may be one of the 3D input images of FIG. 8A).
In embodiments illustrated in FIG. 8A and FIG. 8B, inter-layer prediction is not required. Decoding of BL and EL image data by a multi-layer video decoder in these embodiments may be independently performed to a greater extent than that in other embodiments (e.g., as illustrated in FIG. 1A and FIG. As a result, a simpler demultiplexer (e.g., 160-6 of FIG. 9) than those illustrated in FIG. 2 through FIG. 4 may be used in a multi-layer video decoder such as 150-2 of FIG. 8B.
It should be noted that residual image data may be subsampled using any of the subsampling formats illustrated in FIG. 5 in progressive video applications, and any of the subsampling formats illustrated in FIG. 7 in interlaced video applications. An RPU signal as described herein may be used to signal, to a downstream video decoder, selected subsampling formats of BL and EL image data as encoded by an upstream multi-layer video encoder.
7. EXAMPLE PROCESS FLOWS
FIG. 10A illustrates an example process flow according to an embodiment of the present invention. In some example embodiments, one or more computing devices or hardware components may perform this process flow. In block 1002, a multi-layer video encoder (e.g., 100) receives an input 3D image, the input 3D image comprising a left eye (LE) input image frame and a right eye (RE) input image frame.
In block 1004, the multi-layer video encoder (100) generates, based on the LE input image frame and the RE input image frame, a first multiplexed image frame comprising first high spatial frequency content in a horizontal direction and first reduced resolution content in a vertical direction.
In block 1006, the multi-layer video encoder (100) generates, based on the LE input image frame and the RE input image frame, a second multiplexed image frame comprising second high spatial frequency content in the vertical direction and second reduced resolution content in the horizontal direction.
In block 1008, the multi-layer video encoder (100) encodes and outputs the first multiplexed image frame and the second multiplexed image frame to represent the input 3D image.
In an embodiment, the 3D input image is a first 3D input image in a sequence of 3D input images comprising a second different 3D input image having a second LE input image frame and a second RE input image frame. The multi-layer video encoder (100) is further configured to perform: generating, based on the second LE input image frame and the second RE input image frame, a third multiplexed image frame comprising third high spatial frequency content in the horizontal direction and third reduced resolution content in the vertical direction; generating, based on the second LE input image frame and the second RE input image frame, a fourth multiplexed image frame comprising fourth high spatial frequency content in the vertical direction and fourth reduced resolution content in the horizontal direction; and encoding and outputting the third multiplexed image frame and the fourth multiplexed image frame to represent the second input 3D image.
In an embodiment, the first multiplexed image frame comprises a first LE image data portion and a first RE image data portion. The first LE image data portion and the first RE image data portion are of a same spatial resolution along both horizontal and vertical directions. The second multiplexed image frame comprises a second LE image data portion and a second RE image data portion. The second LE image data portion and the second RE image data portion are of a same spatial resolution along both horizontal and vertical directions. In an embodiment, each of the first LE image data portion and the first RE image data portion represents a subsampled version (e.g., one half, less than one half, or another reduced number, of the full resolution) of a whole image frame; the first multiplexed image frame adopts a side-by-side format to carry the first LE image data portion and the first RE image data portion. Each of the second LE image data portion and the second RE image data portion represents a subsampled version (e.g., one half, less than one half, or another reduced number, of the full resolution) of a whole image frame; the second multiplexed image frame adopts a top-and-bottom format to carry the second LE image data portion and the second RE image data portion.
In an embodiment, the first multiplexed image frame adopts a first multiplexing format that preserves the high spatial frequency content in the horizontal direction. The second multiplexed image frame adopts a second multiplexing format that preserves the high spatial frequency content in the vertical direction.
In an embodiment, one of the first multiplexed image frame or the second multiplexed image frame is outputted in a base layer bitstream in a plurality of bit streams, while the other of the first multiplexed image frame or the second multiplexed image frame is outputted in an enhancement layer bitstream in the plurality of bit streams.
In an embodiment, the multi-layer video encoder (100) is further configured to perform: generating, based at least in part on the first multiplexed image frame, prediction reference image data; and encoding an enhancement layer video signal based on differences between the prediction reference image data and the second input image frame.
In an embodiment, the multi-layer video encoder (100) is further configured to perform: applying one or more first operations comprising at least one of (a) spatial frequency filtering operations or (b) spatial subsampling operations in the second direction to the first input image frame and the second input image frame in generating the first multiplexed image frame, wherein the one or more first operations remove high spatial frequency content in the second direction and preserve high spatial frequency content in the first direction; and applying one or more second operations comprising at least one of (a) spatial frequency filtering operations or (b) spatial subsampling operations in the first direction to the first input image frame and the second input image frame in generating the second multiplexed image frame, wherein the one or more second operations remove high spatial frequency content in the first direction and preserve high spatial frequency content in the second direction.
In an embodiment, one of first multiplexed image frame or the second multiplexed image frame comprises residual image data generated by subtracting reference image data generated based on the other of the first multiplexed image frame or the second multiplexed image frame from input image data derived from the LE input image frame and the RE input image frame.
In an embodiment, the multi-layer video encoder (100) is further configured to convert one or more 3D input images represented, received, transmitted, or stored with one or more input video signals into one or more 3D output images represented, received, transmitted, or stored with one or more output video signals.
In an embodiment, the input 3D image comprises image data encoded in one of a high dynamic range (HDR) image format, a RGB color space associated with the Academy Color Encoding Specification (ACES) standard of the Academy of Motion Picture Arts and Sciences (AMPAS), a P3 color space standard of the Digital Cinema Initiatives, a Reference Input Medium Metric/Reference Output Medium Metric (RIMM/ROMM) standard, an sRGB color space, a RGB color space associated with the BT.709 Recommendation standard of the International Telecommunication Union (ITU), etc.
FIG. 10B illustrates another example process flow according to an example embodiment of the present invention. In some example embodiments, one or more computing devices may perform this process flow. In block 1052, a multi-layer video decoder (e.g., 150) receives a 3D image represented by a first multiplexed image frame and a second multiplexed image frame, the first multiplexed image frame comprising first high spatial frequency content in a horizontal direction and first reduced resolution content in a vertical direction, and the second multiplexed image frame comprising second high spatial frequency content in the vertical direction and second reduced resolution content in the horizontal direction.
In block 1054, the multi-layer video decoder (150) generates, based on the first multiplexed image frame and the second multiplexed image frame, a left eye (LE) image frame and a right eye (RE) image frame, the LE image frame comprising LE high spatial frequency content in both horizontal and vertical directions, and the RE image frame comprising RE high spatial frequency content in both horizontal and vertical directions.
In block 1056, the multi-layer video decoder (150) renders the 3D image by rendering the LE image frame and the RE image frame.
In an embodiment, the 3D image is a first 3D image in a sequence of 3D images comprising a second different 3D image having a third multiplexed image frame and a fourth multiplexed image frame, the third multiplexed image frame comprising third high spatial frequency content in the horizontal direction and third reduced resolution content in the vertical direction, and the fourth multiplexed image frame comprising fourth high spatial frequency content in the vertical direction and fourth reduced resolution content in the horizontal direction. In an embodiment, the multi-layer video decoder (150) is further configured to perform: generating a second LE image frame and a second RE image frame, the second LE image frame comprising high spatial frequency content in both horizontal and vertical directions, and the second RE image frame comprising high spatial frequency content in both horizontal and vertical directions; and rendering the second 3D image by rendering the second LE image frame and the second RE image frame.
In an embodiment, at least one of the first multiplexed image frame or the second multiplexed image frame comprises an LE image data portion and an RE image data portion. The LE image data portion and the RE image data portion are of a same spatial resolution. In an embodiment, each of the LE image data portion and the RE image data portion represents a subsampled version (e.g., one half, less than one half, or another reduced number, of the full resolution) of a whole image frame. The LE image data portion and the RE image data portion forms a single image frame in one of a side-by-side format or a top-and-bottom format.
In an embodiment, one of the first multiplexed image frame or the second multiplexed image frame is decoded from a base layer bitstream in a plurality of bit streams, while the other of the first multiplexed image frame or the second multiplexed image frame is decoded from an enhancement layer bitstream in the plurality of bit streams.
In an embodiment, the multi-layer video decoder (150) is further configured to perform: generating, based at least in part on one of the first multiplexed image frame or the second multiplexed image frame, prediction reference image data; and generating, based on enhancement layer (EL) data decoded from an EL video signal and the prediction reference image data, one of the LE image frame or the RE image frame.
In an embodiment, the multi-layer video decoder (150) is further configured to perform: applying one or more first operations comprising at least one of (a) spatial frequency filtering operations or (b) demultiplexing operations in generating the LE image frame, wherein the one or more first operations combine LE high spatial frequency content, as derived from the first multiplexed image frame and the second multiplexed image frame, of both horizontal and vertical directions into the LE image frame; and applying one or more second operations comprising at least one of (a) spatial frequency filtering operations or (b) demultiplexing operations in generating the RE image frame, wherein the one or more second operations combine RE high spatial frequency content, as derived from the first multiplexed image frame and the second multiplexed image frame, of both horizontal and vertical directions into the RE image frame.
In an embodiment, the one or more first operations and the one or more second operations comprise at least a high pass filtering operation.
In an embodiment, the one or more first operations and the one or more second operations comprise a processing sub-path that replaces at least one high pass filtering operation. The processing sub-path comprises at least one subtraction operation and no high pass filtering operation.
In an embodiment, one of the first multiplexed image frame or the second multiplexed image frame comprises residual image data. The multi-layer video decoder (150) is further configured to perform: decoding and processing enhancement layer image data without generating prediction reference data from the other of the first multiplexed image frame or the second multiplexed image frame.
In an embodiment, the multi-layer video decoder (150) is further configured to process one or more 3D images represented, received, transmitted, or stored with one or more input video signals.
In an embodiment, the 3D image comprises image data encoded in one of a high dynamic range (HDR) image format, a RGB color space associated with the Academy Color Encoding Specification (ACES) standard of the Academy of Motion Picture Arts and Sciences (AMPAS), a P3 color space standard of the Digital Cinema Initiative, a Reference Input Medium Metric/Reference Output Medium Metric (RIMM/ROMM) standard, an sRGB color space, a RGB color space associated with the BT.709 Recommendation standard of the International Telecommunications Union (ITU), etc.
In various example embodiments, an encoder, a decoder, a system, etc., performs any or a part of the foregoing methods as described.
RESIDUAL IMAGE CODING WITH CARRIER SIGNAL
As depicted in FIG. 8A, in one embodiment, the enhancement layer (EL) stream 112-3 comprises image residuals (e.g., 806-L and 806-R) multiplexed as top-and-bottom (TaB) frames (808-V). Instead of coding this residual signal directly (e.g., by using EL encoder 116), improved compression may be achieved by combining the residual signal with a "carrier" image signal to form a new EL signal. The purpose of using such a carrier signal is to make the enhancement layer look more like a natural video signal for which existing video codecs, such as the H.264/AVC codec, as described in "ISO/IEC 14496-10: Information technology - coding of audio-visual objects - Part 10: Advanced Video Coding", are optimized. An example of such an embodiment is depicted in FIG. 12A.
FIG. 12A illustrates an example FCFR encoder according to an embodiment that utilizes a carrier image signal in the enhancement layer. As depicted in FIG. 12A, the processing of the base layer follows the processing steps discussed earlier, e.g., as depicted in FIG. 8A. Left view (102-L) and right view (102-R) signals are down-sampled and multiplexed in step 1205 to generate a multiplexed half-resolution frame, e.g., in the side-by-side (SbS) format. Step 1205 is a simplified representation of steps 104-H and 106-H depicted in FIG. 8A. After multiplexing, BL signal 1207 is then compressed with a base layer encoder 1210 (e.g., an H.264/AVC encoder) to generate compressed BL stream 1240-1.
BL signal 1207 (or alternatively, decoded BL signal 1212) may be used to regenerate full resolution (FR) versions of the left and right views (Left FR and Right FR) using horizontal up-sampling 804. The reconstructed views are then subtracted from the original views (102-L and 102-R) (e.g., in 810) to generate residuals 806-L and 806-R. Multiplexer 1215 multiplexes these residuals in a frame format (e.g., TaB) that is orthogonal to the frame format being used in the base layer (e.g., SbS) to generate residual signal 808-V. Next, residual 808-V is added to a carrier signal 1222 to generate an EL signal 1237. Carrier signal 1222 may be generated using a Carrier RPU 1220 in response to the SbS BL 1207 signal. The Carrier RPU may perform both horizontal up-sampling and vertical down-sampling to generate a carrier TaB signal 1222 that matches the format and resolution of the residual signal (e.g., 808-V). In an embodiment (see FIG. 15), vertical down-sampling is performed before the horizontal up-sampling. In another embodiment, the carrier signal may be generated in response to a decoded version of the BL stream, e.g., decoded BL signal 1212. In another embodiment, similar processing may be applied when the base layer is in the TaB format and the enhancement layer is in the SbS format (see FIG. 14). Processing related to the Carrier RPU 1220 and the Codec RPU 1225 may be performed by the same processor or different processors.
To maintain the integrity of the original residual (808-V), in an embodiment, the range of the carrier signal may be reduced, e.g., by dividing each pixel by a constant, e.g., 2. Such a division process may be combined with the filtering process in the carrier RPU. In another embodiment, when the residual 808-V is very small, the carrier signal may have a fixed pixel value for all of its pixels, e.g., 128. In another embodiment, the value of the carrier signal may be defined adaptively based on the properties of the residual signal. Such decisions may be signaled to the decoder using metadata (e.g., as part of an RPU data stream).
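The combination of residual and carrier described above can be sketched as follows. This is a minimal illustration, not the encoder of FIG. 12A: the function names, the use of integer division for the range reduction, and the clipping to an 8-bit range are assumptions made for the example.

```python
def constant_carrier(height, width, value=128):
    """Carrier with one fixed pixel value (e.g., 128) for every pixel."""
    return [[value] * width for _ in range(height)]


def form_el_signal(residual, carrier, carrier_divisor=2, bit_depth=8):
    """Add a range-reduced carrier to a residual, pixel by pixel.

    carrier_divisor models "dividing each pixel by a constant, e.g., 2".
    Clipping to [0, 2^bit_depth - 1] is an assumption of this sketch.
    """
    max_val = (1 << bit_depth) - 1
    el = []
    for res_row, car_row in zip(residual, carrier):
        row = []
        for r, c in zip(res_row, car_row):
            value = r + c // carrier_divisor
            row.append(min(max(value, 0), max_val))
        el.append(row)
    return el


residual = [[-3, 5], [0, -1]]
carrier = [[200, 100], [50, 255]]
el = form_el_signal(residual, carrier)

# Fixed-value carrier, used here without further range reduction.
el_flat = form_el_signal(residual, constant_carrier(2, 2), carrier_divisor=1)
```

Whatever choice is made on the encoder side (divisor, fixed value, or an adaptive rule) has to be reproduced by the decoder, which is why the text suggests signaling it as metadata.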
In one embodiment, EL signal 1237 is compressed using multi-view coding (MVC) as specified in the H.264 (AVC) specification to generate a coded or compressed EL stream 1240-2. Since BL signal 1207 and EL signal 1237 do not have the same multiplexed format, a Reference Processing Unit (RPU) 1225 may be employed to convert the decoded SbS BL signal 1212 into a TaB signal (1227) that can be used as a reference by the MVC encoder 1230. This TaB picture (1227) is a newly generated inter-view prediction reference picture and is inserted into the reference picture list for encoding an EL picture. To further improve coding efficiency, Codec RPU 1225 may apply additional partitioning, processing and filtering to match an inter-view reference picture from the BL signal to the EL input signal, as described in PCT application PCT/US2010/040545, filed June 30, 2010, by A. Tourapis, et al., and incorporated herein by reference in its entirety. The choices of filters and partitions being used in the RPUs can be adapted at multiple levels of resolution, e.g., at the slice, picture, Group of Pictures (GOP), scene, or sequence level. On an encoder, the coded EL and BL streams (1240) may be multiplexed with RPU data (e.g., 1240-3) and other auxiliary data (not shown) to be transmitted to a decoder.
FIG. 12B illustrates a decoding example process flow according to an example embodiment of the present invention. At the receiver, the incoming stream is demultiplexed to generate a coded BL stream (1240-1), a coded EL stream (1240-2), and RPU Data (e.g., 1240-3). BL Decoder 1250 corresponds to the BL encoder 1210. In one embodiment, BL decoder 1250 is an AVC decoder. BL decoder 1250 will generate a decoded (e.g., SbS) BL image 1252.
Since the coded EL stream 1240-2 was coded by using reference frames from both the EL signal 1237 and the decoded BL signal 1212, the same process is matched on the decoder as well. Using RPU data 1240-3, the codec RPU 1255 may generate signal 1257 to be used by MVC decoder 1260. Signal 1257 comprises a predicted enhancement layer signal which may be used as an additional sequence of reference frames for the MVC decoder 1260 to generate a decoded EL signal 1262.
After decompressing the BL and EL streams, the decoder needs to reconstruct the left and right views at full resolution. One example embodiment of such a method, to be referred to as "the difference" method, is depicted in FIG. 13A. The vertical or horizontal frequencies that are missing in the base layer can be constructed as a pixel-wise difference of the enhancement layer (1262) and the carrier signal 1302. Carrier signal 1302 can be reconstructed using decoded BL signal 1252 and the decoder RPU (e.g., 1255) using processing that matches the processing of the encoder Carrier RPU (1220). An example of such a process is depicted in FIG. 15. Residue signal 1317 is then up-sampled (e.g., in 1320) and merged by pixel-wise addition with the up-sampled frame-compatible (FC) reconstructed base layer (FC-L and FC-R) to reconstruct the full resolution (FR) left and right views (FR-LE and FR-RE). In one embodiment, to reduce complexity, one may reuse the output from codec RPU 1255 because the codec RPU and carrier RPU tend to share the same filters. In the case when one partition and one filter are employed in the codec RPU, the two RPUs apply exactly the same processing.
As depicted in FIG. 13A, legacy receivers may still decode a pair of half-resolution, frame-compatible, views (FC-L and FC-R), by performing the proper up-sampling on the BL signal 1252.
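A minimal sketch of "the difference" method for a single view follows, assuming an SbS base layer and a TaB enhancement layer so that the residue needs vertical up-sampling. The function names are hypothetical, line repetition stands in for the up-sampling filter (1320), and the final clipping to 8 bits is an assumption of the sketch.

```python
def upsample_vertical_repeat(rows):
    """Stand-in vertical 2x up-sampler: repeat every line once."""
    out = []
    for row in rows:
        out.append(list(row))
        out.append(list(row))
    return out


def reconstruct_view_difference(el_view, carrier_view, fc_view, max_val=255):
    """Rebuild one full-resolution view.

    el_view, carrier_view: half-height, full-width pictures for this view.
    fc_view: frame-compatible view already up-sampled to full resolution.
    """
    # Residue = enhancement layer minus carrier, pixel by pixel.
    residue = [[e - c for e, c in zip(el_row, car_row)]
               for el_row, car_row in zip(el_view, carrier_view)]
    residue_up = upsample_vertical_repeat(residue)
    # Merge by pixel-wise addition with the up-sampled base layer view.
    return [[min(max(f + r, 0), max_val) for f, r in zip(fc_row, res_row)]
            for fc_row, res_row in zip(fc_view, residue_up)]


el_view = [[130, 126]]
carrier_view = [[128, 128]]
fc_view = [[100, 100], [102, 98]]
fr_view = reconstruct_view_difference(el_view, carrier_view, fc_view)
```

The decoder-side carrier must be produced with the same processing as on the encoder, otherwise the subtraction leaves carrier energy in the residue.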
FIG. 13B depicts another example embodiment for reconstructing the full resolution signal, the method to be referred to as "the high pass method." Under this method, the decoded TaB EL signal 1262 is first processed by a horizontal high-pass filter 1330. Such a filter removes the low-frequency components of the carrier signal in the EL signal, thus generating a carrier-free residual signal (e.g., 808-V). The output of the high-pass filter is up-sampled vertically (1320) to generate residual signals 1322-L and 1322-R, which are added to the horizontally up-sampled, reconstructed frame-compatible layer (FC-L and FC-R), to generate full-resolution estimates of the original views (e.g., FR-RE and FR-LE).
If the source video is interlaced, in an embodiment, the top field and bottom field are processed independently during the vertical filtering (e.g., down-sampling or up-sampling). Then the fields are merged (e.g., by line interleaving) to create a frame. If an AVC encoder is being used, then the interlaced signal can be coded either in frame coding mode or field coding mode. The codec RPU should also be instructed whether to process an inter-view reference picture from BL as a frame or fields. In AVC coding, there is no indication of the scan type of a coded sequence in the mandated bitstream, since it is out of the scope of decoding. There might be some information presented in the Supplemental Enhancement Information (SEI) message, but a SEI message is not required for decoding. In one embodiment, a high level syntax is proposed to indicate if the RPU should apply frame or field processing. In one embodiment, if it is an interlaced signal, no matter how a picture is coded, the RPU may always process the picture as separate fields. In another embodiment, the RPU may follow how the BL signal is coded. Hence, if the BL signal is coded as fields, the RPU applies field processing, otherwise it applies frame processing.
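The field-based variant of vertical filtering can be sketched as below. The helper names are hypothetical, the picture is assumed to have an even number of lines, and simple line dropping stands in for whatever vertical filter the RPU actually applies.

```python
def split_fields(frame):
    """Top field = even lines, bottom field = odd lines."""
    return frame[0::2], frame[1::2]


def merge_fields(top, bottom):
    """Re-interleave two fields line by line into one frame."""
    frame = []
    for top_line, bottom_line in zip(top, bottom):
        frame.append(top_line)
        frame.append(bottom_line)
    return frame


def filter_vertically(frame, vertical_filter, field_processing):
    """Apply vertical_filter to the frame, or to each field separately."""
    if not field_processing:
        return vertical_filter(frame)
    top, bottom = split_fields(frame)
    return merge_fields(vertical_filter(top), vertical_filter(bottom))


def drop_lines(picture):
    """Stand-in vertical 2x down-sampler (no filtering): keep every other line."""
    return picture[0::2]


frame = [[n] for n in range(8)]
as_frame = filter_vertically(frame, drop_lines, field_processing=False)
as_fields = filter_vertically(frame, drop_lines, field_processing=True)
```

With field processing the output keeps lines from both fields, whereas frame processing of the same interlaced picture would discard one field entirely in this toy example.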
Embodiments of this invention comprise a variety of filters, which can be categorized as: multiplexing (or muxing) filters, RPU filters, and de-multiplexing (or de-muxing) filters. When designing muxing filters, the goal is to maintain as much information as possible from the original signal, but without causing aliasing. For down-sampling, a muxing filter may be designed to have a very flat passband response and strong attenuation at the midpoint of the spectrum, where the signal is folded during down-sampling, to avoid aliasing. In an embodiment, (in Matlab notation) an example of such a filter has coefficients:
[30, -4, -61, -21, 83, 71, -102, -178, 116, 638, 904,
638, 116, -178, -102, 71, 83, -21, -61, -4, 30] ./ 2^11
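The stated design goals can be checked numerically. The sketch below assumes the divisor is 2^11 = 2048, which is the value for which the listed taps sum to unity gain; the function name `gain` is introduced only for this illustration.

```python
import cmath
import math

# Muxing filter taps as listed above (21 taps, symmetric).
MUX_TAPS = [30, -4, -61, -21, 83, 71, -102, -178, 116, 638, 904,
            638, 116, -178, -102, 71, 83, -21, -61, -4, 30]
SCALE = 2 ** 11  # assumed normalization: sum(MUX_TAPS) == 2048


def gain(omega):
    """Magnitude response at normalized frequency omega (radians/sample)."""
    response = sum(tap * cmath.exp(-1j * omega * n)
                   for n, tap in enumerate(MUX_TAPS))
    return abs(response) / SCALE


dc_gain = gain(0.0)                 # passband reference
midpoint_gain = gain(math.pi / 2)   # folding frequency for 2x down-sampling
```

Evaluating the response at pi/2 shows the attenuation at the point where the spectrum folds during 2x down-sampling, which is the property the text asks of a muxing filter.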
When designing the RPU, the down-sampling filter and up-sampling filters should have a very low cutoff frequency, as the high frequencies in carrier image are not used for reconstruction and having such low passed signal would help to increase coding efficiency for the EL signal. The RPU down-sampling and up-sampling filters should also be of as low order as possible since these exact filters are used in decoders for real time decoding. Examples of such filters are depicted in Table 7.
During decoding, one may apply as a de-muxing filter the same filter as the up-sampling filter used in the codec RPU. For the high pass method of reconstruction (e.g., FIG. 13B), the high-pass filter (1330) should be complementary to the combined frequency responses of the muxing down-sampling filter and de-muxing up-sampling filter. Typically the order of such a filter will be high, which may not be suitable for certain real-time decoder applications. High-pass filters with similar pass band characteristics, but lower stop band attenuation, can also be derived with a much lower filter order, making them better suited for real-time decoder applications. Examples of such filters are depicted in Table 8.
Some implementations may have a very low bit rate requirement for the EL stream. In one embodiment, to improve coding quality at low bitrates, one may remove all chroma information from the EL stream. In one example, one may set chroma values in the EL signal to be a constant value, for example, 128. Correspondingly, the color components of an inter-view reference picture processed by the Codec RPU need to be set in the same way. In another embodiment, one may select and transmit only those regions of the input signal with the most high frequencies in the EL signal and gray out the remaining areas, for example, by setting them to a constant value (e.g., 128). The location and size of such regions may be signaled from an encoder to the decoder using metadata, e.g., through the RPU data stream.
In FIG. 12A, the residue signal 808-V is added to the carrier signal 1222 directly to generate EL 1237. In another embodiment, a linear or non-linear quantization method may be applied to the residue signal before adding it to the carrier.
Example embodiments discussed so far address the problem of restoring missing horizontal or vertical frequencies. In an embodiment, additional enhancement layers may be employed to restore additional frequencies, such as diagonal. For example, the input signal (102-L and 102-R) may be down-sampled across a diagonal direction. That information (or another residual signal based on that diagonal information) may be transmitted to a decoder as a second enhancement layer (EL2). A decoder could merge BL, EL, and EL2 signals to generate an FCFR signal. In another embodiment, instead of using a separate EL2 stream to code diagonal information, one may code luma diagonal information in the chroma channel in the EL signal. In such an implementation, luma will be coded in full resolution, but chroma will be coded in half resolution.
Syntax Examples
A coding standard, such as H.264, typically defines only the syntax of the coded bitstream and the decoding process. This section presents examples of a proposed syntax for a new FCFR profile in H.264 or other compression standard that supports the methods of this invention.
The first part is called the RPU header, rpu_header(), and includes the static information that most likely is not going to change during the transmission of the signal.
The second part is called the RPU data payload, rpu_data_payload(), and includes the dynamic information which might be updated more frequently. The RPU data payload signals to the decoder the filters that will be used to update the inter-view reference pictures prior to their use for prediction.
The syntax can be sent at the slice level, the picture level, the GOP level, the scene level, or at the sequence level. It can be included at the NAL unit header, the Sequence Parameter Set (SPS) and its extension, the SubSPS, the Picture Parameter Set (PPS), the slice header, the SEI message, or a new NAL unit, and the like. In an example embodiment, the RPU syntax is only updated at the sequence level. For backwards compatibility, as shown in Table 1, a new NAL unit for Coded slice extension for MFC, slice_layer_extension_rbsp( ), is also defined. In our example, a new profile, denoted in Table 2 as profile 134, is assigned for an embodiment of a FCFR 3D system using orthogonal multiplexing (OM).
Additional examples of the proposed syntax are depicted in Table 2, Table 3 and Table 4, where proposed additions to the existing H.264/AVC specification are depicted in Courier font. In this example, the proposed RPU syntax is invoked at the sequence level and it is added in the sequence parameter set MVC extension.
RPU Data Header Semantics
rpu_type specifies the prediction type purpose for the RPU signal. If not present, then its value is assumed to be 0.
rpu_format specifies the prediction process format, given the rpu_type, that will be used when processing the video data for prediction and/or final reconstruction. If not present, then its value is assumed to be 0. Table 5 depicts examples of rpu_type and rpu_format values.
default_grid_position signals whether view0 and view1 grid position information should be explicitly signaled. If default_grid_position is set to 1, or not present, then default values are obtained as follows:
if (rpu_type == 0 && rpu_format == 0) {
    view0_grid_position_x = 4;
    view0_grid_position_y = 8;
    view1_grid_position_x = 12;
    view1_grid_position_y = 8;
}
if (rpu_type == 0 && rpu_format == 1) {
    view0_grid_position_x =
    view0_grid_position_y =
    view1_grid_position_x =
    view1_grid_position_y = 12;
}
view0_grid_position_x is the same as frame0_grid_position_x as defined in the Frame packing arrangement SEI message semantics section of the H.264 specification.
view0_grid_position_y is the same as frame0_grid_position_y as defined in the Frame packing arrangement SEI message semantics section of the H.264 specification.
view1_grid_position_x is the same as frame1_grid_position_x as defined in the Frame packing arrangement SEI message semantics section of the H.264 specification.
view1_grid_position_y is the same as frame1_grid_position_y as defined in the Frame packing arrangement SEI message semantics section of the H.264 specification.
interlace_processing_flag signals whether reference processing will be applied on a frame or a field basis. If it is set to zero, processing will take place in the frame domain. If this flag is set to 1 then processing shall be performed separately for each field.
disable_part_symmetry_flag (when present) signals whether filter selection for spatially collocated partitions belonging to different views is constrained or unconstrained. When this flag is not set, both collocated partitions in either view are processed with the same RPU filter to derive the enhancement layer prediction. Hence half as many filters are signaled. When this flag is set, a filter is signaled for each partition in the processed picture. If not present, then all partitions use the same filtering method (NULL). This flag is constrained to be equal to 1 if the rpu_format is set to SBS and PicWidthInMbs is equal to 1 or if rpu_format is set to OU or TAB and PicHeightInMapUnits is equal to 1. If not present, the value of this flag will be set to 1.
num_x_partitions_minus1 signals the number of partitions that are used to subdivide the processed picture in the horizontal dimension during filtering. It can take any non-negative integer value. If not present, then the value of num_x_partitions_minus1 is set equal to 0. The value of num_x_partitions_minus1 is between 0 and Clip3(0, 15, (PicWidthInMbs >> 1) - 1), where PicWidthInMbs is specified in the H.264 specification.
num_y_partitions_minus1 signals the number of partitions that are used to subdivide the processed picture in the vertical dimension during filtering. It can take any non-negative integer value. If not present, then the value of num_y_partitions_minus1 is set equal to 0. The value of num_y_partitions_minus1 is between 0 and Clip3(0, 7, (PicHeightInMapUnits >> 1) - 1), where PicHeightInMapUnits is specified in the H.264 specification.
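The upper bounds on the partition counts can be worked through with a short sketch. It assumes that the shift symbol in the bounds denotes a right shift by one bit and that Clip3(x, y, z) clamps z to the range [x, y], as in the H.264 specification. The Python function names are introduced for illustration only.

```python
def clip3(low, high, value):
    """H.264-style Clip3: clamp value to the inclusive range [low, high]."""
    if value < low:
        return low
    if value > high:
        return high
    return value


def max_num_x_partitions_minus1(pic_width_in_mbs):
    """Largest allowed num_x_partitions_minus1 for a given picture width."""
    return clip3(0, 15, (pic_width_in_mbs >> 1) - 1)


def max_num_y_partitions_minus1(pic_height_in_map_units):
    """Largest allowed num_y_partitions_minus1 for a given picture height."""
    return clip3(0, 7, (pic_height_in_map_units >> 1) - 1)


# A 1920x1088 coded frame has 120 x 68 macroblocks of 16x16 pixels.
x_bound = max_num_x_partitions_minus1(120)
y_bound = max_num_y_partitions_minus1(68)
```

For large pictures the bounds saturate at 15 and 7, i.e., at most 16 horizontal and 8 vertical partitions. For very small pictures the macroblock count becomes the limiting term.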
RPU Data Payload Semantics
separate_component_filtering_flag signals whether separate filters are transmitted for each color space component or a single filter is used for all components. If this flag is set to zero, one then sets the following for each filter tap coefficient: filter_idx[ y ][ x ][ 1 ] = filter_idx[ y ][ x ][ 2 ] = filter_idx[ y ][ x ][ 0 ].
filter_idx_down[ y ][ x ][ cmp ] contains an index that corresponds to the down-sampling processing filter that is to be used for the partition with vertical coordinate y and horizontal coordinate x, corresponding to color component cmp. This index may take any non-negative value, each corresponding to a unique processing filter or scheme.
filter_idx_up[ y ][ x ][ cmp ] contains an index that corresponds to the up-sampling processing filter that is to be used for the partition with vertical coordinate y and horizontal coordinate x, corresponding to color component cmp. This index may take any non-negative value, each corresponding to a unique processing filter or scheme.
An example of the system providing the filter_idx methods is shown in Table 6. Examples of filters are shown in Table 7. The "F2" filter for down-sampling and up-sampling has no filter coefficients. It simply sets the carrier signal to a constant value (e.g., 128).
FIG. 14 depicts an example process flow for generating the EL signal (1237). The right-half of FIG. 14 depicts the case where the base layer is coded in side-by-side (SbS) format, hence the EL layer is coded in top-and-bottom (TaB) format. The process flow in this half matches the process flow as depicted in FIG. 12A. The left-half of FIG. 14 depicts the case when the base layer is coded in top-and-bottom (TaB) format (1405) and the enhancement layer is coded in SbS format (1415). TaB multiplexing step 1405 may follow the processing depicted in steps 104-V and 106-V in FIG. 1, while SbS multiplexing 1415 may follow the processing depicted in steps 104-H and 106-H in FIG. 1.
Given the syntax described earlier, FIG. 15 depicts an embodiment of an example process in the RPU during the decoding of an FCFR stream to generate carrier signal 1302. A similar process may also be applied to generate the carrier signal 1222 in the encoder, e.g., using Carrier RPU 1220. As depicted in FIG. 15, the process operates on all partitions and all color components of an input sequence. The data flow may be the same regardless of whether the BL signal is multiplexed in SbS format or in TaB format; however, the filtering orientations depend on the format of the base layer. The process also assumes that down-sampling precedes up-sampling; however, in another embodiment up-sampling may precede down-sampling.
Using the RPU data stream (e.g., 1240-3), and a filter identification look-up table (e.g., Table 6), in step 1510 the RPU identifies the down-sampling filter to be used to down-sample the decoded BL signal. If it is an F0 or F1 filter (1515-1), then it proceeds to perform down-sampling (1520-1). If it is an F2 filter, then it simply creates a carrier signal with all pixel values set to a constant (e.g., 128) (1520-2). If the BL layer is coded in the SbS format, then down-sampling (1520) is performed in the vertical direction. If the BL layer is coded in the TaB format, then down-sampling (1520) is performed in the horizontal direction. After down-sampling (1520), the two original halves (or views) are de-multiplexed and then multiplexed again in an orthogonal orientation, e.g., from SbS to TaB or from TaB to SbS, to form an intermediate result that matches the multiplexing format of the residual signal. This intermediate signal is then up-sampled so that the final carrier signal matches the resolution of the residual signal. If the up-sampling filter is F0 or F1 (1525-1), then the intermediate result is up-sampled to generate the final carrier signal 1237. If it is an F2 filter, then it creates a carrier signal with all pixel values set to a fixed value (e.g., 128) (1530-2). If the BL layer is coded in the SbS format, then up-sampling (1530) is performed in the horizontal direction. If the BL layer is coded in the TaB format, then up-sampling (1530) is performed in the vertical direction. If the decoder does not recognize any of the filters, then the process terminates and error messages may be generated (1540).
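The data flow described above, for the case of an SbS base layer, can be sketched as follows. This is a simplified illustration under stated assumptions, not the process of FIG. 15 itself: the function names are hypothetical, line averaging and pixel repetition stand in for the F0/F1 filters (which are treated identically here), and a constant carrier of the final size is returned as soon as either filter is F2.

```python
F2_CONSTANT = 128
KNOWN_FILTERS = {"F0", "F1", "F2"}


def downsample_vertical(img):
    """Stand-in vertical 2x down-sampler: rounded average of line pairs."""
    return [[(a + b + 1) >> 1 for a, b in zip(row0, row1)]
            for row0, row1 in zip(img[0::2], img[1::2])]


def upsample_horizontal(img):
    """Stand-in horizontal 2x up-sampler: repeat every pixel once."""
    return [[pixel for pixel in row for _ in range(2)] for row in img]


def sbs_to_tab(img):
    """De-multiplex the left/right halves and stack them top-and-bottom."""
    half = len(img[0]) // 2
    left = [row[:half] for row in img]
    right = [row[half:] for row in img]
    return left + right


def make_carrier_from_sbs(bl, down_filter, up_filter):
    """Build a TaB carrier with the same size as the SbS base layer picture."""
    if down_filter not in KNOWN_FILTERS or up_filter not in KNOWN_FILTERS:
        raise ValueError("unrecognized RPU filter")
    if down_filter == "F2" or up_filter == "F2":
        return [[F2_CONSTANT] * len(bl[0]) for _ in range(len(bl))]
    intermediate = sbs_to_tab(downsample_vertical(bl))
    return upsample_horizontal(intermediate)


bl = [[10, 20, 30, 40],
      [12, 22, 32, 42],
      [50, 60, 70, 80],
      [54, 64, 74, 84]]
carrier = make_carrier_from_sbs(bl, "F0", "F0")
flat_carrier = make_carrier_from_sbs(bl, "F2", "F2")
```

For a TaB base layer the same flow would apply with the orientations swapped: horizontal down-sampling, TaB-to-SbS re-multiplexing, then vertical up-sampling.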
Table 1: NAL Unit type, syntax element categories, and NAL unit type classes
Table 2: Sequence parameter set MVC extension Syntax
num_x_partitions_minus1 0 ue(v)
num_y_partitions_minus1 0 ue(v)
}
Table 4: RPU Data Payload Syntax
Table 5: RPU format
Table 6: Down-sampling and Up-sampling filter
Table 7: RPU implicit filter definition
OM_RPU_UP_F0 [3 -17 78 78 -17 3] 7 64 6
OM_RPU_UP_F1 [-11 75 75 -11] 7 64 4
Table 8: OM_FCFR Reconstruction filters
9. IMPLEMENTATION MECHANISMS - HARDWARE OVERVIEW
According to one embodiment, the techniques described herein are implemented by one or more special-purpose computing devices. The special-purpose computing devices may be hard-wired to perform the techniques, or may include digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the techniques, or may include one or more general purpose hardware processors programmed to perform the techniques pursuant to program instructions in firmware, memory, other storage, or a combination. Such special-purpose computing devices may also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the techniques. The special-purpose computing devices may be desktop computer systems, portable computer systems, handheld devices, networking devices or any other device that incorporates hard-wired and/or program logic to implement the techniques.
For example, FIG. 11 is a block diagram that illustrates a computer system 1100 upon which an example embodiment of the invention may be implemented. Computer system 1100 includes a bus 1102 or other communication mechanism for communicating information, and a hardware processor 1104 coupled with bus 1102 for processing information. Hardware processor 1104 may be, for example, a general purpose microprocessor.
Computer system 1100 also includes a main memory 1106, such as a random access memory (RAM) or other dynamic storage device, coupled to bus 1102 for storing information and instructions to be executed by processor 1104. Main memory 1106 also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor 1104. Such instructions, when stored in non-transitory storage media accessible to processor 1104, render computer system 1100 into a special-purpose machine that is customized to perform the operations specified in the instructions. Computer system 1100 further includes a read only memory (ROM) 1108 or other static storage device coupled to bus 1102 for storing static information and instructions for processor 1104. A storage device 1110, such as a magnetic disk or optical disk, is provided and coupled to bus 1102 for storing information and instructions.
Computer system 1100 may be coupled via bus 1102 to a display 1112, such as a liquid crystal display, for displaying information to a computer user. An input device 1114, including alphanumeric and other keys, is coupled to bus 1102 for communicating information and command selections to processor 1104. Another type of user input device is cursor control 1116, such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor 1104 and for controlling cursor movement on display 1112. This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.
Computer system 1100 may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system 1100 to be a special-purpose machine. According to one embodiment, the techniques herein are performed by computer system 1100 in response to processor 1104 executing one or more sequences of one or more instructions contained in main memory 1106. Such instructions may be read into main memory 1106 from another storage medium, such as storage device 1110. Execution of the sequences of instructions contained in main memory 1106 causes processor 1104 to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.
The term "storage media" as used herein refers to any non-transitory media that store data and/or instructions that cause a machine to operate in a specific fashion. Such storage media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 1110. Volatile media includes dynamic memory, such as main memory 1106. Common forms of storage media include, for example, a floppy disk, a flexible disk, hard disk, solid state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, and EPROM, a FLASH-EPROM, NVRAM, any other memory chip or cartridge.
Storage media is distinct from but may be used in conjunction with transmission media. Transmission media participates in transferring information between storage media. For example, transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus 1102. Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.
Various forms of media may be involved in carrying one or more sequences of one or more instructions to processor 1104 for execution. For example, the instructions may initially be carried on a magnetic disk or solid state drive of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 1100 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus 1102. Bus 1102 carries the data to main memory 1106, from which processor 1104 retrieves and executes the instructions. The instructions received by main memory 1106 may optionally be stored on storage device 1110 either before or after execution by processor 1104.
Computer system 1100 also includes a communication interface 1118 coupled to bus 1102. Communication interface 1118 provides a two-way data communication coupling to a network link 1120 that is connected to a local network 1122. For example, communication interface 1118 may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface 1118 may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface 1118 sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
Network link 1120 typically provides data communication through one or more networks to other data devices. For example, network link 1120 may provide a connection through local network 1122 to a host computer 1124 or to data equipment operated by an Internet Service Provider (ISP) 1126. ISP 1126 in turn provides data communication services through the world wide packet data communication network now commonly referred to as the "Internet" 1128. Local network 1122 and Internet 1128 both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link 1120 and through communication interface 1118, which carry the digital data to and from computer system 1100, are example forms of transmission media.
Computer system 1100 can send messages and receive data, including program code, through the network(s), network link 1120 and communication interface 1118. In the Internet example, a server 1130 might transmit a requested code for an application program through Internet 1128, ISP 1126, local network 1122 and communication interface 1118.
The received code may be executed by processor 1104 as it is received, and/or stored in storage device 1110, or other non-volatile storage for later execution.
10. EQUIVALENTS, EXTENSIONS, ALTERNATIVES AND MISCELLANEOUS
In the foregoing specification, example embodiments of the invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what is the invention, and is intended by the applicants to be the invention, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any definitions expressly set forth herein for terms contained in such claims shall govern the meaning of such terms as used in the claims. Hence, no limitation, element, property, feature, advantage or attribute that is not expressly recited in a claim should limit the scope of such claim in any way. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Claims

1. A method, comprising:
receiving an input 3D image, the input 3D image comprising a left eye (LE) input image frame and a right eye (RE) input image frame;
generating, based on the LE input image frame and the RE input image frame, a first multiplexed image frame comprising first high spatial frequency content in a vertical direction and first reduced resolution content in a horizontal direction;
generating, based on the LE input image frame and the RE input image frame, a second multiplexed image frame comprising second high spatial frequency content in the horizontal direction and second reduced resolution content in the vertical direction; and
encoding and outputting the first multiplexed image frame and the second multiplexed image frame to represent the input 3D image.
2. The method as recited in Claim 1, wherein the 3D input image is a first 3D input image in a sequence of 3D input images comprising a second different 3D input image having a second LE input image frame and a second RE input image frame; and the method further comprising:
generating, based on the second LE input image frame and the second RE input image frame, a third multiplexed image frame comprising third high spatial frequency content in the vertical direction and third reduced resolution content in the horizontal direction;
generating, based on the second LE input image frame and the second RE input image frame, a fourth multiplexed image frame comprising fourth high spatial frequency content in the horizontal direction and fourth reduced resolution content in the vertical direction; and
encoding and outputting the third multiplexed image frame and the fourth multiplexed image frame to represent the second input 3D image.
3. The method as recited in Claim 1, wherein the first multiplexed image frame comprises a first LE image data portion and a first RE image data portion; wherein the first LE image data portion and the first RE image data portion are of a same spatial resolution along both horizontal and vertical directions; wherein the second multiplexed image frame comprises a second LE image data portion and a second RE image data portion; and wherein the second LE image data portion and the second RE image data portion are of a same spatial resolution along both horizontal and vertical directions.
4. The method as recited in Claim 3, wherein each of the first LE image data portion and the first RE image data portion represents a subsampled version of a full resolution image frame; wherein the first multiplexed image frame adopts a side-by-side (SbS) format to carry the first LE image data portion and the first RE image data portion; wherein each of the second LE image data portion and the second RE image data portion represents a subsampled version of a full resolution image frame; and wherein the second multiplexed image frame adopts a top-and-bottom (TaB) format to carry the second LE image data portion and the second RE image data portion.
5. The method as recited in Claim 1, wherein the first multiplexed image frame adopts a first multiplexing format that preserves the high spatial frequency content in the vertical direction, and wherein the second multiplexed image frame adopts a second multiplexing format that preserves the high spatial frequency content in the horizontal direction.
6. The method as recited in Claim 1, wherein one of the first multiplexed image frame or the second multiplexed image frame is outputted in a base layer bitstream in a plurality of bit streams, while the other of the first multiplexed image frame or the second multiplexed image frame is outputted in an enhancement layer bitstream in the plurality of bit streams.
7. The method as recited in Claim 1, further comprising:
generating, based at least in part on the first multiplexed image frame, prediction reference image data; and
encoding an enhancement layer video signal based on differences between the prediction reference image data and the input 3D image.
8. The method as recited in Claim 1, further comprising:
applying one or more first operations comprising at least one of (a) spatial frequency filtering operations or (b) spatial subsampling operations in the second direction to the first input image frame and the second input image frame in generating the first multiplexed image frame, wherein the one or more first operations remove high spatial frequency content in the second direction and preserve high spatial frequency content in the first direction; and
applying one or more second operations comprising at least one of (a) spatial frequency filtering operations or (b) spatial subsampling operations in the first direction to the first input image frame and the second input image frame in generating the second multiplexed image frame, wherein the one or more second operations remove high spatial frequency content in the first direction and preserve high spatial frequency content in the second direction.
9. The method as recited in Claim 1, wherein one of first multiplexed image frame or the second multiplexed image frame comprises residual image data, wherein the residual image data is generated by subtracting reference image data generated based on the other of the first multiplexed image frame or the second multiplexed image frame from input image data derived from the LE input image frame and the RE input image frame.
10. The method as recited in Claim 1, further comprising converting one or more 3D input images represented, received, transmitted, or stored with one or more input video signals into one or more 3D output images represented, received, transmitted, or stored with one or more output video signals.
11. The method as recited in Claim 1, wherein the input 3D image comprises image data encoded in one of a high dynamic range (HDR) image format, a RGB color space associated with the Academy Color Encoding Specification (ACES) standard of the Academy of Motion Picture Arts and Sciences (AMPAS), a P3 color space standard of the Digital Cinema Initiative, a Reference Input Medium Metric/Reference Output Medium Metric (RIMM/ROMM) standard, an sRGB color space, or a RGB color space associated with the BT.709 Recommendation standard of the International Telecommunications Union (ITU).
12. A method, comprising:
receiving a 3D image represented by a first multiplexed image frame and second multiplexed image frame, the first multiplexed image frame comprising first high spatial frequency content in a vertical direction and first reduced resolution content in a horizontal direction, and the second multiplexed image frame comprising second high spatial frequency content in the horizontal direction and second reduced resolution content in the vertical direction;
generating, based on the first multiplexed image frame and the second multiplexed image frame, a left eye (LE) image frame and a right eye (RE) image frame, the LE image frame comprising LE high spatial frequency content in both horizontal and vertical directions, and the RE image frame comprising RE high spatial frequency content in both horizontal and vertical directions; and
rendering the 3D image by rendering the LE image frame and the RE image frame.
13. The method as recited in Claim 12, wherein the 3D image is a first 3D image in a sequence of 3D images comprising a second different 3D image having a third multiplexed image frame and a fourth multiplexed image frame, the third multiplexed image frame comprising third high spatial frequency content in the vertical direction and third reduced resolution content in the horizontal direction, and the fourth multiplexed image frame comprising fourth high spatial frequency content in the horizontal direction and fourth reduced resolution content in the vertical direction; and the method further comprising:
generating a second LE image frame and a second RE image frame, the second LE image frame comprising high spatial frequency content in both horizontal and vertical directions, and the second RE image frame comprising high spatial frequency content in both horizontal and vertical directions; and
rendering the second 3D image by rendering the second LE image frame and the second RE image frame.
14. The method as recited in Claim 12, wherein at least one of the first multiplexed image frame or the second multiplexed image frame comprises an LE image data portion and an RE image data portion; and wherein the LE image data portion and the RE image data portion are of a same spatial resolution.
15. The method as recited in Claim 14, wherein each of the LE image data portion and the RE image data portion represents a subsampled version of a full resolution image frame; and wherein the LE image data portion and the RE image data portion form a single image frame in one of a side-by-side format or a top-and-bottom format.
16. The method as recited in Claim 12, wherein one of the first multiplexed image frame or the second multiplexed image frame is decoded from a base layer bitstream in a plurality of bit streams, while the other of the first multiplexed image frame or the second multiplexed image frame is decoded from an enhancement layer bitstream in the plurality of bit streams.
17. The method as recited in Claim 12, further comprising:
generating, based at least in part on one of the first multiplexed image frame or the second multiplexed image frame, prediction reference image data; and
generating, based on enhancement layer (EL) data decoded from an EL video signal and the prediction reference image data, one of the LE image frame or the RE image frame.
18. The method as recited in Claim 12, further comprising:
applying one or more first operations comprising at least one of (a) spatial frequency filtering operations or (b) demultiplexing operations in generating the LE image frame, wherein the one or more first operations combine LE high spatial frequency content, as derived from the first multiplexed image frame and the second multiplexed image frame, of both horizontal and vertical directions into the LE image frame; and
applying one or more second operations comprising at least one of (a) spatial frequency filtering operations or (b) demultiplexing operations in generating the RE image frame, wherein the one or more second operations combine RE high spatial frequency content, as derived from the first multiplexed image frame and the second multiplexed image frame, of both horizontal and vertical directions into the RE image frame.
19. The method as recited in Claim 18, wherein the one or more first operations and the one or more second operations comprise at least a high pass filtering operation.
20. The method as recited in Claim 18, wherein the one or more first operations and the one or more second operations comprise a processing sub-path that replaces at least one high pass filtering operation; and wherein the processing sub-path comprises at least one subtraction operation and no high pass filtering operation.
21. The method as recited in Claim 12, wherein one of the first multiplexed image frame or the second multiplexed image frame comprises residual image data; and the method further comprising:
decoding and processing enhancement layer image data without generating prediction reference data from the other of the first multiplexed image frame or the second multiplexed image frame.
22. The method as recited in Claim 12, further comprising processing one or more 3D images represented, received, transmitted, or stored with one or more input video signals.
23. The method as recited in Claim 12, wherein the 3D image comprises image data encoded in one of a high dynamic range (HDR) image format, a RGB color space associated with the Academy Color Encoding Specification (ACES) standard of the Academy of Motion Picture Arts and Sciences (AMPAS), a P3 color space standard of the Digital Cinema Initiative, a Reference Input Medium Metric/Reference Output Medium Metric (RIMM/ROMM) standard, an sRGB color space, or a RGB color space associated with the BT.709 Recommendation standard of the International Telecommunications Union (ITU).
24. An encoder performing any of the methods as recited in Claims 1-11.
25. A decoder performing any of the methods as recited in Claims 12-23.
26. A system performing any of the methods as recited in Claims 1-23.
27. A system, comprising:
an encoder configured to:
receive an input 3D image, the input 3D image comprising a left eye (LE) input image frame and a right eye (RE) input image frame;
generate, based on the LE input image frame and the RE input image frame, a first multiplexed image frame comprising first high spatial frequency content in a vertical direction and first reduced resolution content in a horizontal direction;
generate, based on the LE input image frame and the RE input image frame, a second multiplexed image frame comprising second high spatial frequency content in the horizontal direction and second reduced resolution content in the vertical direction; and
encode and output the first multiplexed image frame and the second multiplexed image frame to represent the input 3D image;
a decoder configured to:
receive the first multiplexed image frame and the second multiplexed image frame in a plurality of video bitstreams;
generate, based on the first multiplexed image frame and the second multiplexed image frame, a left eye (LE) image frame and a right eye (RE) image frame, the LE image frame comprising LE high spatial frequency content in both horizontal and vertical directions, and the RE image frame comprising RE high spatial frequency content in both horizontal and vertical directions; and
render the LE image frame and the RE image frame.
28. A method for encoding 3D frame compatible full resolution (FCFR) images, the method comprising:
receiving an input 3D image, the input 3D image comprising a left eye (LE) input image frame and a right eye (RE) input image frame;
generating, based on the LE input image frame and the RE input image frame, a first multiplexed image frame comprising first high spatial frequency content in a first direction and first reduced resolution content in a second direction, the second direction being orthogonal to the first direction;
generating, based on the first multiplexed image frame, reference image data and carrier image data;
subtracting the reference image data from the input 3D image data to generate residual image data;
generating, based on the residual image data and the carrier image data, a second multiplexed image frame comprising second high spatial frequency content in the second direction and second reduced spatial frequency content in the first direction; and
encoding and outputting the first multiplexed image frame and the second multiplexed image frame to represent the input 3D image.
29. The method as recited in Claim 28, wherein generating the carrier image data comprises: applying a first spatial filtering in a first direction of a base layer image frame to down-sample the base layer image frame and generate an intermediate image, the base layer image based at least in part on the first multiplexed image frame; and
applying a second spatial filtering in a second direction to the intermediate image to up-sample the intermediate image to generate the carrier image data, wherein the second direction is orthogonal to the first direction.
30. The method of claim 28, wherein all the carrier image data comprise pixel values of the same fixed value.
31. A method for decoding 3D signals coded in an FCFR format, the method comprising:
receiving a 3D image represented by a first multiplexed image frame and second multiplexed image frame, the first multiplexed image frame comprising first high spatial frequency content in a first spatial direction and first reduced resolution content in a second spatial direction, and the second multiplexed image frame comprising second high spatial frequency content in the second spatial direction and second reduced resolution content in the first spatial direction, wherein the second spatial direction is orthogonal to the first spatial direction;
generating based on the first multiplexed image a frame-compatible pair of image frames (FC-L, FC-R) comprising high spatial frequency content in the first spatial direction and reduced spatial frequency content in the second spatial direction; and
generating based on the first multiplexed image and the second multiplexed image frame a full-resolution pair of frames (FR-LE, FR-RE) comprising high spatial frequency content in both the first spatial direction and the second spatial direction.
32. The method of claim 31, wherein the second multiplexed image frame comprises a residual image frame combined with a carrier image frame.
33. The method of claim 32, further comprising:
generating the carrier image frame based on the first multiplexed image frame; and subtracting the carrier image frame from the second multiplexed image frame to generate the residual image frame.
34. The method of claim 32, further comprising applying a high-pass filter in the second spatial direction of the second multiplexed image frame to generate the residual image frame.
35. A non-transitory computer readable medium, storing software instructions, which when executed by one or more processors cause performance of the steps of any of the methods as recited in Claims 1-23 or 28-34.
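For illustration only, the encoding steps of Claim 1 and the decoding steps of Claim 12 can be sketched as below. This is a minimal sketch under stated assumptions, not the filters or codec of any embodiment: the claims leave the filtering unspecified, so the sketch substitutes 2-tap averaging for low-pass filtering and subsampling, substitutes sample replication for up-sampling, and omits encoding into base layer and enhancement layer bitstreams. The names Frame, down_h, down_v, up_h, up_v, encode and decode are hypothetical, and frame dimensions are assumed to be even.

```python
from typing import List, Tuple

Frame = List[List[float]]  # hypothetical type: rows of pixel values


def down_h(f: Frame) -> Frame:
    """Halve the width by averaging column pairs (assumed low-pass + subsample)."""
    return [[(r[x] + r[x + 1]) / 2 for x in range(0, len(r), 2)] for r in f]


def down_v(f: Frame) -> Frame:
    """Halve the height by averaging row pairs (assumed low-pass + subsample)."""
    return [[(a + b) / 2 for a, b in zip(f[y], f[y + 1])]
            for y in range(0, len(f), 2)]


def up_h(f: Frame) -> Frame:
    """Double the width by repeating each sample."""
    return [[v for v in r for _ in (0, 1)] for r in f]


def up_v(f: Frame) -> Frame:
    """Double the height by repeating each row."""
    return [list(r) for r in f for _ in (0, 1)]


def encode(le: Frame, re_: Frame) -> Tuple[Frame, Frame]:
    """Build a side-by-side frame and a top-and-bottom frame from one 3D image."""
    # Side-by-side: horizontal resolution reduced, vertical detail kept.
    sbs = [l + r for l, r in zip(down_h(le), down_h(re_))]
    # Top-and-bottom: vertical resolution reduced, horizontal detail kept.
    tab = down_v(le) + down_v(re_)
    return sbs, tab


def decode(sbs: Frame, tab: Frame) -> Tuple[Frame, Frame]:
    """Recombine the two multiplexed frames into full-size LE and RE frames."""
    w = len(sbs[0]) // 2
    h = len(tab) // 2
    views = []
    for a, b in (([r[:w] for r in sbs], tab[:h]),
                 ([r[w:] for r in sbs], tab[h:])):
        # Low-frequency content is present in both frames; subtract one copy
        # so it is not counted twice (a subtraction path, no high-pass filter).
        low = up_v(up_h(down_v(a)))
        views.append([[p + q - s for p, q, s in zip(ra, rb, rl)]
                      for ra, rb, rl in zip(up_h(a), up_v(b), low)])
    return views[0], views[1]
```

Calling encode(le, re_) on two frames of equal, even dimensions returns two multiplexed frames of the same dimensions as each input, and decode(sbs, tab) returns the LE and RE frames. With these simple averaging filters, detail that varies only horizontally or only vertically is recovered, while detail that varies diagonally within a 2x2 block is lost; practical filters would trade this off differently.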
EP12769865.2A 2011-09-29 2012-09-27 Frame-compatible full resolution stereoscopic 3d video delivery with symmetric picture resolution and quality Active EP2761874B1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US201161541005P 2011-09-29 2011-09-29
US201261583081P 2012-01-04 2012-01-04
PCT/US2012/057610 WO2013049383A1 (en) 2011-09-29 2012-09-27 Frame-compatible full resolution stereoscopic 3d video delivery with symmetric picture resolution and quality

Publications (2)

Publication Number Publication Date
EP2761874A1 EP2761874A1 (en) 2014-08-06
EP2761874B1 EP2761874B1 (en) 2020-12-09

Family

ID=47003282

Family Applications (1)

Application Number Title Priority Date Filing Date
EP12769865.2A Active EP2761874B1 (en) 2011-09-29 2012-09-27 Frame-compatible full resolution stereoscopic 3d video delivery with symmetric picture resolution and quality

Country Status (8)

Country Link
US (1) US10097820B2 (en)
EP (1) EP2761874B1 (en)
JP (1) JP5926387B2 (en)
CN (1) CN103828358B (en)
AR (1) AR088081A1 (en)
HK (1) HK1194572A1 (en)
TW (1) TWI595770B (en)
WO (1) WO2013049383A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013103490A1 (en) 2012-01-04 2013-07-11 Dolby Laboratories Licensing Corporation Dual-layer backwards-compatible progressive video delivery
US9357197B2 (en) 2012-05-24 2016-05-31 Dolby Laboratories Licensing Corporation Multi-layer backwards-compatible video delivery for enhanced dynamic range and enhanced resolution formats
EP2860927A4 (en) * 2012-06-11 2015-09-23 Huawei Tech Co Ltd Equalization method and equalizer for receiving signal in microwave mimo system
KR102301083B1 (en) * 2013-04-15 2021-09-10 루카 로사토 Hybrid backward-compatible signal encoding and decoding
EP3242487B1 (en) * 2014-12-29 2021-10-13 Sony Group Corporation Transmitting device, transmitting method, receiving device and receiving method
KR20170075349A (en) * 2015-12-23 2017-07-03 한국전자통신연구원 Transmitter and receiver for multi-image having multi-view and method for multiplexing multi-image
CN108769681B (en) 2018-06-20 2022-06-10 腾讯科技(深圳)有限公司 Video encoding method, video decoding method, video encoding apparatus, video decoding apparatus, computer device, and storage medium

Family Cites Families (52)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5416510A (en) 1991-08-28 1995-05-16 Stereographics Corporation Camera controller for stereoscopic video system
US5404167A (en) * 1993-03-12 1995-04-04 At&T Corp. Subband color video coding using a reduced motion information subband
US6229927B1 (en) * 1994-09-21 2001-05-08 Ricoh Company, Ltd. Reversible embedded wavelet system implementation
US6549666B1 (en) * 1994-09-21 2003-04-15 Ricoh Company, Ltd Reversible embedded wavelet system implementation
US6141446A (en) * 1994-09-21 2000-10-31 Ricoh Company, Ltd. Compression and decompression system with reversible wavelets and lossy reconstruction
DE69525127T2 (en) * 1994-10-28 2002-10-02 Oki Electric Industry Co., Ltd. Device and method for encoding and decoding images using edge synthesis and wavelet inverse transformation
JP3534465B2 (en) * 1994-12-28 2004-06-07 パイオニア株式会社 Subband coding method
US5612735A (en) 1995-05-26 1997-03-18 Lucent Technologies Inc. Digital 3D/stereoscopic video compression technique utilizing two disparity estimates
US6674911B1 (en) * 1995-09-14 2004-01-06 William A. Pearlman N-dimensional data compression using set partitioning in hierarchical trees
DE69720559T2 (en) * 1996-11-06 2004-02-12 Matsushita Electric Industrial Co., Ltd., Kadoma Image encoding method with variable length codes
US6057884A (en) 1997-06-05 2000-05-02 General Instrument Corporation Temporal and spatial scaleable coding for video object planes
US6959120B1 (en) * 2000-10-27 2005-10-25 Microsoft Corporation Rebinning methods and arrangements for use in compressing image-based rendering (IBR) data
KR100454194B1 (en) 2001-12-28 2004-10-26 한국전자통신연구원 Stereoscopic Video Encoder and Decoder Supporting Multi-Display Mode and Method Thereof
JP4104895B2 (en) * 2002-04-25 2008-06-18 シャープ株式会社 Stereo image encoding device and stereo image decoding device
WO2005006766A1 (en) * 2003-07-09 2005-01-20 Nec Corporation Moving picture encoding method, moving picture decoding method, moving picture encoding device, moving picture decoding device, and computer program
AU2003295998A1 (en) * 2003-12-01 2005-08-12 Ess Technology, Inc. Optimized structure for digital separation of composite video signals
KR100834748B1 (en) * 2004-01-19 2008-06-05 삼성전자주식회사 Apparatus and method for playing of scalable video coding
US8374238B2 (en) 2004-07-13 2013-02-12 Microsoft Corporation Spatial scalability in 3D sub-band decoding of SDMCTF-encoded video
US7515759B2 (en) 2004-07-14 2009-04-07 Sharp Laboratories Of America, Inc. 3D video coding using sub-sequences
US7262767B2 (en) * 2004-09-21 2007-08-28 Victor Company Of Japan, Limited Pseudo 3D image creation device, pseudo 3D image creation method, and pseudo 3D image display system
JP4489605B2 (en) * 2005-01-19 2010-06-23 株式会社メガチップス Compression encoding apparatus, compression encoding method and program
FR2889017A1 (en) 2005-07-19 2007-01-26 France Telecom METHODS OF FILTERING, TRANSMITTING AND RECEIVING SCALABLE VIDEO STREAMS, SIGNAL, PROGRAMS, SERVER, INTERMEDIATE NODE AND CORRESPONDING TERMINAL
US8503806B2 (en) * 2005-09-06 2013-08-06 Megachips Corporation Compression encoder, compression encoding method and program
WO2007047736A2 (en) 2005-10-19 2007-04-26 Thomson Licensing Multi-view video coding using scalable video coding
WO2007085950A2 (en) 2006-01-27 2007-08-02 Imax Corporation Methods and systems for digitally re-mastering of 2d and 3d motion pictures for exhibition with enhanced visual quality
US8767836B2 (en) 2006-03-27 2014-07-01 Nokia Corporation Picture delimiter in scalable video coding
JP4424522B2 (en) * 2006-07-13 2010-03-03 日本電気株式会社 Encoding and decoding apparatus, encoding method and decoding method
MY162861A (en) 2007-09-24 2017-07-31 Koninl Philips Electronics Nv Method and system for encoding a video data signal, encoded video data signal, method and system for decoding a video data signal
CN101420609B (en) 2007-10-24 2010-08-25 华为终端有限公司 Video encoding, decoding method and video encoder, decoder
KR101580516B1 (en) 2008-04-07 2015-12-28 엘지전자 주식회사 method of receiving a broadcasting signal and apparatus for receiving a broadcasting signal
EP2301256A2 (en) 2008-07-21 2011-03-30 Thomson Licensing Multistandard coding device for 3d video signals
CN102257818B (en) 2008-10-17 2014-10-29 诺基亚公司 Sharing of motion vector in 3d video coding
US20100177162A1 (en) 2009-01-15 2010-07-15 Charles Macfarlane Method and system for enabling 3d video and image processing using one full resolution video stream and one lower resolution video stream
KR101597987B1 (en) * 2009-03-03 2016-03-08 삼성전자주식회사 Layer-independent encoding and decoding apparatus and method for multi-layer residual video
EP2409495A4 (en) 2009-03-16 2013-02-06 Lg Electronics Inc A method of displaying three-dimensional image data and an apparatus of processing three-dimensional image data
EP2420068A4 (en) * 2009-04-13 2012-08-08 Reald Inc Encoding, decoding, and distributing enhanced resolution stereoscopic video
JP5416271B2 (en) 2009-04-20 2014-02-12 ドルビー ラボラトリーズ ライセンシング コーポレイション Adaptive interpolation filter for multi-layer video delivery
KR101676310B1 (en) 2009-04-27 2016-11-16 엘지전자 주식회사 Broadcast receiver and 3D video data processing method thereof
US8639046B2 (en) * 2009-05-04 2014-01-28 Mamigo Inc Method and system for scalable multi-user interactive visualization
US9774882B2 (en) * 2009-07-04 2017-09-26 Dolby Laboratories Licensing Corporation Encoding and decoding architectures for format compatible 3D video delivery
US8676041B2 (en) 2009-07-04 2014-03-18 Dolby Laboratories Licensing Corporation Support of full resolution graphics, menus, and subtitles in frame compatible 3D delivery
KR20110007928A (en) 2009-07-17 2011-01-25 삼성전자주식회사 Method and apparatus for encoding/decoding multi-view picture
US8665968B2 (en) * 2009-09-30 2014-03-04 Broadcom Corporation Method and system for 3D video coding using SVC spatial scalability
US8537200B2 (en) * 2009-10-23 2013-09-17 Qualcomm Incorporated Depth map generation techniques for conversion of 2D video data to 3D video data
US9014276B2 (en) 2009-12-04 2015-04-21 Broadcom Corporation Method and system for 3D video coding using SVC temporal and spatial scalabilities
US8520020B2 (en) * 2009-12-14 2013-08-27 Canon Kabushiki Kaisha Stereoscopic color management
EP2529551B1 (en) 2010-01-27 2022-03-16 Dolby Laboratories Licensing Corporation Methods and systems for reference processing in image and video codecs
EP2532162B1 (en) 2010-02-01 2017-08-30 Dolby Laboratories Licensing Corporation Filtering for image and video enhancement using asymmetric samples
US8619852B2 (en) 2010-07-21 2013-12-31 Dolby Laboratories Licensing Corporation Systems and methods for multi-layered frame-compatible video delivery
US20120236115A1 (en) 2011-03-14 2012-09-20 Qualcomm Incorporated Post-filtering in full resolution frame-compatible stereoscopic video coding
US9473788B2 (en) 2011-09-16 2016-10-18 Dolby Laboratories Licensing Corporation Frame-compatible full resolution stereoscopic 3D compression and decompression
US8923403B2 (en) 2011-09-29 2014-12-30 Dolby Laboratories Licensing Corporation Dual-layer frame-compatible full-resolution stereoscopic 3D video delivery

Also Published As

Publication number Publication date
US20140286397A1 (en) 2014-09-25
WO2013049383A1 (en) 2013-04-04
US10097820B2 (en) 2018-10-09
TW201330588A (en) 2013-07-16
HK1194572A1 (en) 2014-10-17
CN103828358B (en) 2016-05-18
JP5926387B2 (en) 2016-05-25
EP2761874B1 (en) 2020-12-09
CN103828358A (en) 2014-05-28
JP2014529271A (en) 2014-10-30
AR088081A1 (en) 2014-05-07
TWI595770B (en) 2017-08-11

Similar Documents

Publication Publication Date Title
EP2761877B1 (en) Dual-layer frame-compatible full-resolution stereoscopic 3d video delivery
US9473788B2 (en) Frame-compatible full resolution stereoscopic 3D compression and decompression
ES2885250T3 (en) Systems and methods for multi-layered frame-compatible video delivery
EP2591609B1 (en) Method and apparatus for multi-layered image and video coding using reference processing signals
Chen et al. Overview of the MVC+D 3D video coding standard
KR101675780B1 (en) Frame compatible depth map delivery formats for stereoscopic and auto-stereoscopic displays
EP2752000B1 (en) Multiview and bitdepth scalable video delivery
US8773505B2 (en) Broadcast receiver and 3D video data processing method thereof
EP2859724B1 (en) Method and apparatus of adaptive intra prediction for inter-layer coding
US10097820B2 (en) Frame-compatible full-resolution stereoscopic 3D video delivery with symmetric picture resolution and quality
US20130222539A1 (en) Scalable frame compatible multiview encoding and decoding methods
Lu et al. Orthogonal Muxing Frame Compatible Full Resolution technology for multi-resolution frame-compatible stereo coding
Lee et al. Interlaced MVD format for free viewpoint video
KR20130063603A (en) Methods of coding additional frame and apparatuses for using the same

Legal Events

Date Code Title Description
PUAI Public reference made under Article 153(3) EPC to a published international application that has entered the European phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20140429

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

DAX Request for extension of the European patent (deleted)
17Q First examination report despatched

Effective date: 20160920

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: DOLBY LABORATORIES LICENSING CORPORATION

STAA Information on the status of an EP patent application or granted EP patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

REG Reference to a national code

Ref country code: DE

Ref legal event code: R079

Ref document number: 602012073642

Country of ref document: DE

Free format text: PREVIOUS MAIN CLASS: H04N0013000000

Ipc: H04N0019597000

RIC1 Information provided on IPC code assigned before grant

Ipc: H04N 19/597 20140101AFI20200626BHEP

Ipc: H04N 13/106 20180101ALI20200626BHEP

Ipc: H04N 13/194 20180101ALI20200626BHEP

Ipc: H04N 19/119 20140101ALI20200626BHEP

Ipc: H04N 13/161 20180101ALI20200626BHEP

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

STAA Information on the status of an EP patent application or granted EP patent

Free format text: STATUS: GRANT OF PATENT IS INTENDED

INTG Intention to grant announced

Effective date: 20200807

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

STAA Information on the status of an EP patent application or granted EP patent

Free format text: STATUS: THE PATENT HAS BEEN GRANTED

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: AT

Ref legal event code: REF

Ref document number: 1344554

Country of ref document: AT

Kind code of ref document: T

Effective date: 20201215

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: DE

Ref legal event code: R096

Ref document number: 602012073642

Country of ref document: DE

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210310

Ref country code: RS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

Ref country code: NO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210309

REG Reference to a national code

Ref country code: AT

Ref legal event code: MK05

Ref document number: 1344554

Country of ref document: AT

Kind code of ref document: T

Effective date: 20201209

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

Ref country code: LV

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210309

REG Reference to a national code

Ref country code: NL

Ref legal event code: MP

Effective date: 20201209

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: HR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

REG Reference to a national code

Ref country code: LT

Ref legal event code: MG9D

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: SM

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210409

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

Ref country code: LT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: PL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

REG Reference to a national code

Ref country code: DE

Ref legal event code: R097

Ref document number: 602012073642

Country of ref document: DE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210409

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an EP patent application or granted EP patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: IT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

Ref country code: AL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

26N No opposition filed

Effective date: 20210910

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

REG Reference to a national code

Ref country code: BE

Ref legal event code: MM

Effective date: 20210930

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20210409

Ref country code: MC

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20210927

Ref country code: IE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20210927

Ref country code: BE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20210930

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20210930

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20210930

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: HU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; INVALID AB INITIO

Effective date: 20120927

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

P01 Opt-out of the competence of the Unified Patent Court (UPC) registered

Effective date: 20230512

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: MK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: MT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20201209

PGFP Annual fee paid to national office [announced via postgrant information from national office to EPO]

Ref country code: DE

Payment date: 20240820

Year of fee payment: 13

PGFP Annual fee paid to national office [announced via postgrant information from national office to EPO]

Ref country code: GB

Payment date: 20240822

Year of fee payment: 13

PGFP Annual fee paid to national office [announced via postgrant information from national office to EPO]

Ref country code: FR

Payment date: 20240820

Year of fee payment: 13