US20220279195A1 - Adaptive perceptual mapping and signaling for video coding - Google Patents
- Publication number
- US20220279195A1 (application US 17/747,337)
- Authority
- US
- United States
- Prior art keywords
- values
- video data
- transfer function
- decoder
- encoder
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- All classifications fall under H—ELECTRICITY > H04—ELECTRIC COMMUNICATION TECHNIQUE > H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION > H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals:
- H04N19/124—Quantisation
- H04N19/126—Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
- H04N19/132—Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
- H04N19/136—Incoming video signal characteristics or properties
- H04N19/154—Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
- H04N19/172—Adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object, the region being a picture, frame or field
- H04N19/176—Adaptive coding characterised by the coding unit, the unit being an image region, e.g. an object, the region being a block, e.g. a macroblock
- H04N19/184—Adaptive coding characterised by the coding unit, the unit being bits, e.g. of the compressed video stream
- H04N19/186—Adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
- H04N19/40—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream
- H04N19/44—Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
- H04N19/98—Adaptive-dynamic-range coding [ADRC]
Definitions
- the present disclosure relates to the field of video encoding and decoding, particularly a method of adaptively transforming linear input values into non-linear values that can be quantized, based on content characteristics of an input video.
- High Dynamic Range (HDR) video and Wide Color Gamut (WCG) video offer greater ranges of luminance and color values than traditional video.
- traditional video can have a limited luminance and color range, such that details in shadows or highlights can be lost when images are captured, encoded, and/or displayed.
- HDR and/or WCG video can capture a broader range of luminance and color information, allowing the video to appear more natural and closer to real life to the human eye.
- HDR and WCG video information normally must be converted into other formats before it can be encoded using a video compression algorithm.
- HDR video formats such as the EXR file format describe colors in the RGB color space with 16-bit values to cover a broad range of potential HDR values, while 8 or 10-bit values are often used to express the colors of non-HDR video. Since many video compression algorithms expect 8 or 10-bit values, 16-bit HDR color values can be quantized into 10-bit values that the compression algorithms can work with.
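As a minimal illustration of this kind of bit depth reduction (a hedged sketch, not taken from the patent), 16-bit floating point linear values can be uniformly mapped onto 1024 levels; the normalization to a peak of 1.0 is an assumption:

```python
import numpy as np

# Illustrative sketch: uniformly quantize 16-bit floating point linear
# light values into 10-bit integer codes. The peak value of 1.0 is an
# assumed normalization, not specified by the patent.
def quantize_to_10bit(linear_values, peak=1.0):
    normalized = np.clip(linear_values / peak, 0.0, 1.0)
    return np.round(normalized * 1023).astype(np.uint16)

pixels = np.array([0.001, 0.18, 0.5, 1.0], dtype=np.float16)
codes = quantize_to_10bit(pixels.astype(np.float32))  # -> [1, 184, 512, 1023]
```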
- Some encoders use a coding transfer function to convert linear values from the input video into non-linear values prior to uniform quantization.
- coding transfer functions are often gamma correction functions.
- the coding transfer function is generally fixed, such that it does not change dependent on the content of the input video.
- an encoder's coding transfer function can be defined to statically map every possible input value in an HDR range, such as from 0 to 10,000 nits, to specific non-linear values.
- fixed mapping can lead to poor allocation of quantization levels.
- a picture primarily showing a blue sky can have a lot of similar shades of blue, but those blues can occupy a small section of the overall range for which the coding transfer function is defined.
- similar blues can be quantized into the same value. This quantization can often be perceived by viewers as contouring or banding, in which quantized shades of blue extend in bands across the sky displayed on their screen instead of transitioning naturally between colors.
- What is needed is a method of adapting the coding transfer function, or otherwise converting and/or redistributing input values, based on the actual content of the input video. This can generate a curve of non-linear values that represents the color and/or intensity information actually present in the input video instead of across a full range of potential values. As such, when the non-linear values are uniformly quantized, the noise and/or distortion introduced by uniform quantization can be minimized such that it is unlikely to be perceived by a human viewer. Additionally, what is needed is a method of transmitting information about the perceptual mapping operations used by the encoder to decoders, such that the decoders can perform corresponding reverse perceptual mapping operations when decoding the video.
- the present disclosure provides a method of encoding a digital video, the method comprising receiving a digital video at a video encoder, providing a perceptual quantizer function at the video encoder, the perceptual quantizer function defined by

$$PQ(L) = \left( \frac{c_1 + c_2 L^{m_1}}{1 + c_3 L^{m_1}} \right)^{m_2}$$

wherein:
- L is a luminance value
- c1, c2, c3, and m1 are parameters with fixed values
- m2 is a parameter with a variable value
- adapting the perceptual quantizer function at the video encoder by adjusting the value of the m2 parameter based on different luminance value ranges found within a coding level of the digital video, encoding the digital video into a bitstream with the video encoder using, in part, the perceptual quantizer function, transmitting the bitstream to a decoder, and transmitting the value of the m2 parameter to the decoder for each luminance value range in the coding level.
- the present disclosure also provides a method of decoding a digital video, the method comprising receiving a bitstream at a video decoder, providing a perceptual quantizer function at the video decoder, the perceptual quantizer function defined by

$$PQ(L) = \left( \frac{c_1 + c_2 L^{m_1}}{1 + c_3 L^{m_1}} \right)^{m_2}$$

wherein:
- L is a luminance value
- c1, c2, c3, and m1 are parameters with fixed values
- m2 is a parameter with a variable value
- receiving a particular value for the m2 parameter for each luminance value range in a coding level, and decoding the digital video with the video decoder using, in part, the perceptual quantizer function with the received value of the m2 parameter.
- the present disclosure also provides a video encoder comprising a data transmission interface configured to receive a digital video comprising linear color values, and a processor configured to analyze the linear color values within a coding level to determine a range of color values present within the coding level, adapt a perceptual mapping operation based on the range of color values present within the coding level, perform the perceptual mapping operation to convert the linear color values into non-linear color values, and uniformly quantize the non-linear color values and encode them into a coded bitstream, wherein the data transmission interface is further configured to transmit the coded bitstream to a decoder, and transmit one or more parameters to the decoder from which the decoder can derive a reverse perceptual mapping operation that substantially reverses the perceptual mapping operation for the coding level.
- the present disclosure also provides a video decoder comprising a data transmission interface configured to receive a coded bitstream and one or more parameters associated with a coding level, and a processor configured to decode the coded bitstream into non-linear values, derive a reverse perceptual mapping operation for the coding level from the one or more parameters, and perform the reverse perceptual mapping operation for the coding level to convert the non-linear values to linear values to reconstruct a digital video.
- FIG. 1 depicts an embodiment of a video coding system comprising an encoder and a decoder.
- FIG. 2 depicts a first embodiment of a process for encoding an input video into a coded bitstream with an encoder using a perceptual mapping operation, and decoding that coded bitstream into a decoded video with a decoder using a reverse perceptual mapping operation.
- FIG. 3 depicts a second embodiment of a process for encoding an input video into a coded bitstream with an encoder using a perceptual mapping operation, and decoding that coded bitstream into a decoded video with a decoder using a reverse perceptual mapping operation.
- FIG. 4 depicts a third embodiment of a process for encoding an input video into a coded bitstream with an encoder using a perceptual mapping operation, and decoding that coded bitstream into a decoded video with a decoder using a reverse perceptual mapping operation.
- FIG. 5 depicts an exemplary plot of relative quantization step sizes from variants of Weber's Law on a log-log scale.
- FIG. 6 depicts an exemplary plot of relative quantization step sizes from variants of Stevens' Power Law on a log-log scale.
- FIG. 7 depicts an exemplary plot of a perceptual quantizer curve for different values of an m2 parameter in a perceptual quantizer function.
- FIG. 8 depicts an exemplary plot of relative quantization step sizes for different values of an m2 parameter in a perceptual quantizer function.
- FIG. 1 depicts an embodiment of a video coding system comprising an encoder 100 and a decoder 102 .
- An encoder 100 can comprise processors, memory, circuits, and/or other hardware and software elements configured to encode, transcode, and/or compress input video 104 into a coded bitstream 106 .
- the encoder 100 can be configured to generate the coded bitstream 106 according to a video coding format and/or compression scheme, such as HEVC (High Efficiency Video Coding), H.264/MPEG-4 AVC (Advanced Video Coding), or MPEG-2.
- the encoder 100 can be a Main 10 HEVC encoder.
- the encoder 100 can receive an input video 104 from a source, such as over a network or via local data storage from a broadcaster, content provider, or any other source.
- the encoder 100 can encode the input video 104 into the coded bitstream 106 .
- the coded bitstream 106 can be transmitted to decoders 102 over the internet, over a digital cable television connection such as Quadrature Amplitude Modulation (QAM), or over any other digital transmission mechanism.
- a decoder 102 can comprise processors, memory, circuits, and/or other hardware and software elements configured to decode, transcode, and/or decompress a coded bitstream 106 into decoded video 108 .
- the decoder 102 can be configured to decode the coded bitstream 106 according to a video coding format and/or compression scheme, such as HEVC, H.264/MPEG-4 AVC, or MPEG-2.
- the decoder 102 can be a Main 10 HEVC decoder.
- the decoded video 108 can be output to a display device for playback, such as playback on a television, monitor, or other display.
- the encoder 100 and/or decoder 102 can be dedicated hardware devices. In other embodiments, the encoder 100 and/or decoder 102 can be, or use, software programs running on other hardware such as servers, computers, or video processing devices.
- an encoder 100 can be a video encoder operated by a video service provider, while the decoder 102 can be part of a set top box connected to a television, such as a cable box.
- the input video 104 can comprise a sequence of pictures, also referred to as frames.
- colors in the pictures can be described digitally using one or more values according to a color space or color model.
- colors in a picture can be indicated using an RGB color model in which the colors are described through a combination of values in a red channel, a green channel, and a blue channel.
- many video coding formats and/or compression schemes use a Y′CbCr color space when encoding and decoding video.
- Y′CbCr color space Y′ is a luma component while Cb and Cr are chroma components that indicate blue-difference and red-difference components.
- the input video 104 can be an HDR input video 104 .
- An HDR input video 104 can have one or more sequences with luminance and/or color values described in a high dynamic range (HDR) and/or on a wide color gamut (WCG).
- a video with a high dynamic range can have luminance values indicated on a scale with a wider range of possible values than a non-HDR video
- a video using a wide color gamut can have its colors expressed on a color model with a wider range of possible values in at least some channels than a non-WCG video.
- an HDR input video 104 can have a broader range of luminance and/or chroma values than standard or non-HDR videos.
- the HDR input video 104 can have its colors indicated with RGB values in a high bit depth format, relative to non-HDR formats that express color values using lower bit depths such as 8 or 10 bits per color channel.
- an HDR input video 104 can be in an EXR file format with RGB color values expressed in a linear light RGB domain using a 16 bit floating point value for each color channel.
- the encoder 100 can perform at least one perceptual mapping operation 110 while encoding the input video 104 into the coded bitstream 106 .
- a perceptual mapping operation 110 can convert linear color values from the input video 104 into values on a non-linear curve, based on one or more characteristics of the video's content.
- the perceptual mapping operation 110 can be tailored to the content of the video based on its minimum brightness, average brightness, peak brightness, maximum contrast ratio, a cumulative distribution function, and/or any other factor. In some embodiments, such characteristics can be found through a histogram or statistical analysis of color components of the video's content.
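One way such statistics could be gathered (a hedged sketch; the bin count and dictionary layout are illustrative assumptions) is a single pass over a picture's luminance samples:

```python
import numpy as np

# Illustrative sketch: derive the characteristics mentioned above
# (minimum, average, and peak brightness plus a cumulative distribution
# function) from a histogram of a picture's luminance samples.
def content_statistics(luminance, bins=1024):
    hist, _ = np.histogram(luminance, bins=bins)
    return {
        "min_brightness": float(luminance.min()),
        "avg_brightness": float(luminance.mean()),
        "peak_brightness": float(luminance.max()),
        "cdf": np.cumsum(hist) / luminance.size,
    }
```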
- the perceptual mapping operation 110 can be configured to redistribute linear color information on a non-linear curve that is tailored to the content of the input video 104 .
- redistributing linear color values on a non-linear curve based on the content of the input video 104 can reduce the risk of distortion and/or noise being introduced through uniform quantization operations that may be perceptible to a human viewer.
- a greater amount of bits and/or quantization levels can be allocated to ranges of intensities that are present in each color component and/or that are most likely to be perceived by a human viewer, while fewer bits and/or quantization levels can be allocated to intensities that are not present in the color channels and/or are less likely to be perceived by viewers.
- when a scene in the input video 104 takes place at night, its pictures can primarily include dark colors that are substantially bunched together in the RGB domain. In such a scene, lighter colors in the RGB domain can be absent or rare.
- the perceptual mapping operation 110 can be adapted for the scene, such that the color values are redistributed on a non-linear curve that includes the range of colors actually present within the scene, while omitting or deemphasizing colors that are not present within the scene.
- formerly bunched-together dark RGB values can be spread out substantially evenly on a curve of non-linear values, while less common brighter RGB values can be compressed together or even omitted if they are absent in the scene.
- because the dark values can be spread out on the curve, fine differences between them can be distinguished even when the values on the non-linear curve are uniformly quantized into discrete values or codewords.
- the perceptual mapping operation 110 can be adaptive, such that it can change to generate different non-linear values depending on the content of the input video 104 .
- the perceptual mapping operation 110 can be changed on a sub-picture level for different sub-areas of the same picture, such as processing windows, slices, macroblocks in AVC, or coding tree units (CTUs) in HEVC.
- the perceptual mapping operation 110 can be changed on a picture level for different pictures.
- the perceptual mapping operation 110 can be changed on a supra-picture level for different sequences of pictures, such as different Groups of Pictures (GOPs).
- a perceptual mapping operation 110 can be applied in any desired color space, such as the RGB or Y′CbCr color spaces.
- if the input video 104 is received with non-linear color values, the encoder 100 can first convert those color values into linear space.
- the encoder 100 can then perform a perceptual mapping operation 110 to convert them back to a non-linear space, this time based on the actual content of the input video 104 .
- the encoder 100 can transmit one or more parameters 112 associated with the perceptual mapping operation 110 to the decoder 102 , such that the decoder 102 can derive a corresponding reverse perceptual mapping operation 114 from the parameters 112 .
- the decoder 102 can use the reverse perceptual mapping operation 114 to appropriately convert the perceptually mapped non-linear color values back into linear values when decoding the coded bitstream 106 into a decoded video 108 .
- the reverse perceptual mapping operation 114 may not necessarily be an exact inverse of the perceptual mapping operation 110 , but it can be configured to convert the perceptually mapped non-linear color values into linear values that approximate the original linear values such that noise and/or distortion introduced by uniform quantization of the perceptually mapped values is unlikely to be perceived by a human viewer.
- FIG. 2 depicts a first embodiment of a process for encoding an input video 104 into a coded bitstream 106 with an encoder 100 using a perceptual mapping operation 110 , and decoding that coded bitstream 106 into a decoded video 108 with a decoder 102 using a reverse perceptual mapping operation 114 .
- the perceptual mapping operation 110 is a coding transfer function 204 performed as one of the initial steps of the encoding process
- the reverse perceptual mapping operation 114 is an inverse coding transfer function 238 performed as one of the final steps of the decoding process.
- the encoder 100 can receive an input video 104 .
- the input video 104 can be an HDR and/or WCG video that has color information indicated with RGB values 202 in a high bit depth format, such as color values expressed with a 16-bit floating point value for each RGB color channel.
- the encoder 100 can perform a coding transfer function 204 on the RGB values 202 in each color channel to convert the HDR input video's RGB values 202 into non-linear R′G′B′ values 206 .
- the RGB values 202 can be expressed on a linear scale, while corresponding non-linear R′G′B′ values 206 can be expressed on a non-linear curve.
- the coding transfer function 204 can be a perceptual mapping operation 110 that is adaptive based on the content of the input video 104 on a sub-picture level, picture level, or supra-picture level.
- the non-linear R′G′B′ values 206 generated with the coding transfer function 204 can be expressed with the same number of bits as the RGB values 202 , such that the non-linear R′G′B′ values 206 have the same bit depth.
- the non-linear R′G′B′ values 206 can also be expressed with 16-bit values.
- the encoder 100 can perform color space conversion 208 to translate the non-linear R′G′B′ values 206 into Y′CbCr values 210 .
- the Y′ luma component can be calculated from a weighted average of the non-linear R′G′B′ values 206 .
- the Y′CbCr values 210 can be expressed with the same number of bits as the RGB values 202 and/or the non-linear R′G′B′ values 206 , such that the Y′CbCr values 210 have the same bit depth.
- the Y′CbCr values 210 can also be expressed with 16-bit values.
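A minimal sketch of that conversion is shown below; the BT.709 luma weights are one common choice and are an assumption here, since the patent does not fix the conversion matrix:

```python
# Hedged sketch of R'G'B' -> Y'CbCr conversion. KR, KG, KB are the
# BT.709 luma weights, used here only as an example.
KR, KG, KB = 0.2126, 0.7152, 0.0722

def rgb_to_ycbcr(rp, gp, bp):
    y = KR * rp + KG * gp + KB * bp      # Y' as a weighted average of R'G'B'
    cb = (bp - y) / (2.0 * (1.0 - KB))   # blue-difference chroma
    cr = (rp - y) / (2.0 * (1.0 - KR))   # red-difference chroma
    return y, cb, cr
```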
- the encoder 100 can receive RGB values 202 in a high bit depth format, perform an adaptive coding transfer function 204 based on the content of the input video 104 , and then convert the resulting non-linear values into the Y′CbCr color space.
- the input video 104 can be received in a format wherein its values are already in the Y′CbCr format, and the adaptive coding transfer function 204 can operate to convert or redistribute those values along a curve based on the content of the input video 104 , while the color space conversion 208 can be skipped if the resulting values are already in the desired format.
- the color space conversion 208 and coding transfer function 204 can be reversed, such that the encoder 100 converts received color values to another format before applying an adaptive coding transfer function 204 based on the content of the input video 104 .
- the encoder 100 can perform a uniform quantization operation 212 on the Y′CbCr values 210 to generate quantized Y′CbCr values 214 .
- the uniform quantization operation 212 can fit each of the Y′CbCr values 210 into one of a finite number of possible quantized Y′CbCr values 214 .
- each possible quantized Y′CbCr value 214 can be expressed with a codeword or other value in fewer bits than the bit depth of the Y′CbCr values 210 .
- the 16-bit Y′CbCr values 210 can be quantized into lower bit depth quantized Y′CbCr values 214 , such as 8 or 10-bit quantized Y′CbCr values 214 .
- the step size between each possible quantized Y′CbCr value 214 can be uniform.
- the step size selected by the encoder 100 can influence the amount of distortion and/or noise introduced by the uniform quantization operation 212 .
- distortion and/or noise can be introduced when differences in the color information between distinct high bit depth Y′CbCr values 210 are lost due to the uniform quantization operation 212 mapping them to the same quantized value.
- the previous coding transfer function 204 can have been configured to redistribute color information along a non-linear curve such that uniform quantization of that non-linear curve leads to distortion and/or noise that is unlikely to be perceived by a human viewer.
- the encoder 100 can perform a chroma subsampling operation 216 to convert the quantized Y′CbCr values 214 into chroma subsampled Y′CbCr values 218 .
- the chroma subsampling operation 216 can subsample the Cb and Cr chroma components at a lower resolution to decrease the amount of bits dedicated to the chroma components, without impacting a viewer's perception of the Y′ luma component.
- the chroma subsampling operation 216 can implement 4:2:0 subsampling.
- the quantized Y′CbCr values 214 can be expressed with a full 4:4:4 resolution, and the encoder 100 can subsample the Cb and Cr chroma components of the quantized Y′CbCr values 214 to express them as chroma subsampled Y′CbCr values 218 with 4:2:0 subsampling at half their horizontal and vertical resolution.
- the chroma subsampling operation 216 can implement 4:2:2 subsampling, 4:1:1 subsampling, or any other subsampling ratio.
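A sketch of the 4:2:0 case follows; averaging each 2x2 block is one of several reasonable filters and is an assumption here:

```python
import numpy as np

# Illustrative 4:2:0 chroma subsampling: average each 2x2 block of a
# chroma plane, halving its horizontal and vertical resolution.
def subsample_420(chroma_plane):
    h, w = chroma_plane.shape
    cropped = chroma_plane[:h - h % 2, :w - w % 2]  # ensure even dimensions
    blocks = cropped.reshape(h // 2, 2, w // 2, 2)
    return blocks.mean(axis=(1, 3))
```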
- the encoder 100 can perform an encoding operation 220 on the chroma subsampled Y′CbCr values 218 to generate the coded bitstream 106 .
- the pixels of each picture can be broken into sub-pictures, such as processing windows, slices, macroblocks, or CTUs.
- the encoder 100 can encode each individual picture and/or sub-picture using intra-prediction and/or inter-prediction. Coding with intra-prediction uses spatial prediction based on other similar sections of the same picture or sub-picture, while coding with inter-prediction uses temporal prediction to encode motion vectors that point to similar sections of another picture or sub-picture, such as a preceding or subsequent picture in the input video 104 . As such, coding of some pictures or sub-pictures can be at least partially dependent on other reference pictures in the same group of pictures (GOP).
- the coded bitstream 106 generated by the encoder 100 can be transmitted to one or more decoders 102 .
- the encoder 100 can also transmit one or more parameters 112 associated with the coding transfer function 204 on a sub-picture level, picture level, and/or supra-picture level, such that the decoder 102 can derive a corresponding inverse coding transfer function 238 from the parameters 112 for each sub-picture, picture, or sequence of pictures.
- Each decoder 102 can receive the coded bitstream 106 and perform a decoding operation 222 to generate reconstructed chroma subsampled Y′CbCr values 224 that approximate the chroma subsampled Y′CbCr values 218 output by the encoder's chroma subsampling operation 216 .
- the coded bitstream 106 can be decoded into reconstructed chroma subsampled Y′CbCr values 224 expressed with 4:2:0 subsampling.
- the decoder 102 can decode individual pictures and/or sub-pictures with intra-prediction and/or inter-prediction.
- the decoder 102 can perform a chroma upsampling operation 226 to express the chroma components of the reconstructed chroma subsampled Y′CbCr values 224 with more bits, as reconstructed quantized Y′CbCr values 228 .
- the chroma upsampling operation 226 can copy, sample, and/or average the subsampled chroma information to generate the reconstructed quantized Y′CbCr values 228 at a full 4:4:4 resolution, such that they approximate the quantized Y′CbCr values 214 output by the encoder's uniform quantization operation 212 .
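A minimal sketch of replication-based upsampling, one of the copy/sample/average options just mentioned:

```python
import numpy as np

# Illustrative 4:2:0 -> 4:4:4 chroma upsampling by sample replication:
# each subsampled chroma value is copied to a 2x2 block at full resolution.
def upsample_420(chroma_plane):
    return np.repeat(np.repeat(chroma_plane, 2, axis=0), 2, axis=1)
```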
- the decoder 102 can perform a reverse quantization operation 230 on the reconstructed quantized Y′CbCr values 228 , to generate reconstructed Y′CbCr values 232 .
- the reverse quantization operation 230 can convert the low-bit depth reconstructed quantized Y′CbCr values 228 to higher bit depth reconstructed Y′CbCr values 232 .
- the reverse quantization operation 230 can convert the values to be expressed in a 16 bit format used for reconstructed Y′CbCr values 232 .
- the reconstructed Y′CbCr values 232 can approximate the Y′CbCr values 210 output by the encoder's color space conversion 208 .
- the decoder 102 can perform reverse color space conversion 234 to translate the reconstructed Y′CbCr values 232 into reconstructed non-linear R′G′B′ values 236 .
- the reconstructed non-linear R′G′B′ values 236 can have the same bit depth as the reconstructed Y′CbCr values 232 , and can approximate the non-linear R′G′B′ values 206 output by the encoder's coding transfer function 204 .
- the decoder 102 can perform an inverse coding transfer function 238 on the reconstructed non-linear R′G′B′ values 236 in each color channel to convert the reconstructed non-linear R′G′B′ values 236 into reconstructed RGB values 240 .
- the decoder 102 can have received parameters 112 from which it can derive an inverse coding transfer function 238 that effectively reverses the conversion between RGB values and R′G′B′ values performed by the encoder's coding transfer function 204 .
- the parameters 112 can be sent by the encoder 100 on a sub-picture level, a picture level, and/or a supra-picture level to indicate changes in the coding transfer function 204 and inverse coding transfer function 238 between different sub-pictures in the same picture, between pictures, or between sequences of pictures, as the coding transfer function 204 can change depending on the content of the input video 104 .
- the decoder 102 can use the inverse coding transfer function 238 to convert the reconstructed non-linear R′G′B′ values 236 into linear reconstructed RGB values 240 that approximate the original RGB values 202 of the input video 104 .
- the reconstructed RGB values 240 can be used to display pictures to a viewer on a display screen.
- FIG. 3 depicts a second embodiment of a process for encoding an input video 104 into a coded bitstream 106 with an encoder 100 using a perceptual mapping operation 110 , and decoding that coded bitstream 106 into a decoded video 108 with a decoder 102 using a reverse perceptual mapping operation 114 .
- a perceptual mapping operation 110 is performed at the encoder 100 after a color space conversion 208 from R′G′B′ values 206 into Y′CbCr values 210 , prior to a uniform quantization operation 212 .
- a reverse perceptual mapping operation 114 is performed at the decoder 102 after a reverse uniform quantization operation 230 , before a reverse color space conversion 234 into reconstructed R′G′B′ values 236 .
- the encoder 100 can receive an input video 104, such as an HDR and/or WCG video that has color information indicated with RGB values 202 in a high bit depth format. After receiving the input video 104, the encoder 100 can perform a non-adaptive coding transfer function 302 to convert the RGB values 202 in each color channel into R′G′B′ values 206.
- the non-adaptive coding transfer function 302 can be a fixed function that operates the same way for all input values, independent of the content of the input video 104 .
- the non-adaptive coding transfer function 302 can be a pass-through function, such that the R′G′B′ values 206 are substantially identical to the RGB values 202 .
- the encoder 100 can perform color space conversion 208 to translate the R′G′B′ values 206 into Y′CbCr values 210 .
- the Y′CbCr values 210 can be expressed with the same number of bits as the RGB values 202 and/or the R′G′B′ values 206 , such that the Y′CbCr values 210 have the same bit depth.
- the encoder 100 can receive an input video with values already described in a Y′CbCr color space, such that color space conversion 208 can be skipped.
- the order of the coding transfer function 302 and color space conversion 208 can be reversed.
- the encoder 100 can perform a perceptual mapping operation 110 on the Y′CbCr values 210 to generate perceptually mapped Y′CbCr values 304 .
- a perceptual mapping operation 110 can be adaptive based on the content of the input video 104 on a sub-picture level, picture level, or supra-picture level.
- the perceptual mapping operation 110 can use a 3D lookup table that maps Y′CbCr values 210 to associated perceptually mapped Y′CbCr values 304 .
- the perceptual mapping operation 110 can use one or more formulas to convert each color component.
- the perceptual mapping operation 110 can convert values using formulas such as:
$$Y'_{PM} = f(Y', Cb, Cr) \qquad Cb_{PM} = g(Y', Cb, Cr) \qquad Cr_{PM} = h(Y', Cb, Cr)$$

- the functions can each take the three Y′CbCr values 210 as inputs and output one of the perceptually mapped Y′CbCr values 304: Y′_PM, Cb_PM, or Cr_PM.
- the 3D lookup table or conversion functions can be adaptive based on the content of the input video 104 .
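A hedged sketch of the lookup table approach appears below; the 17x17x17 grid size, nearest-neighbour lookup, and [0, 1] normalization of all three components are illustrative assumptions, as the patent does not fix them:

```python
import numpy as np

# Illustrative 3D lookup table for the perceptual mapping operation:
# quantize each (Y', Cb, Cr) triple to a grid index and return the
# perceptually mapped triple stored there.
GRID = 17
lut = np.zeros((GRID, GRID, GRID, 3))  # filled by the encoder's adaptation step

def map_ycbcr(y, cb, cr):
    i, j, k = (min(max(int(round(v * (GRID - 1))), 0), GRID - 1)
               for v in (y, cb, cr))
    return lut[i, j, k]  # (Y'_PM, Cb_PM, Cr_PM)
```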
- the encoder 100 can then perform a uniform quantization operation 212 on the perceptually mapped Y′CbCr values 304, such that each possible quantized Y′CbCr value 214 can be expressed with a codeword or other value in fewer bits than the bit depth of the perceptually mapped Y′CbCr values 304.
- the step size between each possible quantized Y′CbCr value 214 can be uniform.
- the perceptual mapping operation 110 can have been configured to redistribute color information along a non-linear curve such that uniform quantization of that non-linear curve leads to distortion and/or noise that is unlikely to be perceived by a human viewer.
- each decoder 102 can receive the coded bitstream 106 and perform a decoding operation 222 to generate reconstructed chroma subsampled Y′CbCr values 224 that approximate the chroma subsampled Y′CbCr values 218 output by the encoder's chroma subsampling operation 216 .
- the decoder 102 can perform a chroma upsampling operation 226 to express the chroma components of the reconstructed chroma subsampled Y′CbCr values 224 with more bits, as reconstructed quantized Y′CbCr values 228 that approximate the quantized Y′CbCr values 214 output by the encoder's uniform quantization operation 212 .
- the decoder 102 can perform a reverse quantization operation 230 on the reconstructed quantized Y′CbCr values 228 , to generate reconstructed perceptually mapped Y′CbCr values 306 .
- the reverse quantization operation 230 can convert the low-bit depth reconstructed quantized Y′CbCr values 228 to higher bit depth reconstructed perceptually mapped Y′CbCr values 306 .
- the reverse quantization operation 230 can convert the values to be expressed in a 16 bit format used for reconstructed perceptually mapped Y′CbCr values 306 .
- the reconstructed perceptually mapped Y′CbCr values 306 can approximate the perceptually mapped Y′CbCr values 304 output by the encoder's perceptual mapping operation 110 .
- the decoder 102 can perform a reverse perceptual mapping operation 114 on the reconstructed perceptually mapped Y′CbCr values 306 to generate reconstructed Y′CbCr values 232 .
- the decoder 102 can have received parameters 112 from which it can derive a reverse perceptual mapping operation 114 that effectively reverses the conversion between Y′CbCr values 210 and perceptually mapped Y′CbCr values 304 performed by the encoder's perceptual mapping operation 110 .
- the parameters 112 can be sent by the encoder 100 on a sub-picture level, a picture level, and/or a supra-picture level to indicate changes in the perceptual mapping operation 110 and reverse perceptual mapping operation 114 between different sub-pictures in the same picture, between pictures, or between sequences of pictures, as the perceptual mapping operation 110 can change depending on the content of the input video 104 .
- the decoder 102 can perform reverse color space conversion 234 to translate the reconstructed Y′CbCr values 232 into reconstructed non-linear R′G′B′ values 236 .
- the reconstructed non-linear R′G′B′ values 236 can have the same bit depth as the reconstructed Y′CbCr values 232 , and can approximate the R′G′B′ values 206 output by the encoder's coding transfer function 302 .
- the decoder 102 can perform an inverse non-adaptive coding transfer function 308 on the reconstructed R′G′B′ values 236 in each color channel to convert the reconstructed R′G′B′ values 236 into reconstructed RGB values 240 .
- the inverse non-adaptive coding transfer function 308 can be a fixed function that operates the same way for all input values.
- the inverse non-adaptive coding transfer function 308 can be a pass-through function, such that the reconstructed RGB values 240 are substantially identical to the reconstructed R′G′B′ values 236 .
- the reconstructed RGB values 240 can be used to display pictures to a viewer on a display screen.
- FIG. 4 depicts a third embodiment of a process for encoding an input video 104 into a coded bitstream 106 with an encoder 100 using a perceptual mapping operation 110, and decoding that coded bitstream 106 into a decoded video 108 with a decoder 102 using a reverse perceptual mapping operation 114.
- the embodiment of FIG. 4 can be substantially similar to the embodiment of FIG. 3, with the standalone perceptual mapping operation 110 and uniform quantization operation 212 replaced with a joint perceptual quantization operation 402 at the encoder 100.
- similarly, the standalone reverse uniform quantization operation 230 and reverse perceptual mapping operation 114 can be replaced with a reverse joint perceptual quantization operation 404 in the embodiment of FIG. 4.
- the Y′CbCr values 210 output by the color space conversion 208 can be converted into quantized Y′CbCr values 214 using a joint perceptual quantization operation 402 that can be adaptive based on the content of the input video 104 on a sub-picture level, picture level, or supra-picture level. While the uniform quantization operation 212 in the embodiments of FIGS. 2 and 3 can be performed independently on each color component, the joint perceptual quantization operation 402 can take all three color components into consideration.
- the joint perceptual quantization operation 402 can use a 3D lookup table that maps Y′CbCr values 210 to associated quantized Y′CbCr values 214 .
- the joint perceptual quantization operation 402 can use one or more formulas to quantize each color component.
- the joint perceptual quantization operation 402 can quantize values using formulas such as:

$$D_{Y'} = f(Y', Cb, Cr) \qquad D_{Cb} = g(Y', Cb, Cr) \qquad D_{Cr} = h(Y', Cb, Cr)$$

- the functions can each take the three Y′CbCr values 210 as inputs and output a quantized Y′CbCr value 214: D_Y′, D_Cb, or D_Cr.
- the 3D lookup table or quantization functions can be adaptive based on the content of the input video 104 . While in some embodiments the step size can be uniform between each possible quantized Y′CbCr value 214 , the joint perceptual quantization operation 402 can be configured to redistribute and quantize color information such that any distortion and/or noise it introduces is unlikely to be perceived by a human viewer.
- each possible quantized Y′CbCr value 214 that can be generated with the joint perceptual quantization operation 402 can be expressed with a codeword or other value in fewer bits than the bit depth of the Y′CbCr values 210 .
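The sketch below illustrates the joint idea, with each output codeword depending on more than one component; the particular weightings are purely hypothetical and are not taken from the patent:

```python
# Hedged sketch of a joint perceptual quantization: unlike independent
# uniform quantization, each codeword here depends on multiple colour
# components. The 0.25 and 0.5 weightings are invented examples.
def joint_quantize(y, cb, cr, levels=1024):
    luma_weight = 0.5 + 0.5 * y                       # hypothetical coupling
    chroma_weight = 1.0 + 0.25 * (abs(cb) + abs(cr))  # hypothetical coupling
    d_y = min(int(y * chroma_weight * (levels - 1)), levels - 1)
    d_cb = min(int((cb * luma_weight + 0.5) * (levels - 1)), levels - 1)
    d_cr = min(int((cr * luma_weight + 0.5) * (levels - 1)), levels - 1)
    return d_y, d_cb, d_cr  # chroma assumed in [-0.5, 0.5], luma in [0, 1]
```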
- the encoder 100 can perform a chroma subsampling operation 216 to convert the quantized Y′CbCr values 214 into chroma subsampled Y′CbCr values 218 , and then perform an encoding operation 220 on the chroma subsampled Y′CbCr values 218 to generate the coded bitstream 106 .
- the coded bitstream 106 generated by the encoder 100 can be transmitted to one or more decoders 102, along with one or more parameters 112 associated with the joint perceptual quantization operation 402 on a sub-picture level, picture level, and/or supra-picture level, such that the decoder 102 can derive a corresponding reverse joint perceptual quantization operation 404 from the parameters 112 for each sub-picture, picture, or sequence of pictures.
- each decoder 102 can receive the coded bitstream 106 and perform a decoding operation 222 to generate reconstructed chroma subsampled Y′CbCr values 224 that approximate the chroma subsampled Y′CbCr values 218 output by the encoder's chroma subsampling operation 216 .
- the decoder 102 can perform a chroma upsampling operation 226 to express the chroma components of the reconstructed chroma subsampled Y′CbCr values 224 with more bits, as reconstructed quantized Y′CbCr values 228 that approximate the quantized Y′CbCr values 214 output by the encoder's joint perceptual quantization operation 402 .
- the decoder 102 can perform a reverse joint perceptual quantization operation 404 on the reconstructed quantized Y′CbCr values 228 , to generate reconstructed Y′CbCr values 232 .
- the decoder 102 can have received parameters 112 from which it can derive a reverse joint perceptual quantization operation 404 that effectively reverses the conversion between Y′CbCr values 210 and quantized Y′CbCr values 214 performed by the encoder's joint perceptual quantization operation 402 .
- the parameters 112 can be sent by the encoder 100 on a sub-picture level, a picture level, and/or a supra-picture level to indicate changes in the joint perceptual quantization operation 402 and reverse joint perceptual quantization operation 404 between different sub-pictures in the same picture, between pictures, or between sequences of pictures, as the perceptual mapping operation 110 can change depending on the content of the input video 104 .
- the reverse joint perceptual quantization operation 404 can also convert the low-bit depth reconstructed quantized Y′CbCr values 228 to higher bit depth reconstructed Y′CbCr values 232 .
- the reverse joint perceptual quantization operation 404 can convert the values to be expressed in a 16 bit format used for reconstructed Y′CbCr values 232 .
- the reconstructed Y′CbCr values 232 can approximate the Y′CbCr values 210 output by the encoder's color space conversion 208 .
- the decoder 102 can perform reverse color space conversion 234 to translate the reconstructed Y′CbCr values 232 into reconstructed R′G′B′ values 236 , followed by an inverse non-adaptive coding transfer function 308 to convert the reconstructed R′G′B′ values 236 into reconstructed RGB values 240 .
- the reconstructed RGB values 240 can be used to display pictures to a viewer on a display screen.
- a perceptual mapping operation 110 can be performed in an initial coding transfer function 204 , as a standalone step at a later portion of the encoding process after color space conversion, or as part of a joint perceptual quantization operation 402 .
- a perceptual mapping operation 110 can be performed after a uniform quantization step, after a chroma subsampling step, or at any other step of the encoding process.
- the decoder 102 can perform its decoding process with corresponding steps in substantially the reverse order from the encoding process.
- the encoder 100 can send the decoder 102 information about a 3D lookup table it used with the perceptual mapping operation 110 , or send complete information about conversion functions it used within the perceptual mapping operation 110 , for each sub-picture level, picture level, or supra-picture level.
- the decoder 102 can determine an associated reverse perceptual mapping operation 114 to use during the decoding process.
- the encoder 100 can save bandwidth by transmitting parameters 112 associated with the perceptual mapping operation 110 it used at each sub-picture level, picture level, or supra-picture level.
- the decoder 102 can use the received parameters 112 to generate and use a corresponding reverse perceptual mapping operation 114 for each sub-picture level, picture level, or supra-picture level.
- the encoder's uniform quantization operation 212 can be denoted as Q(v), as it can operate on converted v values generated by the coding transfer function 204.
- the step size between quantization levels used in the uniform quantization operation 212 can be denoted as Δ_step.
- the effective quantization step size, Q(I), of a cascaded adaptive coding transfer function 204 and a uniform quantization operation 212 can be proportional to the slope of the inverse coding transfer function 238, as shown below, where f denotes the coding transfer function 204 and f^{-1} the inverse coding transfer function 238:

$$Q(I) = \left. \frac{d f^{-1}(v)}{dv} \right|_{v = f(I)} \cdot \Delta_{step}$$
- the effective quantization step size, Q(I), can thus depend on the slope of the inverse coding transfer function 238 and the step size Δ_step of the uniform quantization operation 212.
- as the slope of the inverse coding transfer function 238 decreases, the effective quantization step size Q(I) can decrease.
- when the step size Δ_step of the uniform quantization operation 212 is large enough that distortion and/or noise introduced by uniform quantization would otherwise be perceptible to human viewers, the effects of the relatively large step size Δ_step can be modulated by adapting the coding transfer function 204 to the content of the input video 104, such that the slope of the inverse coding transfer function 238 is smaller.
- decreasing the slope of the inverse coding transfer function 238 can counteract the effects of a relatively large step size Δ_step, and thus modulate the effective quantization step size Q(I) such that the overall distortion and/or noise is less likely to be perceived by a human viewer.
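The relationship can be checked numerically, as in the sketch below; the square-root transfer function is only a stand-in example:

```python
import numpy as np

# Numeric sketch of the effective step size Q(I): the slope of the
# inverse transfer function at v = f(I), times the quantizer step size.
def effective_step_size(transfer, inverse, intensity, delta_step, eps=1e-6):
    v = transfer(intensity)
    slope = (inverse(v + eps) - inverse(v - eps)) / (2 * eps)  # d f^-1(v) / dv
    return slope * delta_step

# Example with f(I) = sqrt(I), f^-1(v) = v^2: the slope at I = 0.25 is 1.0,
# so the effective step size equals delta_step there.
q = effective_step_size(np.sqrt, np.square, intensity=0.25, delta_step=1 / 1024)
```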
- the effective quantization step size Q(I) can be incorporated into a related metric, the relative quantization step size, ρ(I), wherein:

$$\rho(I) = \frac{Q(I)}{I}$$
- the coding transfer function 204, and thus the corresponding inverse coding transfer function 238, can be adapted based on the content of the input video 104 such that the relative quantization step size ρ(I) stays below a set threshold level.
- the threshold level can be defined by a function ρ_0(I) that gives an optimal slope for the inverse coding transfer function 238, resulting in encoding with distortion and noise that is perceptually transparent or perceptually lossless.
- the coding transfer function 204, and thus the corresponding inverse coding transfer function 238, can be adapted such that ρ(I) ≤ ρ_0(I).
- the coding transfer function 204 and inverse coding transfer function 238 can be based on the first variant of Weber's Law, such that:

$$v_N = \begin{cases} \dfrac{C \, I_N}{e \ln(C)}, & I_N \leq \dfrac{e}{C} \\[6pt] \dfrac{\ln(C \, I_N)}{\ln(C)}, & I_N > \dfrac{e}{C} \end{cases}$$
- I_N can be a normalized brightness of a portion of the input video 104, on a sub-picture level, picture level, or supra-picture level.
- the normalized brightness can be a brightness level divided by the maximum brightness, such that:

$$I_N = \frac{I}{I_{max}}$$
- C can be the maximum contrast in the portion of the input video 104 on a sub-picture level, picture level, or supra-picture level.
- the maximum contrast can be the maximum brightness divided by the minimum brightness, such that:

$$C = \frac{I_{max}}{I_{min}}$$
- v_N can be a value v generated by the coding transfer function 204, normalized by the dynamic range of the uniform quantization operation 212, denoted as D, such that:

$$v_N = \frac{v}{D}$$
- the relative quantization step size for the first variant of Weber's Law can therefore be given by:
$$\rho_{WL1}(I_N) = \begin{cases} \ln(C) \cdot \Delta_{step}, & I_N \geq \dfrac{e}{C} \\[6pt] \dfrac{e \ln(C)}{C \, I_N} \cdot \Delta_{step}, & I_N < \dfrac{e}{C} \end{cases}$$
- the coding transfer function 204 and inverse coding transfer function 238 can alternately be based on the second variant of Weber's Law, such that:

$$v_N = \frac{\ln(1 + C \, I_N)}{\ln(1 + C)}$$

- the relative quantization step size for the second variant of Weber's Law can then be given by:
$$\rho_{WL2}(I_N) = \ln(C + 1) \cdot \frac{C \, I_N + 1}{C \, I_N} \cdot \Delta_{step}$$
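Both step-size curves can be evaluated directly, as in the sketch below, which implements the formulas as reconstructed above:

```python
import numpy as np

# Evaluate the relative quantization step sizes for the two Weber's Law
# variants, given maximum contrast C and quantizer step size delta_step.
def rho_wl1(i_n, C, delta_step):
    flat = np.log(C) * delta_step                         # I_N >= e/C
    rising = (np.e * np.log(C)) / (C * i_n) * delta_step  # I_N < e/C
    return np.where(i_n >= np.e / C, flat, rising)

def rho_wl2(i_n, C, delta_step):
    return np.log(C + 1) * (C * i_n + 1) / (C * i_n) * delta_step
```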
- the relative quantization step sizes of the two examples above based on variants of Weber's Law can be plotted on a log-log scale, as shown in FIG. 5 .
- the slope of the relative quantization step size based on the first variant of Weber's Law can be linear on the log-log scale with a negative slope for small values of I_N, and then be flat (linear on the log-log scale with a slope of 0) for values of I_N that are larger than a particular point.
- the slope of the relative quantization step size based on the second variant of Weber's Law can be negative for small values of I_N, and then transition smoothly toward a flat slope for larger values of I_N.
- the two variants can thus be similar, with the second variant having a smoother transition between the I_N ranges that have different slopes.
- the coding transfer function 204 and inverse coding transfer function 238 can also be based on variants of Stevens' Power Law, such as its first and third variants.
- the relative quantization step sizes of the two examples above based on variants of Stevens' Power Law can be plotted on a log-log scale, as shown in FIG. 6 .
- the slope of the relative quantization step size can have or approach a slope of −1 for small values of I_N, and have or approach a slope of −γ for large values of I_N, with the two examples varying on the smoothness of the transition between the I_N ranges that have different slopes.
- as γ goes to 0, the first variant of Stevens' Power Law can converge with the first variant of Weber's Law, while the third variant of Stevens' Power Law can converge with the second variant of Weber's Law.
- the slope of the curve of relative quantization step sizes ρ(I) can differ for different brightness values.
- because the coding transfer function 204 is adaptive and can be changed based on perceptual and/or statistical properties of the input video 104 on a sub-picture level, a picture level, or a supra-picture level, the overall shape of the ρ(I) curve can change.
- the decoder 102 can derive the inverse coding transfer function 238 from the ρ(I) function by solving for f^{-1}(v) in the following differential equation:

$$\frac{d f^{-1}(v)}{dv} = \frac{\rho\!\left(f^{-1}(v)\right) \cdot f^{-1}(v)}{\Delta_{step}}$$
- the encoder 100 can send one or more parameters 112 that describe the shape of the ρ(I) curve to the decoder 102 at each sub-picture level, picture level, or supra-picture level, so that the decoder 102 can derive the appropriate inverse coding transfer function 238.
- the encoder 100 can save bandwidth by sending a relatively small number of parameters 112 that describe the ρ(I) curve at each sub-picture level, picture level, or supra-picture level, compared to sending the full inverse coding transfer function 238 or a full lookup table showing mappings between all possible converted values at every sub-picture level, picture level, or supra-picture level.
- the shape of the ρ(I) curve can be expressed through a piecewise log-linear function such as a variant of Weber's Law or Stevens' Power Law, as shown above.
- the encoder 100 can send two parameters 112 to the decoder 102 at each sub-picture level, picture level, or supra-picture level: a normalized brightness value I_N and a maximum contrast value C. From these two parameters 112, the decoder 102 can find ρ(I) using a predetermined piecewise log-linear function, and thus derive the appropriate inverse coding transfer function 238 to use when decoding values at that sub-picture level, picture level, or supra-picture level.
- the shape of the ρ(I) curve can alternately be expressed through a second order log-polynomial, i.e., a polynomial in a logarithmic domain.
- parameters 112 describing the second order log-polynomial can be sent from the encoder 100 to the decoder 102 for each sub-picture level, picture level, or supra-picture level, such that the decoder 102 can find ρ(I) from the parameters 112 and derive the appropriate inverse coding transfer function 238 for the coding level.
- a second order log-polynomial with three parameters a, b, and c can be given by:
- The encoder 100 can send values of the parameters a, b, and c to the decoder 102.
- The decoder 102 can use the received parameters in the predefined formula to find Λ(I) from the parameters 112, and from it derive a corresponding inverse coding transfer function 238.
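- A minimal sketch of that evaluation follows, assuming the three parameters describe log Λ(I) directly and that natural logarithms are used; the helper name and the choice of logarithm base are assumptions:

```python
import math

def relative_step_size_logpoly(i: float, a: float, b: float, c: float) -> float:
    """Evaluate log Lambda(I) = a*(log I)^2 + b*(log I) + c, returning Lambda(I)."""
    x = math.log(i)
    return math.exp(a * x * x + b * x + c)

# Decoder-style use with three received parameters:
lam = relative_step_size_logpoly(0.5, a=0.02, b=-0.3, c=-6.0)
```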
- The encoder 100 can also directly send one or more parameters 112 that describe a particular coding transfer function 204 or other perceptual mapping operation 110, and/or a particular inverse coding transfer function 238 or other reverse perceptual mapping operation 114.
- The coding transfer function 204 or other perceptual mapping operation 110 can be a perceptual quantizer (PQ) transfer function.
- The PQ transfer function can be a function that operates on luminance values, L, with the function defined as:
- PQ(L) = ((c1 + c2·L^m1) / (1 + c3·L^m1))^m2
- Parameters 112 that can be sent from the encoder 100 to the decoder 102 at each sub-picture level, picture level, or supra-picture level include one or more of: m1, m2, c1, c2, and c3.
- By way of a non-limiting example, the values of the parameters 112 can be as follows:
- m1 = 2610/16384 = 0.1593017578125
- m2 = 2523/4096 × 128 = 78.84375
- c1 = 3424/4096 = 0.8359375
- c2 = 2413/4096 × 32 = 18.8515625
- c3 = 2392/4096 × 32 = 18.6875
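- The sketch below evaluates the PQ function with the values above while letting m2 vary per coding level; the helper name and the normalization of L to [0, 1] are assumptions rather than part of the disclosure:

```python
def pq(l: float,
       m1: float = 2610.0 / 16384.0,        # 0.1593017578125
       m2: float = 2523.0 / 4096.0 * 128,   # 78.84375
       c1: float = 3424.0 / 4096.0,         # 0.8359375
       c2: float = 2413.0 / 4096.0 * 32,    # 18.8515625
       c3: float = 2392.0 / 4096.0 * 32) -> float:
    """PQ(L) = ((c1 + c2*L^m1) / (1 + c3*L^m1))^m2, for L normalized to [0, 1]."""
    lm = l ** m1
    return ((c1 + c2 * lm) / (1.0 + c3 * lm)) ** m2

# Per-coding-level adaptation by tuning only m2, as described below:
v_dark_biased   = pq(0.01, m2=62.0)    # smaller m2 raises the curve
v_bright_biased = pq(0.01, m2=160.0)   # larger m2 lowers the curve
```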
- The values of one or more of these parameters 112 can be predetermined, such that they are known to both the encoder 100 and decoder 102. As such, the encoder 100 can send fewer than all of the parameters 112 to the decoder 102 to adjust the PQ curve.
- For example, all the parameters 112 except for m2 can be preset, such that the encoder 100 only sends the value of m2 it used at each coding level to the decoder 102.
- Tuning the value of m2 can adjust the PQ curve for different luminance values.
- When m2 is set to be smaller than the 78.84375 value indicated above, such as when m2 is set to 62, the PQ values can be increased throughout some or all of the curve.
- When m2 is set to be larger than the 78.84375 value indicated above, such as when m2 is set to 160, the PQ values can be decreased throughout some or all of the curve.
- Different values of m2 can result in lower relative quantization step sizes Λ(I) at different luminance values.
- Setting m2 to 62 can provide lower relative quantization step sizes Λ(I) at the low range of luminance values, thereby allocating more bits to those values when they are encoded.
- Setting m2 to 160 can provide lower relative quantization step sizes Λ(I) at the high range of luminance values, thereby allocating more bits to those values when they are encoded.
- As such, bits can be allocated differently and/or flexibly for different parts of the input video 104 depending on its content.
- The encoder 100 can thus inform the decoder 102 how to derive its inverse coding transfer function 238 by sending just the m2 parameter 112 at each coding level.
- The encoder 100 can additionally or alternatively adjust any or all of the m1, c1, c2, and c3 parameters 112 during encoding to flexibly adjust the bit allocations based on the content of the input video 104.
- In that case, the encoder 100 can send the adjusted parameters 112 to the decoder 102.
- The encoder 100 can use a predefined mapping function or lookup table to determine the value of m2 or any other parameter 112 based on a distribution of pixel values.
- For example, the encoder 100 can find a value for m2 based on an average intensity of pixel values.
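- As a hedged illustration of such a predefined mapping, the sketch below picks m2 from a small lookup table keyed by average intensity; the breakpoints and m2 choices are placeholders, not values from the disclosure:

```python
def select_m2(average_intensity: float) -> float:
    """Pick m2 from a predefined lookup table keyed by the average pixel
    intensity of the coding level (normalized to [0, 1]).  The breakpoints
    and m2 values are illustrative placeholders only."""
    table = [(0.25, 62.0),       # mostly dark content: favor low luminances
             (0.75, 78.84375),   # mixed content: default curve
             (1.00, 160.0)]      # mostly bright content: favor high luminances
    for upper_bound, m2 in table:
        if average_intensity <= upper_bound:
            return m2
    return table[-1][1]
```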
- The perceptual mapping operation 110 and reverse perceptual mapping operation 114 can change between different areas of the same picture, between pictures, or between sequences of pictures.
- Encoding and decoding sub-portions of pictures and/or full pictures can depend on coding dependencies between the pictures, such as the relationships between I pictures and P or B pictures.
- The encoder 100 can transmit one or more parameters 112 to the decoder 102 related to a perceptual mapping operation 110 at any desired coding level, such as a sub-picture level related to the coding of a sub-portion of a picture, a picture level related to coding a single picture, or a supra-picture level related to coding a sequence of pictures.
- The decoder 102 can use the received parameters 112 to derive an appropriate reverse perceptual mapping operation 114 for each sub-portion of a picture, single picture, or sequence of pictures.
- The encoder 100 can send parameters 112 to the decoder 102 on a supra-picture level.
- In this case, the reverse perceptual mapping operation 114 described by the parameters 112 can be applicable to all the pictures in a given sequence, such as a GOP.
- For example, the encoder 100 can statistically analyze the input values of all the pictures in a GOP, and use a coding transfer function 204 adapted to the range of values actually found within the pictures of that GOP. The encoder 100 can then send parameters 112 to the decoder 102 from which the decoder 102 can derive a corresponding inverse coding transfer function 238.
- The encoder 100 can send the parameters 112 to the decoder 102 on a supra-picture level using a supplemental enhancement information (SEI) message.
- Alternatively, the encoder 100 can send the parameters 112 to the decoder 102 on a supra-picture level using video usability information (VUI) or other information within a Sequence Parameter Set (SPS) associated with the GOP.
- The decoder 102 can use the most recently received parameters 112 until new parameters 112 are received, at which point it can derive a new reverse perceptual mapping operation 114 from the newly received parameters 112.
- For example, parameters 112 can initially be set in an SPS, and then be updated on a per-GOP basis as the characteristics of the input video 104 change, as illustrated in the sketch below.
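- This decoder-side update behavior can be summarized in the following minimal sketch, which assumes a simple one-parameter mapping purely for illustration; the class name, parameter key, and power-curve stand-in are all assumptions:

```python
class ReverseMappingState:
    """Track the most recently received mapping parameters, starting from
    SPS defaults and replacing them whenever a new parameter set (e.g.
    signaled per GOP) arrives."""

    def __init__(self, sps_params: dict):
        self.params = dict(sps_params)   # initial values signaled in the SPS

    def on_new_parameters(self, gop_params: dict) -> None:
        self.params = dict(gop_params)   # newest parameters win until replaced

    def reverse_map(self, v: float) -> float:
        # Minimal stand-in for deriving and applying the reverse
        # perceptual mapping operation 114: a one-parameter power curve.
        return v ** self.params.get("gamma", 1.0)

state = ReverseMappingState(sps_params={"gamma": 2.0})
state.on_new_parameters({"gamma": 2.4})   # e.g. received with a new GOP
linear = state.reverse_map(0.5)
```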
- The encoder 100 can also send parameters 112 to the decoder 102 on a picture level.
- In this case, the reverse perceptual mapping operation 114 described by the parameters 112 can be applicable to full pictures.
- The encoder 100 can send the parameters 112 to the decoder 102 on a picture level within a Picture Parameter Set (PPS) associated with a picture.
- When pictures are coded with temporal prediction, the decoder 102 can receive and maintain parameters 112 for the reference pictures, as well as parameters 112 specific to individual temporally encoded pictures.
- When decoding a current picture that was coded with reference to a reference picture, the decoder 102 can first reverse the previous reverse perceptual mapping operation 114 on the reference picture using the parameters 112 received for the reference picture.
- The decoder 102 can then perform a new reverse perceptual mapping operation 114 on the reference picture using the new set of parameters 112 received for the current picture, to re-map the reference picture according to the current picture's parameters 112.
- The decoder 102 can use the re-mapped reference picture when decoding the current picture.
- In some embodiments, the decoder 102 can re-map reference pictures according to new parameters 112 associated with a current picture if the new parameters 112 differ from old parameters 112 associated with the reference picture.
- In alternate embodiments, the decoder 102 can re-map reference pictures as described above if re-mapping is indicated in a flag or parameter received from the encoder 100, as in the sketch below.
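- A sketch of the re-mapping sequence described above follows, assuming a parameterized forward/reverse mapping pair; the function names and the toy power-curve mappings are assumptions:

```python
def remap_reference(ref_pixels, old_params, new_params, forward_map, reverse_map):
    """Re-map an already reverse-mapped reference picture for use with a
    current picture whose mapping parameters differ.  Step 1: undo the
    reference's reverse mapping by reapplying the forward mapping with the
    old parameters.  Step 2: apply the reverse mapping again with the
    current picture's parameters."""
    return [reverse_map(forward_map(p, old_params), new_params)
            for p in ref_pixels]

# Example with a toy power-curve mapping pair:
def forward(x, prm):            # perceptual mapping (linear -> mapped)
    return x ** prm["gamma"]

def reverse(x, prm):            # reverse perceptual mapping (mapped -> linear)
    return x ** (1.0 / prm["gamma"])

remapped = remap_reference([0.1, 0.5, 0.9], {"gamma": 0.50}, {"gamma": 0.45},
                           forward, reverse)
```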
- The encoder 100 can likewise send parameters 112 to the decoder 102 on a sub-picture level.
- In this case, the reverse perceptual mapping operation 114 described by the parameters 112 can be applicable to sub-pictures within a picture, such as processing windows, slices, macroblocks, or CTUs.
- The decoder 102 can receive and maintain parameters 112 for a current sub-picture and all reference pictures or sub-pictures, such as pixel blocks of size 4×4 or 8×8. As such, when decoding a sub-picture that was coded with reference to one or more reference pictures, the decoder 102 can first reverse previous reverse perceptual mapping operations 114 performed on reference pixels using parameters 112 previously received for the reference pixels. The decoder 102 can then apply a new reverse perceptual mapping operation 114 on the reference pixels using new parameters 112 associated with the current sub-picture, to re-map the reference pixels according to the current sub-picture's parameters 112.
- The decoder 102 can use the re-mapped reference pixels when decoding the current sub-picture. In some embodiments, the decoder 102 can re-map reference pixels according to new parameters 112 associated with a current sub-picture if the new parameters 112 differ from old parameters 112 associated with the reference pixels. In alternate embodiments, the decoder 102 can re-map reference pixels as described above if re-mapping is indicated in a flag or parameter received from the encoder 100.
- A perceptual mapping operation 110 can be configured to reduce perceptible distortion and/or noise introduced by variable length quantization.
- For example, AVC and HEVC use a predictive, variable length quantization scheme.
- The risk of introducing perceptible distortion and/or noise through variable length quantization can be reduced with a perceptual mapping operation 110 or quantization step that follows a Rate-Distortion-Optimization (RDO) scheme based on a perceptual distortion measure.
- In such a scheme, the encoder 100 can calculate a non-perceptual distortion metric, denoted as D_non-perceptual.
- The non-perceptual distortion metric can be the mean squared error (MSE) or the error sum of squares (SSE).
- The encoder 100 can then calculate a perceptual distortion metric, denoted as D_perceptual, from the non-perceptual distortion metric.
- For example, the encoder 100 can use a lookup table that maps D_non-perceptual values to D_perceptual values.
- Alternatively, the encoder 100 can use a perceptual distortion function based on characteristics of the human visual system.
- For example, the perceptual distortion function can be:
- D_perceptual = w_perceptual · D_non-perceptual + b_perceptual
- Here, w_perceptual can be a weighting term and b_perceptual can be an offset term, which can each be calculated for each color component of a pixel or group of pixels based on characteristics of the human visual system.
- The perceptual distortion metric can also be a weighted average of non-perceptual distortion metrics, such as MSE, across different color components, as in the sketch below.
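- The sketch combines both ideas under stated assumptions: a per-component affine mapping followed by a plain average across components. The component keys and the combination rule are assumptions, not part of the disclosure:

```python
def perceptual_distortion(sse_per_component: dict,
                          weights: dict,
                          offsets: dict) -> float:
    """Apply D_perceptual = w * D_non-perceptual + b per color component,
    then combine the per-component results with a plain average."""
    per_component = {
        comp: weights[comp] * sse + offsets[comp]
        for comp, sse in sse_per_component.items()
    }
    return sum(per_component.values()) / len(per_component)

d = perceptual_distortion(
    sse_per_component={"Y": 120.0, "Cb": 40.0, "Cr": 35.0},
    weights={"Y": 1.0, "Cb": 0.5, "Cr": 0.5},
    offsets={"Y": 0.0, "Cb": 0.0, "Cr": 0.0},
)
```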
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
A method is provided for encoding a digital video to improve perceptual quality. The method includes receiving a digital video at a video encoder, providing a perceptual quantizer function defined by PQ(L) = ((c1 + c2·L^m1) / (1 + c3·L^m1))^m2, wherein L is a luminance value, c1, c2, c3, and m1 are parameters with fixed values, and m2 is a parameter with a variable value, adapting the perceptual quantizer function by adjusting the value of the m2 parameter based on different luminance value ranges found within a coding level of the digital video, encoding the digital video into a bitstream using, in part, the perceptual quantizer function, transmitting the bitstream to a decoder, and transmitting the value of the m2 parameter to the decoder for each luminance value range in the coding level.
Description
- This application is a Continuation of U.S. patent application Ser. No. 15/135,497 filed on Apr. 21, 2016, which claims priority to U.S. Provisional Application Ser. No. 62/150,457, filed Apr. 21, 2015, all of which are hereby incorporated by reference.
- The present disclosure relates to the field of video encoding and decoding, particularly a method of adaptively transforming linear input values into non-linear values that can be quantized, based on content characteristics of an input video.
- High Dynamic Range (HDR) video and Wide Color Gamut (WCG) video offer greater ranges of luminance and color values than traditional video. For example, traditional video can have a limited luminance and color range, such that details in shadows or highlights can be lost when images are captured, encoded, and/or displayed. In contrast, HDR and/or WCG video can capture a broader range of luminance and color information, allowing the video to appear more natural and closer to real life to the human eye.
- However, many common video encoding and decoding schemes, such as MPEG-4 Advanced Video Coding (AVC) and High Efficiency Video Coding (HEVC), are not designed to directly handle HDR or WCG video. As such, HDR and WCG video information normally must be converted into other formats before it can be encoded using a video compression algorithm.
- For example, HDR video formats such as the EXR file format describe colors in the RGB color space with 16-bit values to cover a broad range of potential HDR values, while 8 or 10-bit values are often used to express the colors of non-HDR video. Since many video compression algorithms expect 8 or 10-bit values, 16-bit HDR color values can be quantized into 10-bit values that the compression algorithms can work with.
- Some encoders use a coding transfer function to convert linear values from the input video into non-linear values prior to uniform quantization. By way of a non-limiting example, coding transfer functions are often gamma correction functions. However, even when an encoder uses a coding transfer function to convert linear input values into non-linear values, the coding transfer function is generally fixed, such that it does not change dependent on the content of the input video. For example, an encoder's coding transfer function can be defined to statically map every possible input value in an HDR range, such as from 0 to 10,000 nits, to specific non-linear values. However, when the input video contains input values in only a portion of that range, fixed mapping can lead to poor allocation of quantization levels. For example, a picture primarily showing a blue sky can have a lot of similar shades of blue, but those blues can occupy a small section of the overall range for which the coding transfer function is defined. As such, similar blues can be quantized into the same value. This quantization can often be perceived by viewers as contouring or banding, when quantized shades of blue extend in bands across the sky displayed on their screen instead of more natural transitions between the colors.
- Additionally, psychophysical studies of the human visual system have shown that a viewer's sensitivity to contrast levels at a particular location can be more dependent on the average brightness of surrounding locations than on the actual levels at the location itself. However, most coding transfer functions do not take this into account, and instead use fixed conversion functions or tables that ignore characteristics of the actual content, such as its average brightness.
- What is needed is a method of adapting the coding transfer function, or otherwise converting and/or redistributing input values, based on the actual content of the input video. This can generate a curve of non-linear values that represents the color and/or intensity information actually present in the input video, rather than the full range of potential values. As such, when the non-linear values are uniformly quantized, the noise and/or distortion introduced by uniform quantization can be minimized such that it is unlikely to be perceived by a human viewer. Additionally, what is needed is a method of transmitting information about the perceptual mapping operations used by the encoder to decoders, such that the decoders can perform corresponding reverse perceptual mapping operations when decoding the video.
- The present disclosure provides a method of encoding a digital video, the method comprising receiving a digital video at a video encoder, providing a perceptual quantizer function at the video encoder, the perceptual quantizer function defined by
- PQ(L) = ((c1 + c2·L^m1) / (1 + c3·L^m1))^m2
- wherein L is a luminance value, c1, c2, c3, and m1 are parameters with fixed values, and m2 is a parameter with a variable value, adapting the perceptual quantizer function at the video encoder by adjusting the value of the m2 parameter based on different luminance value ranges found within a coding level of the digital video, encoding the digital video into a bitstream with the video encoder using, in part, the perceptual quantizer function, transmitting the bitstream to a decoder, and transmitting the value of the m2 parameter to the decoder for each luminance value range in the coding level.
- The present disclosure also provides a method of decoding a digital video, the method comprising receiving a bitstream at a video decoder, providing a perceptual quantizer function at the video decoder, the perceptual quantizer function defined by
- PQ(L) = ((c1 + c2·L^m1) / (1 + c3·L^m1))^m2
- wherein L is a luminance value, c1, c2, c3, and m1 are parameters with fixed values, and m2 is a parameter with a variable value, receiving a particular value for the m2 parameter for each luminance value range in a coding level, and decoding the digital video with the video decoder using, in part, the perceptual quantizer function with the received value of the m2 parameter.
- The present disclosure also provides a video encoder comprising a data transmission interface configured to receive a digital video comprising linear color values, and a processor configured to analyze the linear color values within a coding level to determine a range of color values present within the coding level, adapt a perceptual mapping operation based on the range of color values present within the coding level, perform the perceptual mapping operation to convert the linear color values into non-linear color values, and uniformly quantize the non-linear color values and encode them into a coded bitstream, wherein the data transmission interface is further configured to transmit the coded bitstream to a decoder, and transmit one or more parameters to the decoder from which the decoder can derive a reverse perceptual mapping operation that substantially reverses the perceptual mapping operation for the coding level.
- The present disclosure also provides a video decoder comprising a data transmission interface configured to receive a coded bitstream and one or more parameters associated with a coding level, and a processor configured to decode the coded bitstream into non-linear values, derive a reverse perceptual mapping operation for the coding level from the one or more parameters, and perform the reverse perceptual mapping operation for the coding level to convert the non-linear values to linear values to reconstruct a digital video.
- Further details of the present invention are explained with the help of the attached drawings in which:
- FIG. 1 depicts an embodiment of a video coding system comprising an encoder and a decoder.
- FIG. 2 depicts a first embodiment of a process for encoding an input video into a coded bitstream with an encoder using a perceptual mapping operation, and decoding that coded bitstream into a decoded video with a decoder using a reverse perceptual mapping operation.
- FIG. 3 depicts a second embodiment of a process for encoding an input video into a coded bitstream with an encoder using a perceptual mapping operation, and decoding that coded bitstream into a decoded video with a decoder using a reverse perceptual mapping operation.
- FIG. 4 depicts a third embodiment of a process for encoding an input video into a coded bitstream with an encoder using a perceptual mapping operation, and decoding that coded bitstream into a decoded video with a decoder using a reverse perceptual mapping operation.
- FIG. 5 depicts an exemplary plot of relative quantization step sizes from variants of Weber's Law on a log-log scale.
- FIG. 6 depicts an exemplary plot of relative quantization step sizes from variants of Stevens' Power Law on a log-log scale.
- FIG. 7 depicts an exemplary plot of a perceptual quantizer curve for different values of an m2 parameter in a perceptual quantizer function.
- FIG. 8 depicts an exemplary plot of relative quantization step sizes for different values of an m2 parameter in a perceptual quantizer function.
- FIG. 1 depicts an embodiment of a video coding system comprising an encoder 100 and a decoder 102. An encoder 100 can comprise processors, memory, circuits, and/or other hardware and software elements configured to encode, transcode, and/or compress input video 104 into a coded bitstream 106. The encoder 100 can be configured to generate the coded bitstream 106 according to a video coding format and/or compression scheme, such as HEVC (High Efficiency Video Coding), H.264/MPEG-4 AVC (Advanced Video Coding), or MPEG-2. By way of a non-limiting example, in some embodiments the encoder 100 can be a Main 10 HEVC encoder.
- The encoder 100 can receive an input video 104 from a source, such as over a network or via local data storage from a broadcaster, content provider, or any other source. The encoder 100 can encode the input video 104 into the coded bitstream 106. The coded bitstream 106 can be transmitted to decoders 102 over the internet, over a digital cable television connection such as Quadrature Amplitude Modulation (QAM), or over any other digital transmission mechanism.
- A decoder 102 can comprise processors, memory, circuits, and/or other hardware and software elements configured to decode, transcode, and/or decompress a coded bitstream 106 into decoded video 108. The decoder 102 can be configured to decode the coded bitstream 106 according to a video coding format and/or compression scheme, such as HEVC, H.264/MPEG-4 AVC, or MPEG-2. By way of a non-limiting example, in some embodiments the decoder 102 can be a Main 10 HEVC decoder. The decoded video 108 can be output to a display device for playback, such as playback on a television, monitor, or other display.
- In some embodiments, the encoder 100 and/or decoder 102 can be dedicated hardware devices. In other embodiments the encoder 100 and/or decoder 102 can be, or use, software programs running on other hardware such as servers, computers, or video processing devices. By way of a non-limiting example, an encoder 100 can be a video encoder operated by a video service provider, while the decoder 102 can be part of a set top box connected to a television, such as a cable box.
- The input video 104 can comprise a sequence of pictures, also referred to as frames. In some embodiments, colors in the pictures can be described digitally using one or more values according to a color space or color model. By way of a non-limiting example, colors in a picture can be indicated using an RGB color model in which the colors are described through a combination of values in a red channel, a green channel, and a blue channel. By way of another non-limiting example, many video coding formats and/or compression schemes use a Y′CbCr color space when encoding and decoding video. In the Y′CbCr color space, Y′ is a luma component while Cb and Cr are chroma components that indicate blue-difference and red-difference components.
- In some embodiments or situations, the input video 104 can be an HDR input video 104. An HDR input video 104 can have one or more sequences with luminance and/or color values described in a high dynamic range (HDR) and/or on a wide color gamut (WCG). By way of a non-limiting example, a video with a high dynamic range can have luminance values indicated on a scale with a wider range of possible values than a non-HDR video, and a video using a wide color gamut can have its colors expressed on a color model with a wider range of possible values in at least some channels than a non-WCG video. As such, an HDR input video 104 can have a broader range of luminance and/or chroma values than standard or non-HDR videos.
- In some embodiments, the HDR input video 104 can have its colors indicated with RGB values in a high bit depth format, relative to non-HDR formats that express color values using lower bit depths such as 8 or 10 bits per color channel. By way of a non-limiting example, an HDR input video 104 can be in an EXR file format with RGB color values expressed in a linear light RGB domain using a 16 bit floating point value for each color channel.
- As shown in FIG. 1, the encoder 100 can perform at least one perceptual mapping operation 110 while encoding the input video 104 into the coded bitstream 106. A perceptual mapping operation 110 can convert linear color values from the input video 104 onto values on a non-linear curve, based on one or more characteristics of the video's content. By way of non-limiting examples, the perceptual mapping operation 110 can be tailored to the content of the video based on its minimum brightness, average brightness, peak brightness, maximum contrast ratio, a cumulative distribution function, and/or any other factor. In some embodiments, such characteristics can be found through a histogram or statistical analysis of color components of the video's content; a short sketch of such an analysis follows the night-scene example below.
- The perceptual mapping operation 110 can be configured to redistribute linear color information on a non-linear curve that is tailored to the content of the input video 104. As will be discussed below, redistributing linear color values on a non-linear curve based on the content of the input video 104 can reduce the risk of distortion and/or noise being introduced through uniform quantization operations that may be perceptible to a human viewer. In some embodiments, a greater amount of bits and/or quantization levels can be allocated to ranges of intensities that are present in each color component and/or that are most likely to be perceived by a human viewer, while fewer bits and/or quantization levels can be allocated to intensities that are not present in the color channels and/or are less likely to be perceived by viewers.
- By way of a non-limiting example, when a scene in the input video 104 takes place at night, its pictures can primarily include dark colors that are substantially bunched together in the RGB domain. In such a scene, lighter colors in the RGB domain can be absent or rare. In this situation the perceptual mapping operation 110 can be adapted for the scene, such that the color values are redistributed on a non-linear curve that includes the range of colors actually present within the scene, while omitting or deemphasizing colors that are not present within the scene. As such, formerly bunched-together dark RGB values can be spread out substantially evenly on a curve of non-linear values, while less common brighter RGB values can be compressed together or even omitted if they are absent in the scene. As the dark values can be spread out on the curve, fine differences between them can be distinguished even when the values on the non-linear curve are uniformly quantized into discrete values or codewords.
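- A sketch of the kind of statistical analysis mentioned above follows, gathering per-scene brightness statistics from which an adaptive curve could be built; the function name, dictionary layout, and epsilon guard are assumptions:

```python
def content_statistics(luminances) -> dict:
    """Gather per-scene statistics that a content-adaptive mapping could be
    built from: minimum, average, and peak brightness, plus the maximum
    contrast ratio.  A small epsilon guards the ratio for scenes that
    contain pure black."""
    lo, hi = min(luminances), max(luminances)
    avg = sum(luminances) / len(luminances)
    return {"min": lo, "avg": avg, "peak": hi,
            "contrast": hi / max(lo, 1e-6)}

stats = content_statistics([0.002, 0.004, 0.01, 0.03])  # a dark night scene
```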
- As described above, the perceptual mapping operation 110 can be adaptive, such that it can change to generate different non-linear values depending on the content of the input video 104. In some embodiments or situations, the perceptual mapping operation 110 can be changed on a sub-picture level for different sub-areas of the same picture, such as processing windows, slices, macroblocks in AVC, or coding tree units (CTUs) in HEVC. In other embodiments or situations, the perceptual mapping operation 110 can be changed on a picture level for different pictures. In still other embodiments or situations, the perceptual mapping operation 110 can be changed on a supra-picture level for different sequences of pictures, such as different Groups of Pictures (GOPs).
- A perceptual mapping operation 110 can be applied in any desired color space, such as the RGB or Y′CbCr color spaces. In some embodiments, if the input video 104 received by the encoder 100 indicates color values in a non-linear color space in a manner that is not dependent on the characteristics of the input video's content, such as a perceptual quantizer (PQ) space or gamma-corrected color space, the encoder 100 can convert those color values into linear space. The encoder 100 can then perform a perceptual mapping operation 110 to convert them back to a non-linear space, this time based on the actual content of the input video 104.
- As will be discussed below, the encoder 100 can transmit one or more parameters 112 associated with the perceptual mapping operation 110 to the decoder 102, such that the decoder 102 can derive a corresponding reverse perceptual mapping operation 114 from the parameters 112. The decoder 102 can use the reverse perceptual mapping operation 114 to appropriately convert the perceptually mapped non-linear color values back into linear values when decoding the coded bitstream 106 into a decoded video 108. In some embodiments the reverse perceptual mapping operation 114 may not necessarily be an exact inverse of the perceptual mapping operation 110, but it can be configured to convert the perceptually mapped non-linear color values into linear values that approximate the original linear values such that noise and/or distortion introduced by uniform quantization of the perceptually mapped values is unlikely to be perceived by a human viewer.
- FIG. 2 depicts a first embodiment of a process for encoding an input video 104 into a coded bitstream 106 with an encoder 100 using a perceptual mapping operation 110, and decoding that coded bitstream 106 into a decoded video 108 with a decoder 102 using a reverse perceptual mapping operation 114. In this embodiment, the perceptual mapping operation 110 is a coding transfer function 204 performed as one of the initial steps of the encoding process, while the reverse perceptual mapping operation 114 is an inverse coding transfer function 238 performed as one of the final steps of the decoding process.
- As shown in FIG. 2, the encoder 100 can receive an input video 104. In some embodiments the input video 104 can be an HDR and/or WCG video that has color information indicated with RGB values 202 in a high bit depth format, such as color values expressed with a 16-bit floating point value for each RGB color channel.
- After receiving the input video 104, the encoder 100 can perform a coding transfer function 204 on the RGB values 202 in each color channel to convert the HDR input video's RGB values 202 into non-linear R′G′B′ values 206. By way of a non-limiting example, the RGB values 202 can be expressed on a linear scale, while corresponding non-linear R′G′B′ values 206 can be expressed on a non-linear curve. As described above, the coding transfer function 204 can be a perceptual mapping operation 110 that is adaptive based on the content of the input video 104 on a sub-picture level, picture level, or supra-picture level.
- In some embodiments, the non-linear R′G′B′ values 206 generated with the coding transfer function 204 can be expressed with the same number of bits as the RGB values 202, such that the non-linear R′G′B′ values 206 have the same bit depth. By way of a non-limiting example, when the RGB values 202 were expressed with 16-bit values, the non-linear R′G′B′ values 206 can also be expressed with 16-bit values.
- After using the coding transfer function 204 to generate non-linear R′G′B′ values 206, the encoder 100 can perform color space conversion 208 to translate the non-linear R′G′B′ values 206 into Y′CbCr values 210. By way of a non-limiting example, the Y′ luma component can be calculated from a weighted average of the non-linear R′G′B′ values 206. In some embodiments, the Y′CbCr values 210 can be expressed with the same number of bits as the RGB values 202 and/or the non-linear R′G′B′ values 206, such that the Y′CbCr values 210 have the same bit depth. By way of a non-limiting example, when the original RGB values 202 were expressed with 16-bit values, the Y′CbCr values 210 can also be expressed with 16-bit values.
- As described above, in some embodiments or situations the encoder 100 can receive RGB values 202 in a high bit depth format, perform an adaptive coding transfer function 204 based on the content of the input video 104, and then convert the resulting non-linear values into the Y′CbCr color space. In alternate embodiments or situations, the input video 104 can be received in a format wherein its values are already in the Y′CbCr format, and the adaptive coding transfer function 204 can operate to convert or redistribute those values along a curve based on the content of the input video 104, while the color space conversion 208 can be skipped if the resulting values are already in the desired format. In still other embodiments or situations, the color space conversion 208 and coding transfer function 204 can be reversed, such that the encoder 100 converts received color values to another format before applying an adaptive coding transfer function 204 based on the content of the input video 104.
- The encoder 100 can perform a uniform quantization operation 212 on the Y′CbCr values 210 to generate quantized Y′CbCr values 214. The uniform quantization operation 212 can fit each of the Y′CbCr values 210 into one of a finite number of possible quantized Y′CbCr values 214. In some embodiments, each possible quantized Y′CbCr value 214 can be expressed with a codeword or other value in fewer bits than the bit depth of the Y′CbCr values 210. By way of a non-limiting example, when the input video 104 used 16-bit values for each of the RGB values 202, and that bit depth was carried through to the Y′CbCr values 210, the 16-bit Y′CbCr values 210 can be quantized into lower bit depth quantized Y′CbCr values 214, such as 8 or 10-bit quantized Y′CbCr values 214.
- The step size between each possible quantized Y′CbCr value 214 can be uniform. The step size selected by the encoder 100 can influence the amount of distortion and/or noise introduced by the uniform quantization operation 212. By way of a non-limiting example, when the step size is selected such that two similar but different high bit depth Y′CbCr values 210 fall within the same range defined by the step size and both are converted into the same quantized Y′CbCr value 214, distortion and/or noise can be introduced as differences in the color information between the two high bit depth Y′CbCr values 210 are lost due to the uniform quantization operation 212. However, as discussed above, the previous coding transfer function 204 can have been configured to redistribute color information along a non-linear curve such that uniform quantization of that non-linear curve leads to distortion and/or noise that is unlikely to be perceived by a human viewer.
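- A minimal sketch of such a uniform quantizer and its reverse, assuming values normalized to [0, 1] and a 10-bit codeword budget; the function names are assumptions:

```python
def uniform_quantize(v: float, bit_depth: int = 10) -> int:
    """Uniformly quantize a normalized non-linear value v in [0, 1] into a
    codeword at the target bit depth (step size of 1 / (2^n - 1))."""
    levels = (1 << bit_depth) - 1
    return min(levels, max(0, round(v * levels)))

def reverse_uniform_quantize(code: int, bit_depth: int = 10) -> float:
    """Map a codeword back to a normalized value at higher precision."""
    return code / float((1 << bit_depth) - 1)

code = uniform_quantize(0.7312)          # high-precision input -> 10-bit codeword
approx = reverse_uniform_quantize(code)  # approximates the original value
```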
- The encoder 100 can perform a chroma subsampling operation 216 to convert the quantized Y′CbCr values 214 into chroma subsampled Y′CbCr values 218. As the human eye is less sensitive to chroma information than luma information, the chroma subsampling operation 216 can subsample the Cb and Cr chroma components at a lower resolution to decrease the amount of bits dedicated to the chroma components, without impacting a viewer's perception of the Y′ luma component. In some embodiments, the chroma subsampling operation 216 can implement 4:2:0 subsampling. By way of a non-limiting example, the quantized Y′CbCr values 214 can be expressed with a full 4:4:4 resolution, and the encoder 100 can subsample the Cb and Cr chroma components of the quantized Y′CbCr values 214 to express them as chroma subsampled Y′CbCr values 218 with 4:2:0 subsampling at half their horizontal and vertical resolution. In alternate embodiments, the chroma subsampling operation 216 can implement 4:2:2 subsampling, 4:1:1 subsampling, or any other subsampling ratio.
- The encoder 100 can perform an encoding operation 220 on the chroma subsampled Y′CbCr values 218 to generate the coded bitstream 106. In some embodiments, the pixels of each picture can be broken into sub-pictures, such as processing windows, slices, macroblocks, or CTUs. The encoder 100 can encode each individual picture and/or sub-picture using intra-prediction and/or inter-prediction. Coding with intra-prediction uses spatial prediction based on other similar sections of the same picture or sub-picture, while coding with inter-prediction uses temporal prediction to encode motion vectors that point to similar sections of another picture or sub-picture, such as a preceding or subsequent picture in the input video 104. As such, coding of some pictures or sub-pictures can be at least partially dependent on other reference pictures in the same group of pictures (GOP).
- The coded bitstream 106 generated by the encoder 100 can be transmitted to one or more decoders 102. The encoder 100 can also transmit one or more parameters 112 associated with the coding transfer function 204 on a sub-picture level, per picture level, and/or supra-picture level, such that the decoder 102 can derive a corresponding inverse coding transfer function 238 from the parameters 112 for each sub-picture, picture, or sequence of pictures.
- Each decoder 102 can receive the coded bitstream 106 and perform a decoding operation 222 to generate reconstructed chroma subsampled Y′CbCr values 224 that approximate the chroma subsampled Y′CbCr values 218 output by the encoder's chroma subsampling operation 216. By way of a non-limiting example, the coded bitstream 106 can be decoded into reconstructed chroma subsampled Y′CbCr values 224 expressed with 4:2:0 subsampling. As with encoding, the decoder 102 can decode individual pictures and/or sub-pictures with intra-prediction and/or inter-prediction.
- The decoder 102 can perform a chroma upsampling operation 226 to express the chroma components of the reconstructed chroma subsampled Y′CbCr values 224 with more bits, as reconstructed quantized Y′CbCr values 228. By way of a non-limiting example, when the reconstructed chroma subsampled Y′CbCr values 224 are 10-bit values expressed with 4:2:0 subsampling at half the original resolution, the chroma upsampling operation 226 can copy, sample, and/or average the subsampled chroma information to generate the reconstructed quantized Y′CbCr values 228 at a full 4:4:4 resolution, such that they approximate the quantized Y′CbCr values 214 output by the encoder's uniform quantization operation 212.
- The decoder 102 can perform a reverse quantization operation 230 on the reconstructed quantized Y′CbCr values 228 to generate reconstructed Y′CbCr values 232. The reverse quantization operation 230 can convert the low bit depth reconstructed quantized Y′CbCr values 228 to higher bit depth reconstructed Y′CbCr values 232. By way of a non-limiting example, when the reconstructed quantized Y′CbCr values 228 are expressed with 10 bits, the reverse quantization operation 230 can convert the values to be expressed in a 16 bit format used for reconstructed Y′CbCr values 232. The reconstructed Y′CbCr values 232 can approximate the Y′CbCr values 210 output by the encoder's color space conversion 208.
- The decoder 102 can perform reverse color space conversion 234 to translate the reconstructed Y′CbCr values 232 into reconstructed non-linear R′G′B′ values 236. The reconstructed non-linear R′G′B′ values 236 can have the same bit depth as the reconstructed Y′CbCr values 232, and can approximate the non-linear R′G′B′ values 206 output by the encoder's coding transfer function 204.
- The decoder 102 can perform an inverse coding transfer function 238 on the reconstructed non-linear R′G′B′ values 236 in each color channel to convert the reconstructed non-linear R′G′B′ values 236 into reconstructed RGB values 240. As will be discussed further below, the decoder 102 can have received parameters 112 from which it can derive an inverse coding transfer function 238 that effectively reverses the conversion between RGB values and R′G′B′ values performed by the encoder's coding transfer function 204. The parameters 112 can be sent by the encoder 100 on a sub-picture level, a picture level, and/or a supra-picture level to indicate changes in the coding transfer function 204 and inverse coding transfer function 238 between different sub-pictures in the same picture, between pictures, or between sequences of pictures, as the coding transfer function 204 can change depending on the content of the input video 104.
- After deriving an inverse coding transfer function 238 based on the received parameters 112, the decoder 102 can use the inverse coding transfer function 238 to convert the reconstructed non-linear R′G′B′ values 236 into linear reconstructed RGB values 240 that approximate the original RGB values 202 of the input video 104. The reconstructed RGB values 240 can be used to display pictures to a viewer on a display screen.
- FIG. 3 depicts a second embodiment of a process for encoding an input video 104 into a coded bitstream 106 with an encoder 100 using a perceptual mapping operation 110, and decoding that coded bitstream 106 into a decoded video 108 with a decoder 102 using a reverse perceptual mapping operation 114. In this embodiment, a perceptual mapping operation 110 is performed at the encoder 100 after a color space conversion 208 from R′G′B′ values 206 into Y′CbCr values 210, prior to a uniform quantization operation 212. Similarly, in this embodiment a reverse perceptual mapping operation 114 is performed at the decoder 102 after a reverse uniform quantization operation 230, before a reverse color space conversion 234 into reconstructed R′G′B′ values 236.
- As shown in FIG. 3, the encoder 100 can receive an input video 104, such as an HDR and/or WCG video that has color information indicated with RGB values 202 in a high bit depth format. After receiving the input video 104, the encoder 100 can perform a non-adaptive coding transfer function 302 to convert the RGB values 202 in each color channel into R′G′B′ values 206. In some embodiments, the non-adaptive coding transfer function 302 can be a fixed function that operates the same way for all input values, independent of the content of the input video 104. In other embodiments, the non-adaptive coding transfer function 302 can be a pass-through function, such that the R′G′B′ values 206 are substantially identical to the RGB values 202.
- As in the embodiment of FIG. 2, the encoder 100 can perform color space conversion 208 to translate the R′G′B′ values 206 into Y′CbCr values 210. In some embodiments, the Y′CbCr values 210 can be expressed with the same number of bits as the RGB values 202 and/or the R′G′B′ values 206, such that the Y′CbCr values 210 have the same bit depth. In alternate embodiments or situations, the encoder 100 can receive an input video with values already described in a Y′CbCr color space, such that color space conversion 208 can be skipped. In still other embodiments, the order of the coding transfer function 302 and color space conversion 208 can be reversed.
- In this embodiment, the encoder 100 can perform a perceptual mapping operation 110 on the Y′CbCr values 210 to generate perceptually mapped Y′CbCr values 304. As described above, a perceptual mapping operation 110 can be adaptive based on the content of the input video 104 on a sub-picture level, picture level, or supra-picture level. In some embodiments the perceptual mapping operation 110 can use a 3D lookup table that maps Y′CbCr values 210 to associated perceptually mapped Y′CbCr values 304. In other embodiments, the perceptual mapping operation 110 can use one or more formulas to convert each color component. By way of a non-limiting example, the perceptual mapping operation 110 can convert values using formulas such as:
- Y′_PM=f(Y′,Cb,Cr)
- Cb_PM=g(Y′,Cb,Cr)
- Cr_PM=h(Y′,Cb,Cr)
- In this example, the functions can each take the three Y′CbCr values 210 as inputs and output a perceptually mapped Y′CbCr value 304: Y′_PM, Cb_PM, or Cr_PM. The 3D lookup table or conversion functions can be adaptive based on the content of the input video 104.
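- A sketch of such cross-component conversion functions follows, with deliberately simple stand-ins for f, g, and h; the specific formulas are illustrative assumptions, not the disclosure's functions:

```python
def perceptual_map(y: float, cb: float, cr: float, gamma: float = 0.5):
    """Illustrative stand-ins for f, g, and h: each output may depend on
    all three inputs.  Y' is redistributed with a power curve, and the
    chroma components are scaled by a luma-dependent gain, showing the
    cross-component dependence the formulas above allow."""
    y_pm = y ** gamma                 # spread out dark luma values
    chroma_gain = 0.5 + 0.5 * y       # example dependence of chroma on luma
    return y_pm, cb * chroma_gain, cr * chroma_gain

y_pm, cb_pm, cr_pm = perceptual_map(0.2, 0.1, -0.05)
```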
- After generating perceptually mapped Y′CbCr values 304 with the perceptual mapping operation 110, the encoder 100 can perform a uniform quantization operation 212 on the perceptually mapped Y′CbCr values 304 to generate quantized Y′CbCr values 214. In some embodiments, each possible quantized Y′CbCr value 214 can be expressed with a codeword or other value in fewer bits than the bit depth of the perceptually mapped Y′CbCr values 304.
- The step size between each possible quantized Y′CbCr value 214 can be uniform. However, as discussed above, the perceptual mapping operation 110 can have been configured to redistribute color information along a non-linear curve such that uniform quantization of that non-linear curve leads to distortion and/or noise that is unlikely to be perceived by a human viewer.
- As with the embodiment of FIG. 2, the encoder 100 can perform a chroma subsampling operation 216 to convert the quantized Y′CbCr values 214 into chroma subsampled Y′CbCr values 218, and then perform an encoding operation 220 on the chroma subsampled Y′CbCr values 218 to generate the coded bitstream 106.
- The coded bitstream 106 generated by the encoder 100 can be transmitted to one or more decoders 102, as well as one or more parameters 112 associated with the perceptual mapping operation 110 on a sub-picture level, per picture level, and/or supra-picture level, such that the decoder 102 can derive a corresponding reverse perceptual mapping operation 114 from the parameters 112 for each sub-picture, picture, or sequence of pictures.
- As with the embodiment of FIG. 2, each decoder 102 can receive the coded bitstream 106 and perform a decoding operation 222 to generate reconstructed chroma subsampled Y′CbCr values 224 that approximate the chroma subsampled Y′CbCr values 218 output by the encoder's chroma subsampling operation 216. The decoder 102 can perform a chroma upsampling operation 226 to express the chroma components of the reconstructed chroma subsampled Y′CbCr values 224 with more bits, as reconstructed quantized Y′CbCr values 228 that approximate the quantized Y′CbCr values 214 output by the encoder's uniform quantization operation 212.
- The decoder 102 can perform a reverse quantization operation 230 on the reconstructed quantized Y′CbCr values 228 to generate reconstructed perceptually mapped Y′CbCr values 306. The reverse quantization operation 230 can convert the low bit depth reconstructed quantized Y′CbCr values 228 to higher bit depth reconstructed perceptually mapped Y′CbCr values 306. By way of a non-limiting example, when the reconstructed quantized Y′CbCr values 228 are expressed with 10 bits, the reverse quantization operation 230 can convert the values to be expressed in a 16 bit format used for reconstructed perceptually mapped Y′CbCr values 306. The reconstructed perceptually mapped Y′CbCr values 306 can approximate the perceptually mapped Y′CbCr values 304 output by the encoder's perceptual mapping operation 110.
- In this embodiment, the decoder 102 can perform a reverse perceptual mapping operation 114 on the reconstructed perceptually mapped Y′CbCr values 306 to generate reconstructed Y′CbCr values 232. As will be discussed further below, the decoder 102 can have received parameters 112 from which it can derive a reverse perceptual mapping operation 114 that effectively reverses the conversion between Y′CbCr values 210 and perceptually mapped Y′CbCr values 304 performed by the encoder's perceptual mapping operation 110. The parameters 112 can be sent by the encoder 100 on a sub-picture level, a picture level, and/or a supra-picture level to indicate changes in the perceptual mapping operation 110 and reverse perceptual mapping operation 114 between different sub-pictures in the same picture, between pictures, or between sequences of pictures, as the perceptual mapping operation 110 can change depending on the content of the input video 104.
- The decoder 102 can perform reverse color space conversion 234 to translate the reconstructed Y′CbCr values 232 into reconstructed non-linear R′G′B′ values 236. The reconstructed non-linear R′G′B′ values 236 can have the same bit depth as the reconstructed Y′CbCr values 232, and can approximate the R′G′B′ values 206 output by the encoder's coding transfer function 302.
- The decoder 102 can perform an inverse non-adaptive coding transfer function 308 on the reconstructed R′G′B′ values 236 in each color channel to convert the reconstructed R′G′B′ values 236 into reconstructed RGB values 240. In some embodiments, the inverse non-adaptive coding transfer function 308 can be a fixed function that operates the same way for all input values. In other embodiments, the inverse non-adaptive coding transfer function 308 can be a pass-through function, such that the reconstructed RGB values 240 are substantially identical to the reconstructed R′G′B′ values 236. The reconstructed RGB values 240 can be used to display pictures to a viewer on a display screen.
- FIG. 4 depicts a third embodiment of a process for encoding an input video 104 into a coded bitstream 106 with an encoder 100 using a perceptual mapping operation 110, and decoding that coded bitstream 106 into a decoded video 108 with a decoder 102 using a reverse perceptual mapping operation 114.
- The embodiment of FIG. 4 can be substantially similar to the embodiment of FIG. 3, with the standalone perceptual mapping operation 110 and uniform quantization operation 212 replaced with a joint perceptual quantization operation 402 at the encoder 100. Similarly, at the decoder 102 the standalone reverse uniform quantization operation 230 and reverse perceptual mapping operation 114 can be replaced with a reverse joint perceptual quantization operation 404 in the embodiment of FIG. 4.
- In this embodiment, the Y′CbCr values 210 output by the color space conversion 208 can be converted into quantized Y′CbCr values 214 using a joint perceptual quantization operation 402 that can be adaptive based on the content of the input video 104 on a sub-picture level, picture level, or supra-picture level. While the uniform quantization operation 212 in the embodiments of FIGS. 2 and 3 can be performed independently on each color component, the joint perceptual quantization operation 402 can take all three color components into consideration.
- In some embodiments the joint perceptual quantization operation 402 can use a 3D lookup table that maps Y′CbCr values 210 to associated quantized Y′CbCr values 214. In other embodiments, the joint perceptual quantization operation 402 can use one or more formulas to quantize each color component. By way of a non-limiting example, the joint perceptual quantization operation 402 can quantize values using formulas such as:
- D_Y′=Q1(Y′,Cb,Cr)
- D_Cb=Q2(Y′,Cb,Cr)
- D_Cr=Q3(Y′,Cb,Cr)
- In this example, the functions can each take the three Y′CbCr values 210 as inputs and output a quantized Y′CbCr value 214: D_Y′, D_Cb, or D_Cr. The 3D lookup table or quantization functions can be adaptive based on the content of the input video 104. While in some embodiments the step size can be uniform between each possible quantized Y′CbCr value 214, the joint perceptual quantization operation 402 can be configured to redistribute and quantize color information such that any distortion and/or noise it introduces is unlikely to be perceived by a human viewer. In some embodiments, each possible quantized Y′CbCr value 214 that can be generated with the joint perceptual quantization operation 402 can be expressed with a codeword or other value in fewer bits than the bit depth of the Y′CbCr values 210.
- As in the embodiments of FIGS. 2 and 3, the encoder 100 can perform a chroma subsampling operation 216 to convert the quantized Y′CbCr values 214 into chroma subsampled Y′CbCr values 218, and then perform an encoding operation 220 on the chroma subsampled Y′CbCr values 218 to generate the coded bitstream 106.
- The coded bitstream 106 generated by the encoder 100 can be transmitted to one or more decoders 102, as well as one or more parameters 112 associated with the joint perceptual quantization operation 402 on a sub-picture level, per picture level, and/or supra-picture level, such that the decoder 102 can derive a corresponding reverse joint perceptual quantization operation 404 from the parameters 112 for each sub-picture, picture, or sequence of pictures.
- As with the embodiments of FIGS. 2 and 3, each decoder 102 can receive the coded bitstream 106 and perform a decoding operation 222 to generate reconstructed chroma subsampled Y′CbCr values 224 that approximate the chroma subsampled Y′CbCr values 218 output by the encoder's chroma subsampling operation 216. The decoder 102 can perform a chroma upsampling operation 226 to express the chroma components of the reconstructed chroma subsampled Y′CbCr values 224 with more bits, as reconstructed quantized Y′CbCr values 228 that approximate the quantized Y′CbCr values 214 output by the encoder's joint perceptual quantization operation 402.
- The decoder 102 can perform a reverse joint perceptual quantization operation 404 on the reconstructed quantized Y′CbCr values 228 to generate reconstructed Y′CbCr values 232. As will be discussed further below, the decoder 102 can have received parameters 112 from which it can derive a reverse joint perceptual quantization operation 404 that effectively reverses the conversion between Y′CbCr values 210 and quantized Y′CbCr values 214 performed by the encoder's joint perceptual quantization operation 402. The parameters 112 can be sent by the encoder 100 on a sub-picture level, a picture level, and/or a supra-picture level to indicate changes in the joint perceptual quantization operation 402 and reverse joint perceptual quantization operation 404 between different sub-pictures in the same picture, between pictures, or between sequences of pictures, as the perceptual mapping operation 110 can change depending on the content of the input video 104.
- The reverse joint perceptual quantization operation 404 can also convert the low bit depth reconstructed quantized Y′CbCr values 228 to higher bit depth reconstructed Y′CbCr values 232. By way of a non-limiting example, when the reconstructed quantized Y′CbCr values 228 are expressed with 10 bits, the reverse joint perceptual quantization operation 404 can convert the values to be expressed in a 16 bit format used for reconstructed Y′CbCr values 232. The reconstructed Y′CbCr values 232 can approximate the Y′CbCr values 210 output by the encoder's color space conversion 208.
- The decoder 102 can perform reverse color space conversion 234 to translate the reconstructed Y′CbCr values 232 into reconstructed R′G′B′ values 236, followed by an inverse non-adaptive coding transfer function 308 to convert the reconstructed R′G′B′ values 236 into reconstructed RGB values 240. The reconstructed RGB values 240 can be used to display pictures to a viewer on a display screen.
- As shown above, in various embodiments a perceptual mapping operation 110 can be performed in an initial coding transfer function 204, as a standalone step at a later portion of the encoding process after color space conversion, or as part of a joint perceptual quantization operation 402. In alternate embodiments a perceptual mapping operation 110 can be performed after a uniform quantization step, after a chroma subsampling step, or at any other step of the encoding process. The decoder 102 can perform its decoding process with corresponding steps in substantially the reverse order from the encoding process.
- In some embodiments the encoder 100 can send the decoder 102 information about a 3D lookup table it used with the perceptual mapping operation 110, or send complete information about conversion functions it used within the perceptual mapping operation 110, for each sub-picture level, picture level, or supra-picture level. As such, the decoder 102 can determine an associated reverse perceptual mapping operation 114 to use during the decoding process.
- However, in other embodiments the encoder 100 can save bandwidth by transmitting parameters 112 associated with the perceptual mapping operation 110 it used at each sub-picture level, picture level, or supra-picture level. The decoder 102 can use the received parameters 112 to generate and use a corresponding reverse perceptual mapping operation 114 for each sub-picture level, picture level, or supra-picture level.
- Various non-limiting examples of possible coding transfer functions 204, and the parameters 112 associated with them that can be sent to the decoder 102 to derive inverse coding transfer functions 238, will be provided below. In these examples, the encoder's coding transfer function 204 can be denoted as ψ(I)=v, such that it can use a brightness or intensity value I in a color component as an input and output a converted value denoted as v. Similarly, the decoder's inverse coding transfer function 238 can be denoted as ψ−1(v)=I, such that it can take a value v and convert it back to a value I. The encoder's uniform quantization operation 212 can be denoted as Q(v), as it can operate on converted v values generated by the coding transfer function 204. The step size between quantization levels used in the uniform quantization operation 212 can be denoted as Δstep.
- The effective quantization step size, Q(I), of a cascaded adaptive coding transfer function 204 and a uniform quantization operation 212 can be proportional to the slope of the inverse coding transfer function 238, as shown below:
- Q(I) = (dψ−1(v)/dv)·Δstep, evaluated at v = ψ(I)
- The effective quantization step size, Q(I), can thus depend on the slope of the inverse coding transfer function 238 and the step size Δstep of the uniform quantization operation 212. For example, when the slope of the inverse coding transfer function 238 decreases, the effective quantization step size Q(I) can decrease. When the step size Δstep of the uniform quantization operation 212 is large enough that distortion and/or noise introduced by uniform quantization would otherwise be perceptible to human viewers, the effects of the relatively large step size Δstep can be modulated by adapting the coding transfer function 204 to the content of the input video 104, such that the slope of the inverse coding transfer function 238 is smaller. As such, decreasing the slope of the inverse coding transfer function 238 can counteract the effects of a relatively large step size Δstep, and thus modulate the effective quantization step size Q(I) such that the overall distortion and/or noise is less likely to be perceived by a human viewer.
- The effective quantization step size Q(I) can be included in a related metric, the relative quantization step size, Λ(I), wherein:
- Λ(I) = Q(I)/I = (dψ−1(v)/dv)·Δstep/I
- The coding transfer function 204, and thus the corresponding inverse coding transfer function 238, can be adapted based on the content of the input video 104 such that the relative quantization step size Λ(I) stays below a set threshold level. For example, the threshold level can be defined by a function Λ0(I) that gives an optimal slope for the inverse coding transfer function 238 that results in encoding with distortion and noise that is perceptually transparent or perceptually lossless. As such, the coding transfer function 204, and thus the corresponding inverse coding transfer function 238, can be adapted such that Λ(I)≤Λ0(I).
- Similarly, if a perceptually minor or “just noticeable” contrast condition is considered acceptable and is defined by Λ0(I), the following differential equation can apply:
- (dψ−1(v)/dv)·Δstep = Λ0(I)·I, where I = ψ−1(v)
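- A hedged numeric sketch of this step: the decoder-side inverse transfer function can be recovered from a target Λ0(I) by integrating dψ−1(v)/dv = Λ0(I)·I/Δstep with a simple Euler scheme. The constant Λ0, the grid, and the starting intensity below are placeholder choices, not values from this document.

```python
import numpy as np

def solve_inverse_tf(lambda0, step: float, v_grid: np.ndarray, I0: float) -> np.ndarray:
    """Euler-integrate d(psi_inv)/dv = lambda0(I) * I / step over v_grid."""
    I = np.empty_like(v_grid)
    I[0] = I0
    dv = v_grid[1] - v_grid[0]
    for k in range(1, len(v_grid)):
        I[k] = I[k - 1] + (lambda0(I[k - 1]) * I[k - 1] / step) * dv
    return I

# A constant target relative step size yields an exponential psi_inv,
# i.e. a logarithmic (Weber-like) forward coding transfer function.
v = np.linspace(0.0, 1.0, 1001)
I = solve_inverse_tf(lambda I: 0.01, step=1.0 / 1023.0, v_grid=v, I0=1e-4)
```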
- As such, solving the above differential equation for ψ−1(v) can provide the decoder's inverse coding transfer function 238 for the desired Λ0(I). Similarly, the relative quantization step size Λ(I) can be calculated for any given inverse coding transfer function 238.
- As a first non-limiting example, the coding transfer function 204 and inverse coding transfer function 238 can be based on the first variant of Weber's Law, such that:
- vN = ψ(IN) = (1 + ln(C·IN))/(1 + ln C) for 1/C ≤ IN ≤ 1, and vN = C·IN/(1 + ln C) for 0 ≤ IN < 1/C
- In this and other examples below, IN can be a normalized brightness of a portion of the input video 104, on a sub-picture level, picture level, or supra-picture level. The normalized brightness can be a brightness level divided by the maximum brightness, such that:
- IN = I/Imax
- In this and other examples below, C can be the maximum contrast in the portion of the input video 104 on a sub-picture level, picture level, or supra-picture level. The maximum contrast can be the maximum brightness divided by the minimum brightness, such that:
- C = Imax/Imin
- In these and other examples below, vN can be a value generated by the coding transfer function 204, normalized by the dynamic range of the uniform quantization operation 212, denoted as D, such that:
- vN = v/D = ψ(I)/D
- From the above definitions, the relative quantization step size for the first variant of Weber's Law can therefore be given by:
- Λ(IN) = (Δstep/D)·(1 + ln C) for 1/C ≤ IN ≤ 1, and Λ(IN) = (Δstep/D)·(1 + ln C)/(C·IN) for 0 ≤ IN < 1/C
- As a second non-limiting example, the coding transfer function 204 and inverse coding transfer function 238 can be based on the second variant of Weber's Law, such that:
- vN = ψ(IN) = ln(1 + (C − 1)·IN)/ln C
- From this, the relative quantization step size for the second variant of Weber's Law can therefore be given by:
- Λ(IN) = (Δstep/D)·ln C·(1 + (C − 1)·IN)/((C − 1)·IN)
- The relative quantization step sizes of the two examples above based on variants of Weber's Law can be plotted on a log-log scale, as shown in
FIG. 5 . The slope of the relative quantization step size based on the first variant of Weber's Law can be linear on the log-log scale with a negative slope for small values of IN, and then be flat (linear on the log-log scale with a slope of 0) for values of IN that are larger than a particular point. Similarly, the slope of the relative quantization step size based on the second variant of Weber's Law can be negative for small values of IN, and then transition smoothly to approaching a flat slope for larger values of IN. The two variants can thus be similar, with the second variant having a smoother transition between the IN ranges that have different slopes.
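- The two curves plotted in FIG. 5 can be reproduced with the expressions reconstructed above. Because those piecewise and smooth forms are inferred from the described log-log behavior rather than quoted from the original figures, the exact expressions in this sketch should be treated as assumptions; Δstep is folded into D by taking it as 1.

```python
import numpy as np

C, D = 1000.0, 1023.0  # example maximum contrast and quantizer dynamic range

def lam_weber1(I_N: np.ndarray) -> np.ndarray:
    """Relative step size for the first (piecewise) Weber variant."""
    flat = (1.0 + np.log(C)) / D          # flat region for I_N >= 1/C
    return np.where(I_N >= 1.0 / C, flat, flat / (C * I_N))

def lam_weber2(I_N: np.ndarray) -> np.ndarray:
    """Relative step size for the second (smooth) Weber variant."""
    return np.log(C) * (1.0 + (C - 1.0) * I_N) / (D * (C - 1.0) * I_N)

I_N = np.logspace(-4, 0, 5)
print(lam_weber1(I_N))  # slope -1 on a log-log plot below 1/C, flat above
print(lam_weber2(I_N))  # same limits, with a smooth transition
```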
- As a third non-limiting example, the coding transfer function 204 and inverse coding transfer function 238 can be based on the first variant of Stevens' Power Law, such that:
- vN = ψ(IN) = ((C·IN)^γ + γ − 1)/(C^γ + γ − 1) for 1/C ≤ IN ≤ 1, and vN = γ·C·IN/(C^γ + γ − 1) for 0 ≤ IN < 1/C
- From this, the relative quantization step size for the first variant of Stevens' Power Law can therefore be given by:
- Λ(IN) = (Δstep/D)·(C^γ + γ − 1)/(γ·C^γ·IN^γ) for 1/C ≤ IN ≤ 1, and Λ(IN) = (Δstep/D)·(C^γ + γ − 1)/(γ·C·IN) for 0 ≤ IN < 1/C
- As a fourth non-limiting example, the coding transfer function 204 and inverse coding transfer function 238 can be based on the third variant of Stevens' Power Law, such that:
- vN = ψ(IN) = ((1 + (C − 1)·IN)^γ − 1)/(C^γ − 1)
- From this, the relative quantization step size for the third variant of Stevens' Power Law can therefore be given by:
- Λ(IN) = (Δstep/D)·(C^γ − 1)·(1 + (C − 1)·IN)^(1−γ)/(γ·(C − 1)·IN)
- The relative quantization step sizes of the two examples above based on variants of Stevens' Power Law can be plotted on a log-log scale, as shown in
FIG. 6 . In both of these examples, the slope of the relative quantization step size can have or approach a slope of −1 for small values of IN, and have or approach a slope of −γ for large values of IN, with the two examples varying on the smoothness of the transition between the IN ranges that have different slopes. Additionally, as γ goes to 0, the first variant of Stevens' Power Law can converge with the first variant of Weber's Law, while the third variant of Stevens' Power Law can converge with the second variant of Weber's Law.
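- The stated convergence can be checked numerically with the power-law form reconstructed above (again an inference from the described behavior, so the exact expression is an assumption): as γ shrinks toward 0, the first Stevens variant approaches the logarithmic segment of the first Weber variant.

```python
import numpy as np

C = 1000.0

def weber1(I_N: np.ndarray) -> np.ndarray:
    """First Weber variant, logarithmic segment (I_N >= 1/C)."""
    return (1.0 + np.log(C * I_N)) / (1.0 + np.log(C))

def stevens1(I_N: np.ndarray, gamma: float) -> np.ndarray:
    """First Stevens variant, power segment (I_N >= 1/C)."""
    return ((C * I_N) ** gamma + gamma - 1.0) / (C ** gamma + gamma - 1.0)

I_N = np.linspace(1.0 / C, 1.0, 5)
for gamma in (0.5, 0.1, 0.01):
    print(gamma, np.max(np.abs(stevens1(I_N, gamma) - weber1(I_N))))
# the maximum difference shrinks toward 0 as gamma -> 0
```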
- As shown in the examples above, the slope of the curve of relative quantization step sizes Λ(I) can differ for different brightness values. As such, when the coding transfer function 204 is adaptive and can be changed based on perceptual and/or statistical properties of the input video 104 on a sub-picture level, a picture level, or a supra-picture level, the overall shape of the Λ(I) curve can change. - By sending
parameters 112 from the encoder 100 to the decoder 102 that describe the Λ(I) function, the decoder 102 can derive the inverse coding transfer function 238 from the Λ(I) function, by solving for ψ−1(v) in the following differential equation:
- (dψ−1(v)/dv)·Δstep = Λ(I)·I, where I = ψ−1(v)
- As such, the
encoder 100 can send one or more parameters 112 that describe the shape of the Λ(I) curve to the decoder 102 at each sub-picture level, picture level, or supra-picture level, so that the decoder 102 can derive the appropriate inverse coding transfer function 238. Since the coding transfer function 204 and thus the Λ(I) function can change throughout the encoding process based on the content of the input video 104, the encoder 100 can save bandwidth by sending a relatively small number of parameters 112 that describe the Λ(I) curve at each sub-picture level, picture level, or supra-picture level, compared to sending the full inverse coding transfer function 238 or a full lookup table showing mappings between all possible converted values at every sub-picture level, picture level, or supra-picture level. - By way of a first non-limiting example, the shape of the Λ(I) curve can be expressed through a piecewise log-linear function such as a variant of Weber's Law or Stevens' Power Law, as shown above. As such, in some embodiments the
encoder 100 can send two parameters 112 to the decoder 102 at each sub-picture level, picture level, or supra-picture level: a normalized brightness value IN and a maximum contrast value C. From these two parameters 112, the decoder 102 can find Λ(I) using a predetermined piecewise log-linear function, and thus derive the appropriate inverse coding transfer function 238 to use when decoding values at that sub-picture level, picture level, or supra-picture level. - By way of a second non-limiting example, the shape of the Λ(I) curve can be expressed through a second order log-polynomial, a polynomial in a logarithmic domain. In these embodiments,
parameters 112 describing the second order log-polynomial can be sent from the encoder 100 to the decoder 102 for each sub-picture level, picture level, or supra-picture level, such that the decoder 102 can find Λ(I) from the parameters 112 and derive the appropriate inverse coding transfer function 238 for the coding level. By way of a non-limiting example, a second order log-polynomial with three parameters a, b, and c can be given by:
log(Λ(I)) = a·(log(I))² + b·log(I) + c
- In this example, the
encoder 100 can send values of the parameters a, b, and c to the decoder 102. The decoder 102 can use the received parameters in the predefined formula to find Λ(I) from the parameters 112 and from it derive a corresponding inverse coding transfer function 238.
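- A minimal sketch of that decoder-side evaluation: given the three received parameters, Λ(I) follows directly from the predefined formula. The logarithm base is not specified in the text, so the natural logarithm below is an assumption, as are the sample parameter values.

```python
import numpy as np

def lam_from_log_poly(I: np.ndarray, a: float, b: float, c: float) -> np.ndarray:
    """Evaluate Lambda(I) from log(Lambda(I)) = a*log(I)^2 + b*log(I) + c."""
    log_I = np.log(I)
    return np.exp(a * log_I ** 2 + b * log_I + c)

# hypothetical parameters received for one picture
lam = lam_from_log_poly(np.logspace(-4, 0, 5), a=0.02, b=-0.5, c=-6.0)
print(lam)
```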
- In other embodiments, the encoder 100 can directly send one or more parameters 112 that describe a particular coding transfer function 204 or other perceptual mapping operation 110, and/or a particular inverse coding transfer function 238 or other reverse perceptual mapping operation 114. In some embodiments, the coding transfer function 204 or other perceptual mapping operation 110 can be a perceptual quantizer (PQ) transfer function. By way of a non-limiting example, in some embodiments the PQ transfer function can be a function that operates on Luminance values, L, with the function defined as:
- ψ(L) = ((c1 + c2·L^m1)/(1 + c3·L^m1))^m2
- In this example,
parameters 112 that can be sent from the encoder 100 to the decoder 102 at each sub-picture level, picture level, or supra-picture level include one or more of: m1, m2, c1, c2, c3. For instance, in one non-limiting exemplary implementation, the values of the parameters 112 can be as follows:
- m1 = 2610/16384 = 0.1593017578125; m2 = (2523/4096)·128 = 78.84375; c1 = c3 − c2 + 1 = 3424/4096 = 0.8359375; c2 = (2413/4096)·32 = 18.8515625; c3 = (2392/4096)·32 = 18.6875
- In some embodiments or situations, the values of one or more of these
parameters 112 can be predetermined, such that they are known to both the encoder 100 and decoder 102. As such, the encoder 100 can send less than all of the parameters 112 to the decoder 102 to adjust the PQ curve. By way of a non-limiting example, all the parameters 112 except for m2 can be preset, such that the encoder 100 only sends the value of m2 it used at each coding level to the decoder 102. - As shown in
FIG. 7 , tuning the value of m2 can adjust the PQ curve for different luminance values. When m2 is set to be smaller than the 78.84375 value indicated above, such as when m2 is set to 62, the PQ values can be increased throughout some or all of the curve. In contrast, when m2 is set to be larger than the 78.84375 value indicated above, such as when m2 is set to 160, the PQ values can be decreased throughout some or all of the curve. - As shown in
FIG. 8 , when different m2 values are used in the PQ transfer function and its relative quantization step size Λ(I) is found, different values of m2 can result in lower relative quantization step sizes Λ(I) at different luminance values. By way of a non-limiting example, setting m2 to 62 can provide lower relative quantization step sizes Λ(I) at the low range of luminance values, thereby allocating more bits to those values when they are encoded. Similarly, setting m2 to 160 can provide lower relative quantization step sizes Λ(I) at the high range of luminance values, thereby allocating more bits to those values when they are encoded. As such, bits can be allocated differently and/or flexibly for different parts of the input video 104 depending on its content.
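- The effect described for FIGS. 7 and 8 can be reproduced numerically with the PQ formula above. The constants below are the usual SMPTE ST 2084 values, consistent with the 78.84375 default cited for m2; treating them as this document's exact values is an assumption.

```python
import numpy as np

M1, C1, C2, C3 = 0.1593017578125, 0.8359375, 18.8515625, 18.6875
M2_DEFAULT = 78.84375

def pq(L: np.ndarray, m2: float = M2_DEFAULT) -> np.ndarray:
    """PQ transfer function psi(L) for normalized luminance L in [0, 1]."""
    base = (C1 + C2 * L ** M1) / (1.0 + C3 * L ** M1)
    return base ** m2

L = np.array([0.001, 0.01, 0.1, 0.5])
print(pq(L, 62.0) > pq(L))   # all True: a smaller m2 raises the curve
print(pq(L, 160.0) < pq(L))  # all True: a larger m2 lowers the curve
```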
- While the example above showed the effects of changing the m2 parameter 112, such that the encoder 100 can inform the decoder 102 how to derive its inverse coding transfer function 238 by sending just the m2 parameter 112 at each coding level, in other embodiments or situations the encoder 100 can additionally or alternately adjust any or all of the m1, c1, c2, and c3 parameters 112 during encoding to flexibly adjust the bit allocations based on the content of the input video 104. In such embodiments or situations, the encoder 100 can send the adjusted parameters 112 to the decoder 102. In some embodiments, the encoder 100 can use a predefined mapping function or lookup table to determine the value of m2 or any other parameter 112 based on a distribution of pixel values. By way of a non-limiting example, the encoder 100 can find a value for m2 based on an average intensity of pixel values.
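- As a hedged sketch of such a predefined mapping (the anchor points below are invented for illustration and are not from this document), the encoder might interpolate m2 from the mean normalized intensity of a coding unit:

```python
import numpy as np

# hypothetical anchors: darker content -> smaller m2 (more precision in darks)
MEAN_ANCHORS = np.array([0.0, 0.25, 0.5, 1.0])
M2_ANCHORS = np.array([62.0, 78.84375, 110.0, 160.0])

def choose_m2(pixels: np.ndarray) -> float:
    """Pick an m2 value from the average normalized intensity of a block."""
    return float(np.interp(pixels.mean(), MEAN_ANCHORS, M2_ANCHORS))

m2 = choose_m2(np.random.default_rng(0).random((64, 64)))
print(m2)
```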
- As described above, the perceptual mapping operation 110 and reverse perceptual mapping operation 114 can change between different areas of the same picture, between pictures, or between sequences of pictures. In embodiments, encoding and decoding sub-portions of pictures and/or full pictures can depend on interrelated coding dependencies between the pictures, such as the relationships between I pictures and P or B pictures. As such, the encoder 100 can transmit one or more parameters 112 to the decoder 102 related to a perceptual mapping operation 110 at any desired coding level, such as a sub-picture level related to the coding of a sub-portion of a picture, a picture level related to coding a single picture, or a supra-picture level related to coding a sequence of pictures. The decoder 102 can use the received parameters 112 to derive an appropriate reverse perceptual mapping operation 114 for each sub-portion of a picture, single picture, or sequence of pictures. - In some embodiments or situations, the
encoder 100 can send parameters 112 to the decoder 102 on a supra-picture level. In these embodiments or situations, the reverse perceptual mapping operation 114 described by the parameters 112 can be applicable to all the pictures in a given sequence, such as a GOP. By way of a non-limiting example, the encoder 100 can statistically analyze the input values of all the pictures in a GOP, and use a coding transfer function 204 adapted to the range of values actually found within the pictures of that GOP. The encoder 100 can then send parameters 112 to the decoder 102 from which the decoder 102 can derive a corresponding inverse coding transfer function 238. - In some embodiments, the
encoder 100 can send the parameters 112 to the decoder 102 on a supra-picture level using a supplemental enhancement information (SEI) message. In other embodiments, the encoder 100 can send the parameters 112 to the decoder 102 on a supra-picture level using video usability information (VUI) or other information within a Sequence Parameter Set (SPS) associated with the GOP. In some embodiments, the decoder 102 can use the most recently received parameters 112 until new parameters 112 are received, at which point it can derive a new reverse perceptual mapping operation 114 from the newly received parameters 112. By way of a non-limiting example, parameters 112 can initially be set in an SPS, and then be updated on a per-GOP basis as the characteristics of the input video 104 change.
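- A minimal sketch of this persistence rule (class and field names are placeholders, and only a single parameter is carried for brevity): the decoder keeps the most recently received parameters and rebuilds its reverse mapping only when an SPS or SEI update arrives.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ReshapeParams:
    m2: float  # single signaled parameter in this sketch

class DecoderState:
    """Holds the parameters currently in force for reverse mapping."""

    def __init__(self) -> None:
        self.params: Optional[ReshapeParams] = None

    def on_parameters(self, params: ReshapeParams) -> None:
        # called when an SPS or SEI message carrying new parameters is parsed;
        # a new reverse perceptual mapping would be derived from these
        self.params = params

    def current_params(self) -> ReshapeParams:
        # the most recently received parameters apply until updated
        assert self.params is not None, "no parameters signaled yet"
        return self.params
```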
- In some embodiments or situations, the encoder 100 can send parameters 112 to the decoder 102 on a picture level. In these embodiments or situations, the reverse perceptual mapping operation 114 described by the parameters 112 can be applicable to full pictures. In some embodiments, the encoder 100 can send the parameters 112 to the decoder 102 on a picture level within a Picture Parameter Set (PPS) associated with a picture. - In some embodiments, such as when the pictures are P or B pictures that were encoded with reference to one or more reference pictures, the
decoder 102 can receive and maintain parameters 112 for the reference pictures, as well as parameters 112 specific to individual temporally encoded pictures. As such, when the decoder 102 previously generated a reference picture with a reverse perceptual mapping operation 114 using a first set of parameters 112, and the decoder 102 receives a different set of parameters 112 for decoding a P or B picture encoded with reference to the reference picture, the decoder 102 can first reverse the previous reverse perceptual mapping operation 114 on the reference picture using the parameters 112 received for the reference picture. The decoder 102 can then perform a new reverse perceptual mapping operation 114 on the reference picture using the new set of parameters 112 received for the current picture, to re-map the reference picture according to the current picture's parameters 112. The decoder 102 can use the re-mapped reference picture when decoding the current picture. In some embodiments, the decoder 102 can re-map reference pictures according to new parameters 112 associated with a current picture if the new parameters 112 differ from old parameters 112 associated with the reference picture. In alternate embodiments, the decoder 102 can re-map reference pictures as described above if re-mapping is indicated in a flag or parameter received from the encoder 100. - In some embodiments or situations, the
encoder 100 can send parameters 112 to the decoder 102 on a sub-picture level. In these embodiments or situations, the reverse perceptual mapping operation 114 described by the parameters 112 can be applicable to sub-pictures within a picture, such as processing windows, slices, macroblocks, or CTUs. - In some embodiments, the
decoder 102 can receive and maintain parameters 112 for a current sub-picture and all reference pictures or sub-pictures, such as pixel blocks of size 4×4 or 8×8. As such, when decoding a sub-picture that was coded with reference to one or more reference pictures, the decoder 102 can first reverse previous reverse perceptual mapping operations 114 performed on reference pixels using parameters 112 previously received for the reference pixels. The decoder 102 can then apply a new reverse perceptual mapping operation 114 on the reference pixels using new parameters 112 associated with the current sub-picture, to re-map the reference pixels according to the current sub-picture's parameters 112. The decoder 102 can use the re-mapped reference pixels when decoding the current sub-picture. In some embodiments, the decoder 102 can re-map reference pixels according to new parameters 112 associated with a current sub-picture if the new parameters 112 differ from old parameters 112 associated with the reference pixels. In alternate embodiments, the decoder 102 can re-map reference pixels as described above if re-mapping is indicated in a flag or parameter received from the encoder 100.
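- The re-mapping pattern described for temporally predicted pictures and sub-pictures can be sketched as below. The forward_map and reverse_map functions stand in for a perceptual mapping and its inverse under one parameter set; the simple power curve is purely illustrative, not a mapping defined in this document.

```python
import numpy as np

def forward_map(I: np.ndarray, gamma: float) -> np.ndarray:
    """Stand-in perceptual mapping under one parameter set."""
    return I ** gamma

def reverse_map(v: np.ndarray, gamma: float) -> np.ndarray:
    """Stand-in reverse perceptual mapping (inverse of forward_map)."""
    return v ** (1.0 / gamma)

def remap_reference(ref: np.ndarray, old_gamma: float, new_gamma: float) -> np.ndarray:
    """Re-map reference pixels decoded under old parameters to current ones.

    First undo the earlier reverse mapping (apply the old forward mapping),
    then apply the reverse mapping derived from the current parameters.
    """
    if old_gamma == new_gamma:
        return ref  # re-mapping is only needed when the parameters differ
    return reverse_map(forward_map(ref, old_gamma), new_gamma)

ref = np.linspace(0.0, 1.0, 5)
print(remap_reference(ref, old_gamma=0.45, new_gamma=0.5))
```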
- While the above description describes encoding and decoding processes in which a perceptual mapping operation 110 and reverse perceptual mapping operation 114 can decrease the likelihood of perceptible distortion and/or noise introduced by uniform quantization of values, in alternate embodiments a perceptual mapping operation 110 can be configured to reduce perceptible distortion and/or noise introduced by variable length quantization. By way of a non-limiting example, some implementations of AVC and HEVC use a predictive, variable length quantization scheme. - In these embodiments, the risk of introducing perceptible distortion and/or noise through variable length quantization can be reduced with a
perceptual mapping operation 110 or quantization step that follows a Rate-Distortion-Optimization (RDO) scheme based on a perceptual distortion measure. By way of a non-limiting example, in the embodiment of FIG. 2 , the encoder 100 can calculate a non-perceptual distortion metric, denoted as Dnon-perceptual. By way of non-limiting examples, the non-perceptual distortion metric can be the mean squared error (MSE) or the error sum of squares (SSE). The encoder 100 can then calculate a perceptual distortion metric, denoted as Dperceptual, from the non-perceptual distortion metric. In some embodiments, the encoder 100 can use a lookup table that maps Dnon-perceptual values to Dperceptual values. In other embodiments, the encoder 100 can use a perceptual distortion function based on characteristics of the human visual system, such as:
Dperceptual = f(Dnon-perceptual)
-
Dperceptual = wperceptual · Dnon-perceptual + bperceptual
- Although the present invention has been described above with particularity, this was merely to teach one of ordinary skill in the art how to make and use the invention. Many additional modifications will fall within the scope of the invention, as that scope is defined by the following claims.
Claims (16)
1-3. (canceled)
4. A method for generating High Dynamic Range (HDR) video data from an encoded video data stream that includes an encoded Low Dynamic Range (LDR) video data set and does not include encoded High Dynamic Range (HDR) video data, the method comprising:
decoding said encoded Low Dynamic Range (LDR) video data set from said encoded video data stream by a non-HDR video decoder;
extracting, by said non-HDR video decoder, a metadata structure signaled for said video data set in said encoded video data stream comprising said encoded Low Dynamic Range (LDR) video data set;
wherein said signaled metadata structure which identifies a reshaping transfer function for video data is signaled in a supplemental enhancement information message of said encoded video data stream and/or video usability information message of said encoded video data stream;
wherein said reshaping transfer function is relevant to the video data set signaled at a picture level in the encoded video data stream;
wherein said reshaping transfer function is based upon a function including f(L)=((c1+c2*L^m1)/(1+c3*L^m1))^m2, where L is a luminance value, and c1, c2, c3, m1, m2 are parameters;
wherein said decoding, by the non-HDR video decoder, the encoded Low Dynamic Range (LDR) video data set produces said decoded Low Dynamic Range (LDR) video data set without reference to said reshaping transfer function;
generating reshaped HDR video data as output data by applying the decoded Low Dynamic Range (LDR) video data set to a regenerated video data reshaping transfer function.
5. The method of claim 4 wherein said reshaping transfer function is included as a look up table.
6. The method of claim 4 wherein said encoded video data stream is HEVC compliant.
7. The method of claim 4 wherein said encoded video data stream is AVC compliant.
8. The method of claim 4 wherein said supplemental enhancement information message is included in a NAL unit of said encoded video data stream.
9. The method of claim 4 wherein said video usability information message is included in a sequence parameter set of said encoded video data stream.
10. The method of claim 4 wherein said reshaping transfer function is a perceptual quantizer function.
11. The method of claim 4 wherein said m1 and m2 are positive integer values.
12. The method of claim 4 wherein said m2 parameter is included within said metadata structure signaled for said video data set.
13. The method of claim 4 wherein said m2 parameter is included within said metadata structure signaled for said video data set and is adjusted based upon different luminance value ranges within a coding level of said video data stream.
14. The method of claim 4 wherein said reshaped HDR video data has a bit depth greater than 10.
15. The method of claim 4 wherein said video data is signaled in said supplemental enhancement information message of said encoded video data stream.
16. The method of claim 4 wherein said signaled metadata structure which identifies said reshaping transfer function continues to be used for said video data stream until a different signaled metadata structure is received which identifies a modified reshaping transfer function based upon modified values of at least one of L, c1, c2, c3, m1, m2.
17. The method of claim 4 wherein said encoded Low Dynamic Range (LDR) video data set has a first color volume and said reshaped HDR video data has a second color volume.
18. The method of claim 4 wherein said m1 is a contrast parameter.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/747,337 US20220279195A1 (en) | 2015-04-21 | 2022-05-18 | Adaptive perceptual mapping and signaling for video coding |
US18/235,700 US12096014B2 (en) | 2015-04-21 | 2023-08-18 | Adaptive perceptual mapping and signaling for video coding |
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201562150457P | 2015-04-21 | 2015-04-21 | |
US15/135,497 US10735755B2 (en) | 2015-04-21 | 2016-04-21 | Adaptive perceptual mapping and signaling for video coding |
US15/394,721 US10735756B2 (en) | 2015-04-21 | 2016-12-29 | Adaptive perceptual mapping and signaling for video coding |
US16/909,698 US11368701B2 (en) | 2015-04-21 | 2020-06-23 | Adaptive perceptual mapping and signaling for video coding |
US17/747,337 US20220279195A1 (en) | 2015-04-21 | 2022-05-18 | Adaptive perceptual mapping and signaling for video coding |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/909,698 Continuation US11368701B2 (en) | 2015-04-21 | 2020-06-23 | Adaptive perceptual mapping and signaling for video coding |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/235,700 Continuation US12096014B2 (en) | 2015-04-21 | 2023-08-18 | Adaptive perceptual mapping and signaling for video coding |
Publications (1)
Publication Number | Publication Date |
---|---|
US20220279195A1 true US20220279195A1 (en) | 2022-09-01 |
Family
ID=55910397
Family Applications (5)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/135,497 Active 2036-06-29 US10735755B2 (en) | 2015-04-21 | 2016-04-21 | Adaptive perceptual mapping and signaling for video coding |
US15/394,721 Expired - Fee Related US10735756B2 (en) | 2015-04-21 | 2016-12-29 | Adaptive perceptual mapping and signaling for video coding |
US16/909,698 Active US11368701B2 (en) | 2015-04-21 | 2020-06-23 | Adaptive perceptual mapping and signaling for video coding |
US17/747,337 Abandoned US20220279195A1 (en) | 2015-04-21 | 2022-05-18 | Adaptive perceptual mapping and signaling for video coding |
US18/235,700 Active US12096014B2 (en) | 2015-04-21 | 2023-08-18 | Adaptive perceptual mapping and signaling for video coding |
Family Applications Before (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/135,497 Active 2036-06-29 US10735755B2 (en) | 2015-04-21 | 2016-04-21 | Adaptive perceptual mapping and signaling for video coding |
US15/394,721 Expired - Fee Related US10735756B2 (en) | 2015-04-21 | 2016-12-29 | Adaptive perceptual mapping and signaling for video coding |
US16/909,698 Active US11368701B2 (en) | 2015-04-21 | 2020-06-23 | Adaptive perceptual mapping and signaling for video coding |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/235,700 Active US12096014B2 (en) | 2015-04-21 | 2023-08-18 | Adaptive perceptual mapping and signaling for video coding |
Country Status (2)
Country | Link |
---|---|
US (5) | US10735755B2 (en) |
WO (1) | WO2016172394A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220256157A1 (en) * | 2019-10-18 | 2022-08-11 | Huawei Technologies Co., Ltd. | Method and apparatus for processing image signal conversion, and terminal device |
Families Citing this family (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10735755B2 (en) * | 2015-04-21 | 2020-08-04 | Arris Enterprises Llc | Adaptive perceptual mapping and signaling for video coding |
US10284863B2 (en) * | 2015-06-08 | 2019-05-07 | Qualcomm Incorporated | Adaptive constant-luminance approach for high dynamic range and wide color gamut video coding |
US10397585B2 (en) * | 2015-06-08 | 2019-08-27 | Qualcomm Incorporated | Processing high dynamic range and wide color gamut video data for video coding |
US10244245B2 (en) * | 2015-06-08 | 2019-03-26 | Qualcomm Incorporated | Content-adaptive application of fixed transfer function to high dynamic range (HDR) and/or wide color gamut (WCG) video data |
JP6532962B2 (en) * | 2015-06-09 | 2019-06-19 | 華為技術有限公司Huawei Technologies Co.,Ltd. | Image encoding method, image decoding method, encoding device, and decoding device |
US10116938B2 (en) | 2015-07-22 | 2018-10-30 | Arris Enterprises Llc | System for coding high dynamic range and wide color gamut sequences |
AU2015227469A1 (en) * | 2015-09-17 | 2017-04-06 | Canon Kabushiki Kaisha | Method, apparatus and system for displaying video data |
WO2017053852A1 (en) | 2015-09-23 | 2017-03-30 | Arris Enterprises Llc | System for reshaping and coding high dynamic range and wide color gamut sequences |
WO2017053860A1 (en) | 2015-09-23 | 2017-03-30 | Arris Enterprises Llc | High dynamic range adaptation operations at a video decoder |
CN107027027B (en) * | 2016-01-31 | 2021-02-12 | 西安电子科技大学 | Image encoding method, image decoding method, image encoding device, image decoding device, and image encoding/decoding system |
JP6822121B2 (en) * | 2016-12-19 | 2021-01-27 | ソニー株式会社 | Image processing equipment, image processing methods and programs |
US11100888B2 (en) * | 2017-06-28 | 2021-08-24 | The University Of British Columbia | Methods and apparatuses for tone mapping and inverse tone mapping |
EP3425911A1 (en) | 2017-07-06 | 2019-01-09 | Thomson Licensing | A method and a device for picture encoding and decoding |
CN108337516B (en) * | 2018-01-31 | 2022-01-18 | 宁波大学 | Multi-user-oriented HDR video dynamic range scalable coding method |
EP3742432A1 (en) * | 2019-05-24 | 2020-11-25 | InterDigital CE Patent Holdings | Device and method for transition between luminance levels |
US11335033B2 (en) * | 2020-09-25 | 2022-05-17 | Adobe Inc. | Compressing digital images utilizing deep learning-based perceptual similarity |
US12100185B2 (en) * | 2021-06-18 | 2024-09-24 | Tencent America LLC | Non-linear quantization with substitution in neural image compression |
US11985346B2 (en) * | 2022-09-20 | 2024-05-14 | Qualcomm Incorporated | Encoding high dynamic range video data |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020094127A1 (en) * | 2001-01-16 | 2002-07-18 | Mitchell Joan L. | Enhanced compression of documents |
US20030152165A1 (en) * | 2001-01-25 | 2003-08-14 | Tetsujiro Kondo | Data processing apparatus |
US20110280302A1 (en) * | 2010-05-14 | 2011-11-17 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding video signal and method and apparatus for decoding video signal |
US20120314026A1 (en) * | 2011-06-09 | 2012-12-13 | Qualcomm Incorporated | Internal bit depth increase in video coding |
US11368701B2 (en) * | 2015-04-21 | 2022-06-21 | Arris Enterprises Llc | Adaptive perceptual mapping and signaling for video coding |
Family Cites Families (35)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2001086820A1 (en) * | 2000-05-09 | 2001-11-15 | Sony Corporation | Data processing device and data processing method, and recorded medium |
US7068850B2 (en) * | 2001-06-29 | 2006-06-27 | Equator Technologies, Inc. | Decoding of predicted DC coefficient without division |
US8054886B2 (en) | 2007-02-21 | 2011-11-08 | Microsoft Corporation | Signaling and use of chroma sample positioning information |
KR101882024B1 (en) * | 2009-12-14 | 2018-07-25 | 톰슨 라이센싱 | Object-aware video encoding strategies |
KR102333225B1 (en) * | 2010-04-13 | 2021-12-02 | 지이 비디오 컴프레션, 엘엘씨 | Inheritance in sample array multitree subdivision |
US8800717B2 (en) * | 2010-07-28 | 2014-08-12 | General Tree Corporation | Mobile scaffolding units with extendible gantry platform and methods of using same |
EP2445214A1 (en) * | 2010-10-19 | 2012-04-25 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Video coding using temporally coherent dynamic range mapping |
US10091513B2 (en) * | 2011-09-29 | 2018-10-02 | Texas Instruments Incorporated | Perceptual three-dimensional (3D) video coding based on depth information |
KR101865543B1 (en) * | 2011-12-06 | 2018-06-11 | 돌비 레버러토리즈 라이쎈싱 코오포레이션 | Device of improving the perceptual luminance nonlinearity-based image data exchange across different display capabilities |
US8934020B2 (en) | 2011-12-22 | 2015-01-13 | Pelco, Inc. | Integrated video quantization |
CN104509117A (en) * | 2012-08-24 | 2015-04-08 | I3研究所股份有限公司 | Receiving device, transmission device, and image transmission method |
RU2647636C2 (en) | 2013-02-21 | 2018-03-16 | Долби Лабораторис Лайсэнзин Корпорейшн | Video display control with extended dynamic range |
BR112015024172B1 (en) * | 2013-03-26 | 2023-01-31 | Dolby Laboratories Licensing Corporation | METHOD, SYSTEM AND COMPUTER READABLE STORAGE MEDIA |
KR101759954B1 (en) * | 2013-09-06 | 2017-07-21 | 엘지전자 주식회사 | Method and apparatus for transmitting and receiving ultra-high definition broadcasting signal for high dynamic range representation in digital broadcasting system |
US9936213B2 (en) * | 2013-09-19 | 2018-04-03 | Entropic Communications, Llc | Parallel decode of a progressive JPEG bitstream |
WO2015072754A1 (en) * | 2013-11-13 | 2015-05-21 | 엘지전자 주식회사 | Broadcast signal transmission method and apparatus for providing hdr broadcast service |
US20160366449A1 (en) * | 2014-02-21 | 2016-12-15 | Koninklijke Philips N.V. | High definition and high dynamic range capable video decoder |
KR102654563B1 (en) * | 2014-02-25 | 2024-04-05 | 애플 인크. | Adaptive transfer function for video encoding and decoding |
WO2015174026A1 (en) * | 2014-05-16 | 2015-11-19 | パナソニックIpマネジメント株式会社 | Conversion method and conversion device |
JP6571073B2 (en) * | 2014-05-20 | 2019-09-04 | エルジー エレクトロニクス インコーポレイティド | Video data processing method and apparatus for display adaptive video playback |
WO2015194102A1 (en) * | 2014-06-20 | 2015-12-23 | パナソニックIpマネジメント株式会社 | Playback method and playback apparatus |
CN105493490B (en) * | 2014-06-23 | 2019-11-29 | 松下知识产权经营株式会社 | Transform method and converting means |
EP3163894B1 (en) * | 2014-06-27 | 2020-08-19 | Panasonic Intellectual Property Management Co., Ltd. | Data output device, data output method, and data generation method |
US10582269B2 (en) * | 2014-07-11 | 2020-03-03 | Lg Electronics Inc. | Method and device for transmitting and receiving broadcast signal |
WO2016017961A1 (en) * | 2014-07-29 | 2016-02-04 | 엘지전자 주식회사 | Method and device for transmitting and receiving broadcast signal |
JP2016538745A (en) * | 2014-08-08 | 2016-12-08 | エルジー エレクトロニクス インコーポレイティド | Video data processing method and apparatus for display adaptive video playback |
EP3185572B1 (en) * | 2014-08-19 | 2023-03-08 | Panasonic Intellectual Property Management Co., Ltd. | Transmission method, reproduction method and reproduction device |
US10136133B2 (en) * | 2014-11-11 | 2018-11-20 | Dolby Laboratories Licensing Corporation | Rate control adaptation for high-dynamic range images |
JP6601729B2 (en) * | 2014-12-03 | 2019-11-06 | パナソニックIpマネジメント株式会社 | Data generation method, data reproduction method, data generation device, and data reproduction device |
WO2016089093A1 (en) * | 2014-12-04 | 2016-06-09 | 엘지전자 주식회사 | Broadcasting signal transmission and reception method and device |
WO2016129891A1 (en) * | 2015-02-11 | 2016-08-18 | 엘지전자 주식회사 | Method and device for transmitting and receiving broadcast signal |
JP6484347B2 (en) * | 2015-03-02 | 2019-03-13 | ドルビー ラボラトリーズ ライセンシング コーポレイション | Content adaptive perceptual quantizer for high dynamic range images |
US10848795B2 (en) * | 2015-05-12 | 2020-11-24 | Lg Electronics Inc. | Apparatus for transmitting broadcast signal, apparatus for receiving broadcast signal, method for transmitting broadcast signal and method for receiving broadcast signal |
CN105472205B (en) * | 2015-11-18 | 2020-01-24 | 腾讯科技(深圳)有限公司 | Real-time video noise reduction method and device in encoding process |
EP3762895A4 (en) * | 2018-04-02 | 2021-04-21 | Huawei Technologies Co., Ltd. | Video coding with successive codecs |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020094127A1 (en) * | 2001-01-16 | 2002-07-18 | Mitchell Joan L. | Enhanced compression of documents |
US20030152165A1 (en) * | 2001-01-25 | 2003-08-14 | Tetsujiro Kondo | Data processing apparatus |
US20110280302A1 (en) * | 2010-05-14 | 2011-11-17 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding video signal and method and apparatus for decoding video signal |
US20120314026A1 (en) * | 2011-06-09 | 2012-12-13 | Qualcomm Incorporated | Internal bit depth increase in video coding |
US11368701B2 (en) * | 2015-04-21 | 2022-06-21 | Arris Enterprises Llc | Adaptive perceptual mapping and signaling for video coding |
Also Published As
Publication number | Publication date |
---|---|
US20200322622A1 (en) | 2020-10-08 |
US11368701B2 (en) | 2022-06-21 |
US10735755B2 (en) | 2020-08-04 |
US20160316207A1 (en) | 2016-10-27 |
US20170163983A1 (en) | 2017-06-08 |
US20240048739A1 (en) | 2024-02-08 |
US12096014B2 (en) | 2024-09-17 |
US10735756B2 (en) | 2020-08-04 |
WO2016172394A1 (en) | 2016-10-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US12096014B2 (en) | Adaptive perceptual mapping and signaling for video coding | |
US11375193B2 (en) | System for coding high dynamic range and wide color gamut sequences | |
US12028523B2 (en) | System and method for reshaping and adaptation of high dynamic range video data | |
EP3308541B1 (en) | System for coding high dynamic range and wide color gamut sequences | |
EP3272120B1 (en) | Adaptive perceptual mapping and signaling for video coding | |
US20240323382A1 (en) | System for coding high dynamic range and wide color gamut sequences |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO PAY ISSUE FEE |