WO2024213421A1 - Method and device for energy reduction of visual content based on attenuation map using MPEG display adaptation
- Publication number: WO2024213421A1 (PCT/EP2024/058768)
- Authority: WIPO (PCT)
Classifications
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
- H04N19/117—Filters, e.g. for pre-processing or post-processing
- H04N19/186—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
Definitions
- The disclosure is in the field of video compression. At least one embodiment relates more specifically to encoding and decoding a video comprising attenuation map information and corresponding parameters carried through MPEG Green Metadata display adaptation syntax elements, the application of the attenuation map allowing a reduction of energy consumption when using the video, for example when rendering it on a display.
- Although modern displays, such as Organic Light Emitting Diode (OLED) displays and Thin-Film Transistor Liquid Crystal Displays (TFT-LCDs), consume energy in a more controllable and efficient manner than older displays, they remain the most important source of energy consumption in a video chain.
- ISO/IEC 23001-11 specifies dedicated metadata, so-called Green Metadata, that enable the reduction of energy usage during media consumption, specifically at the display side.
- The metadata for display adaptation as defined in this specification are designed for a specific display and are particularly well tailored to transmissive display technologies embedding backlight illumination, such as LCD displays.
- These metadata are designed to attain display energy reductions by using display adaptation techniques. They are composed of metrics made of RGB-component statistics and quality indicators of the video content. They can be used to perform RGB picture component rescaling to set the best compromise between backlight/voltage reduction and picture quality. Since the ISO/IEC 23001-11 document was published, new emissive technologies have been introduced with the spread of emissive OLED displays, which allow pixel-wise, more efficient control of their energy consumption and, consequently, a reduction of it.
- Although the metadata already defined in the standard convey information for reducing the energy consumed by displays, they also have the following drawbacks.
- At least one example of an embodiment involves new metadata associated with a visual content and related to the use of an attenuation map dedicated to reducing the energy consumption when using the visual content, for example when rendering it on a display.
- Information is provided about the types of displays compatible with the use of the attenuation map, the type of pre-processing (e.g., up-sampling), the type of operation to use for the application of the attenuation map, and metrics of the expected energy reduction and of the expected quality impact of the use of such an attenuation map. These parameters are carried through MPEG Green Metadata display adaptation syntax elements.
- The attenuation map for one image of the video may be carried as an auxiliary image of type AUX_ALPHA (or of a specific type AUX_ATTENUATION) and encoded conventionally.
- The attenuation map is a pixel-wise attenuation map. Encoding and decoding methods and devices are described.
- a first aspect is directed to a method comprising obtaining encoded data comprising at least an image, an attenuation map and a set of parameters, wherein the set of parameters comprises at least a first parameter representative of an operation for applying the attenuation map to an image, and a second parameter representative of a mapping between components of the attenuation map and image components affected by the operation, applying the attenuation map to the image to reduce values of components of the image by performing an operation based on the first parameter on components of the image selected based on the second parameter; and providing an attenuated image, wherein the parameters are carried through MPEG green metadata display adaptation syntax elements.
- a second aspect is directed to a method comprising obtaining an input image of a video, determining an attenuation map based on the input image according to a selected energy reduction rate, wherein applying the attenuation map to the input image reduces values of components of the input image, generating an encoded video comprising at least the input image, the attenuation map and a set of parameters, wherein the set of parameters comprises at least a first parameter representative of an operation for applying the attenuation map to an image, and a second parameter representative of a mapping between components of the attenuation map and image components affected by the operation, wherein the parameters are carried through MPEG green metadata display adaptation syntax elements.
- a third aspect is directed to a device comprising a processor configured to obtain encoded data comprising at least an image, an attenuation map and a set of parameters, wherein the set of parameters comprises at least a first parameter representative of an operation for applying the attenuation map to an image, and a second parameter representative of a mapping between components of the attenuation map and image components affected by the operation, apply the attenuation map to the image to reduce values of components of the image by performing an operation based on the first parameter on components of the image selected based on the second parameter; and provide an attenuated image, wherein the parameters are carried through MPEG green metadata display adaptation syntax elements.
- a fourth aspect is directed to a device comprising a processor configured to obtain an input image of a video, determine an attenuation map based on the input image according to a selected energy reduction rate, wherein applying the attenuation map to the input image reduces values of components of the input image, generate an encoded video comprising at least the input image, the attenuation map and a set of parameters, wherein the set of parameters comprises at least a first parameter representative of an operation for applying the attenuation map to an image, and a second parameter representative of a mapping between components of the attenuation map and image components affected by the operation, wherein the parameters are carried through MPEG green metadata display adaptation syntax elements.
- a fifth aspect is directed to a non-transitory computer readable medium containing data content generated according to the second aspect.
- A sixth aspect is directed to a non-transitory computer readable medium comprising instructions which, when executed by a computer, cause the computer to carry out the described embodiments related to the first and second aspects.
- a seventh aspect is directed to a computer program comprising instructions which, when the program is executed by a computer, cause the computer to carry out any of the described embodiments or variants related to the first and second aspect.
- Figure 1 illustrates a block diagram of a video encoder according to an embodiment.
- Figure 2 illustrates a block diagram of a video decoder according to an embodiment.
- Figure 3 illustrates a block diagram of an example of a system in which various aspects and embodiments are implemented.
- Figure 4 illustrates flowcharts of two examples of video encoding processes using attenuation map information carried by an auxiliary picture of type AUX_ALPHA according to at least one embodiment.
- Figure 5 illustrates a flowchart of an example of a video decoding process using attenuation map information carried by an auxiliary picture of type AUX_ALPHA according to at least one embodiment.
- Figure 6 illustrates flowcharts of two examples of video encoding processes using attenuation map information carried by an auxiliary picture of type AUX_ATTENUATION according to at least one embodiment.
- Figure 7 illustrates a flowchart of an example of a video decoding process using attenuation map information carried by an auxiliary picture of type AUX_ATTENUATION according to at least one embodiment.
- Figure 8 illustrates a flowchart of an example of a video decoding process using attenuation map information based on multiple components carried by multiple separate auxiliary pictures of type AUX_ALPHA according to at least one embodiment.
- Figure 9 illustrates examples of sequence diagrams representing the information exchange between a transmitter and a receiver according to further embodiments.
- VVC: Versatile Video Coding
- HEVC: High Efficiency Video Coding
- Figure 1 illustrates a block diagram of a video encoder according to an embodiment. Variations of this encoder 100 are contemplated, but the encoder 100 is described below for purposes of clarity without describing all expected variations.
- the video sequence may go through pre-encoding processing (101), for example, applying a color transform to the input color picture (e.g., conversion from RGB 4:4:4 to YCbCr 4:2:0), or performing a remapping of the input picture components in order to get a signal distribution more resilient to compression (for instance using a histogram equalization of one of the color components).
- Metadata can be associated with the pre-processing and attached to the bitstream.
- a picture is encoded by the encoder elements as described below.
- the picture to be encoded is partitioned (102) and processed in units of, for example, CUs (Coding Units).
- Each unit is encoded using, for example, either an intra or inter mode.
- When a unit is encoded in intra mode, it performs intra prediction (160). In inter mode, motion estimation (175) and compensation (170) are performed.
- the encoder decides (105) which one of the intra mode or inter mode to use for encoding the unit, and indicates the intra/inter decision by, for example, a prediction mode flag.
- Prediction residuals are calculated, for example, by subtracting (110) the predicted block from the original image block.
- the prediction residuals are then transformed (125) and quantized (130).
- the quantized transform coefficients, as well as motion vectors and other syntax elements, are entropy coded (145) to output a bitstream.
- the encoder can skip the transform and apply quantization directly to the non-transformed residual signal.
- the encoder can bypass both transform and quantization, i.e., the residual is coded directly without the application of the transform or quantization processes.
- the encoder decodes an encoded block to provide a reference for further predictions.
- the quantized transform coefficients are de-quantized (140) and inverse transformed (150) to decode prediction residuals.
- In-loop filters (165) are applied to the reconstructed picture to perform, for example, deblocking/SAO (Sample Adaptive Offset), Adaptive Loop-Filter (ALF) filtering to reduce encoding artifacts.
- the filtered image is stored at a reference picture buffer (180).
- Figure 2 illustrates a block diagram of a video decoder according to an embodiment. In the decoder 200, a bitstream is decoded by the decoder elements as described below.
- Video decoder 200 generally performs a decoding pass reciprocal to the encoding pass.
- the encoder 100 also generally performs video decoding as part of encoding video data.
- the input of the decoder includes a video bitstream, which can be generated by video encoder 100.
- the bitstream is first entropy decoded (230) to obtain transform coefficients, motion vectors, and other coded information.
- the picture partition information indicates how the picture is partitioned.
- the decoder may therefore divide (235) the picture according to the decoded picture partitioning information.
- the transform coefficients are de-quantized (240) and inverse transformed (250) to decode the prediction residuals. Combining (255) the decoded prediction residuals and the predicted block, an image block is reconstructed.
- the predicted block can be obtained (270) from intra prediction (260) or motion-compensated prediction (i.e., inter prediction) (275).
- In-loop filters (265) are applied to the reconstructed image.
- the filtered image is stored at a reference picture buffer (280).
- the decoded picture can further go through post-decoding processing (285), for example, an inverse color transform (e.g., conversion from YCbCr 4:2:0 to RGB 4:4:4) or an inverse remapping performing the inverse of the remapping process performed in the pre-encoding processing (101).
- post-decoding processing can use metadata derived in the pre-encoding processing and signaled in the bitstream.
- Figure 3 illustrates a block diagram of an example of a system in which various aspects and embodiments are implemented.
- System 1000 can be embodied as a device including the various components described below and is configured to perform one or more of the aspects described in this document. Examples of such devices include, but are not limited to, various electronic devices such as personal computers, laptop computers, smartphones, tablet computers, digital multimedia set top boxes, digital television receivers, personal video recording systems, connected home appliances, and servers.
- Elements of system 1000, singly or in combination can be embodied in a single integrated circuit (IC), multiple ICs, and/or discrete components.
- the processing and encoder/decoder elements of system 1000 are distributed across multiple ICs and/or discrete components.
- system 1000 is communicatively coupled to one or more other systems, or other electronic devices, via, for example, a communications bus or through dedicated input and/or output ports.
- system 1000 is configured to implement one or more of the aspects described in this document.
- the system 1000 includes at least one processor 1010 configured to execute instructions loaded therein for implementing, for example, the various aspects described in this document.
- the processor 1010 may be a general-purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Array (FPGAs) circuits, any other type of integrated circuit (IC), a state machine, and the like.
- the processor 1010 can include embedded memory, input output interface, and various other circuitries as known in the art.
- the system 1000 includes at least one memory 1020 (e.g., a volatile memory device, and/or a non-volatile memory device).
- System 1000 includes a storage device 1040, which can include non-volatile memory and/or volatile memory, including, but not limited to, Electrically Erasable Programmable Read-Only Memory (EEPROM), Read-Only Memory (ROM), Programmable Read-Only Memory (PROM), Random Access Memory (RAM), Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), flash, magnetic disk drive, and/or optical disk drive.
- the storage device 1040 can include an internal storage device, an attached storage device (including detachable and non-detachable storage devices), and/or a network accessible storage device, as non-limiting examples.
- System 1000 includes an encoder/decoder module 1030 configured, for example, to process data to provide an encoded video or decoded video, and the encoder/decoder module 1030 can include its own processor and memory.
- the encoder/decoder module 1030 represents module(s) that can be included in a device to perform the encoding and/or decoding functions. As is known, a device can include one or both of the encoding and decoding modules. Additionally, encoder/decoder module 1030 can be implemented as a separate element of system 1000 or can be incorporated within processor 1010 as a combination of hardware and software as known to those skilled in the art.
- Program code to be loaded onto processor 1010 or encoder/decoder 1030 to perform the various aspects described in this document can be stored in storage device 1040 and subsequently loaded onto memory 1020 for execution by processor 1010.
- processor 1010, memory 1020, storage device 1040, and encoder/decoder module 1030 can store one or more of various items during the performance of the processes described in this document.
- Such stored items can include, but are not limited to, the input video, the decoded video or portions of the decoded video, the bitstream, matrices, variables, and intermediate or final results from the processing of equations, formulas, operations, and operational logic.
- memory inside of the processor 1010 and/or the encoder/decoder module 1030 is used to store instructions and to provide working memory for processing that is needed during encoding or decoding.
- a memory external to the processing device (for example, the processing device can be either the processor 1010 or the encoder/decoder module 1030) is used for one or more of these functions.
- the external memory can be the memory 1020 and/or the storage device 1040, for example, a dynamic volatile memory and/or a non-volatile flash memory.
- an external non-volatile flash memory is used to store the operating system of, for example, a television.
- a fast external dynamic volatile memory such as a RAM is used as working memory for video coding and decoding operations, such as for MPEG-2 (MPEG refers to the Moving Picture Experts Group, MPEG-2 is also referred to as ISO/IEC 13818, and 13818-1 is also known as H.222, and 13818-2 is also known as H.262), HEVC (HEVC refers to High Efficiency Video Coding, also known as H.265 and MPEG-H Part 2), or VVC (Versatile Video Coding, a new standard being developed by JVET, the Joint Video Experts Team).
- the input to the elements of system 1000 can be provided through various input devices as indicated in block 1130.
- Such input devices include, but are not limited to, (i) a radio frequency (RF) portion that receives an RF signal transmitted, for example, over the air by a broadcaster, (ii) a Component (COMP) input terminal (or a set of COMP input terminals), (iii) a Universal Serial Bus (USB) input terminal, and/or (iv) a High-Definition Multimedia Interface (HDMI) input terminal.
- the input devices of block 1130 have associated respective input processing elements as known in the art.
- the RF portion can be associated with elements suitable for (i) selecting a desired frequency (also referred to as selecting a signal, or band-limiting a signal to a band of frequencies), (ii) downconverting the selected signal, (iii) band-limiting again to a narrower band of frequencies to select (for example) a signal frequency band which can be referred to as a channel in certain embodiments, (iv) demodulating the downconverted and band-limited signal, (v) performing error correction, and (vi) demultiplexing to select the desired stream of data packets.
- the RF portion of various embodiments includes one or more elements to perform these functions, for example, frequency selectors, signal selectors, band-limiters, channel selectors, filters, downconverters, demodulators, error correctors, and demultiplexers.
- the RF portion can include a tuner that performs various of these functions, including, for example, downconverting the received signal to a lower frequency (for example, an intermediate frequency or a near-baseband frequency) or to baseband.
- the RF portion and its associated input processing element receives an RF signal transmitted over a wired (for example, cable) medium, and performs frequency selection by filtering, downconverting, and filtering again to a desired frequency band.
- Adding elements can include inserting elements in between existing elements, such as, for example, inserting amplifiers and an analog-to-digital converter.
- the RF portion includes an antenna.
- USB and/or HDMI terminals can include respective interface processors for connecting system 1000 to other electronic devices across USB and/or HDMI connections.
- Various aspects of input processing, for example Reed-Solomon error correction, can be implemented within a separate input processing IC or within processor 1010 as necessary. Similarly, aspects of USB or HDMI interface processing can be implemented within separate interface ICs or within processor 1010 as necessary.
- the demodulated, error corrected, and demultiplexed stream is provided to various processing elements, including, for example, processor 1010, and encoder/decoder 1030 operating in combination with the memory and storage elements to process the datastream as necessary for presentation on an output device.
- The various elements of system 1000 can be interconnected using a suitable connection arrangement 1140, for example, an internal bus as known in the art, including the Inter-IC (I2C) bus, wiring, and printed circuit boards.
- the system 1000 includes communication interface 1050 that enables communication with other devices via communication channel 1060.
- the communication interface 1050 can include, but is not limited to, a transceiver configured to transmit and to receive data over communication channel 1060.
- the communication interface 1050 can include, but is not limited to, a modem or network card and the communication channel 1060 can be implemented, for example, within a wired and/or a wireless medium.
- In several embodiments, data are streamed to the system 1000 using a Wi-Fi (Wireless Fidelity) network such as IEEE 802.11 (IEEE refers to the Institute of Electrical and Electronics Engineers). The Wi-Fi signal of these embodiments is received over the communications channel 1060 and the communications interface 1050, which are adapted for Wi-Fi communications.
- the communications channel 1060 of these embodiments is typically connected to an access point or router that provides access to external networks including the Internet for allowing streaming applications and other over- the-top communications.
- Other embodiments provide streamed data to the system 1000 using a set-top box that delivers the data over the HDMI connection of the input block 1130.
- Still other embodiments provide streamed data to the system 1000 using the RF connection of the input block 1130.
- various embodiments provide data in a non-streaming manner.
- various embodiments use wireless networks other than Wi-Fi, for example a cellular network or a Bluetooth network.
- the system 1000 can provide an output signal to various output devices, including a display 1100, speakers 1110, and other peripheral devices 1120.
- the display 1100 of various embodiments includes one or more of, for example, a touchscreen display, an organic light-emitting diode (OLED) display, a curved display, and/or a foldable display.
- the display 1100 can be for a television, a tablet, a laptop, a cell phone (mobile phone), or other devices.
- the display 1100 can also be integrated with other components (for example, as in a smart phone), or separate (for example, an external monitor for a laptop).
- the other peripheral devices 1120 include, in various examples of embodiments, one or more of a stand-alone digital video disc (or digital versatile disc) (DVD, for both terms), a disk player, a stereo system, and/or a lighting system.
- Various embodiments use one or more peripheral devices 1120 that provide a function based on the output of the system 1000. For example, a disk player performs the function of playing the output of the system 1000.
- control signals are communicated between the system 1000 and the display 1100, speakers 1110, or other peripheral devices 1120 using signaling such as AV.Link, Consumer Electronics Control (CEC), or other communications protocols that enable device-to-device control with or without user intervention.
- the output devices can be communicatively coupled to system 1000 via dedicated connections through respective interfaces 1070, 1080, and 1090. Alternatively, the output devices can be connected to system 1000 using the communications channel 1060 via the communications interface 1050.
- the display 1100 and speakers 1110 can be integrated in a single unit with the other components of system 1000 in an electronic device such as, for example, a television.
- the display interface 1070 includes a display driver, such as, for example, a timing controller (T Con) chip.
- the display 1100 and speakers 1110 can alternatively be separate from one or more of the other components, for example, if the RF portion of input 1130 is part of a separate set-top box.
- the output signal can be provided via dedicated output connections, including, for example, HDMI ports, USB ports, or COMP outputs.
- the embodiments can be carried out by computer software implemented by the processor 1010 or by hardware, or by a combination of hardware and software. As a nonlimiting example, the embodiments can be implemented by one or more integrated circuits.
- the memory 1020 can be of any type appropriate to the technical environment and can be implemented using any appropriate data storage technology, such as optical memory devices, magnetic memory devices, semiconductor-based memory devices, fixed memory, and removable memory, as non-limiting examples.
- the processor 1010 can be of any type appropriate to the technical environment, and can encompass one or more of microprocessors, general purpose computers, special purpose computers, and processors based on a multi-core architecture, as non-limiting examples.
- Methods to compute attenuation maps can be costly, and part of this processing can be done independently of any information provided at the decoder or display side. Therefore, to be further in line with the goal of limiting the impact of the video chain on climate change, embodiments described herein propose to perform the construction of the attenuation map only once, at the encoder side, and to send this map together with the content, in the form of an auxiliary picture, instead of multiplying the processing in each device or server of the transmission chain. In this case, accompanying metadata are also required in the encoded bitstream, to provide additional information on its use at the receiver side.
- Splitting the full process into two steps can advantageously enable its use even on devices with low computational resources, such as smartphones.
- Some information, such as the display type, is however useful to adapt the application of the attenuation map to the picture to display. This also suggests a process in two steps.
- The resulting images after energy reduction will not necessarily respect the content creator's intent, because some areas might be impacted by the application of a dimming map created without any control from the content creator, and the quality of experience could be degraded.
- content creators can assess that the processing is compliant with their quality of experience requirements and indicate areas that should not be affected by the processing.
- Embodiments described hereafter have been designed with the foregoing in mind and propose to solve these issues by defining new metadata related to the use of a pixel-wise attenuation map dedicated to reducing the energy consumption at the receiver side when using, for example displaying, a visual content.
- Information about the types of displays compatible with the use of the attenuation map, the type of pre-processing (e.g., up-sampling) and operation to use for the application of the attenuation map, and indicative metrics of the expected energy reduction and of the expected quality impact of the use of such an attenuation map is provided using MPEG Green Metadata display adaptation syntax elements.
- The pixel-wise attenuation map for one image of the visual content may be carried as an auxiliary image and encoded conventionally.
- In at least one embodiment, the attenuation map is carried by an auxiliary picture of the specific type "AUX_ALPHA" conventionally used for alpha blending operations. In at least one embodiment, the attenuation map is carried by an auxiliary picture of a new specific type "AUX_ATTENUATION" dedicated to energy reduction of pictures.
- the attenuation map is designed so that, when applied to an input image, it produces a modified image that requires less energy for display than the input image.
- One simple implementation is to scale down the luminance according to a selected energy reduction rate. More complex implementations take other parameters into account, such as the similarity between the modified image and the input image, the contrast sensitivity function of human vision, or a smoothness characteristic that allows downscaling the attenuation map without introducing heavy artefacts when upscaling it on the decoder side.
- Applying an attenuation map to an image consists of combining the two according to a selected type of operation.
- The sample values of the attenuation map and the type of operation are closely related: when the operation is an addition, the attenuation map comprises negative sample values; when the operation is a subtraction, the attenuation map comprises positive sample values; and when the operation is a multiplication, the attenuation map comprises floating-point sample values in a range between zero (pixel becomes black) and one (no attenuation).
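- As a small non-normative numerical illustration of this relationship (the values below are arbitrary examples, not taken from the disclosure), the three operation/value-range pairings can produce the same attenuated result:

```python
import numpy as np

# Arbitrary example luma samples; an 80% level (20% reduction) is reached
# equivalently by the three operation/value-range pairings described above.
luma = np.array([200.0, 120.0, 64.0])
print(luma * np.array([0.8, 0.8, 0.8]))        # multiplication, map in [0, 1]
print(luma + np.array([-40.0, -24.0, -12.8]))  # addition, negative map values
print(luma - np.array([40.0, 24.0, 12.8]))     # subtraction, positive map values
# All three print [160.  96.  51.2]
```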
- the granularity of the signaled metadata can be based on time (i.e., data are signaled per period/duration of the video content), temporal layers (i.e., data are signaled per temporal layer), slice type (intra and inter slices), picture, or parts of the picture (slices, tiles, subpictures).
- The green metadata parameters related to the attenuation map are generated at the encoder. They are carried by an extension of the MPEG Green Metadata as specified in ISO/IEC 23001-11, and more particularly of the display adaptation section, by introducing additional syntax that defines additional green metadata parameters specifying how to utilize the attenuation maps.
- This section of ISO/IEC 23001-11, hereafter named Green Metadata Display Adaptation (GMDA) information, is thus extended to indicate how to obtain and use, at the receiver side, a precomputed attenuation map (or dimming map) associated with the visual content.
- the encoder signals this information in the bitstream.
- the attenuation map is decoded and the GMDA information is obtained by the decoder, and the decoded image is modified according to the GMDA information such that the energy consumption is reduced.
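- A non-normative sketch of this receiver-side sequence is given below; every helper name and attribute in it is a hypothetical placeholder for the steps described in the text, not an API defined by the standard or by this disclosure:

```python
# All helper names (decode_primary_picture, decode_auxiliary_picture,
# parse_gmda_info, upsample_map, apply_map, render) and attribute names are
# hypothetical placeholders for the steps described in the text.
def display_with_energy_reduction(bitstream, display):
    picture = decode_primary_picture(bitstream)    # decoded image
    att_map = decode_auxiliary_picture(bitstream)  # decoded attenuation map
    gmda = parse_gmda_info(bitstream)              # GMDA metadata, or None

    # Without GMDA attenuation metadata, display the picture unmodified.
    if gmda is None or not gmda.has_attenuation_map_info:
        return render(display, picture)
    # Skip attenuation on display technologies the map does not target.
    if display.model not in gmda.compatible_display_models:
        return render(display, picture)
    # Optional pre-processing, e.g. up-sampling to the picture resolution.
    if gmda.preprocessing_flag:
        att_map = upsample_map(att_map, picture.shape, gmda.preprocessing_type)
    # Apply the signaled operation to reduce sample values, then display.
    return render(display, apply_map(picture, att_map, gmda.attenuation_use_idc))
```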
- A specific Alpha Channel Information SEI message (hereafter named ACI-SEI) related to the use of auxiliary pictures of type AUX_ALPHA indicates that the AUX_ALPHA auxiliary picture should not be used conventionally for alpha blending purposes.
- Green metadata can be carried as specified in ISO/IEC 13818-1, or it can be carried in metadata tracks within the ISO base media file format (ISO/IEC 14496-12), as specified in ISO/IEC 23001-10.
- the transmitter sends a message to the receiver such as a display device implementing the system 1000 of figure 3.
- the GMDA information is inserted in the bitstream comprising the encoded video, for example in the program bitstream. The parameters of the GMDA information are applicable until the next GMDA information arrives, carrying new parameters.
- Table 1 below gives a basic example of the metadata carried by the GMDA information according to embodiments.
- This table describes the attenuation map information, in other words, the parameters necessary to use an attenuation map sent as auxiliary picture in the bitstream. These parameters are located in the second part of the table below and start with the prefix ‘ami’.
- the first part of the table is related to conventional display adaptation functions not described herein.
- Metadata are sent globally for the full bitstream, i.e., they are shared by all auxiliary pictures carrying attenuation maps.
- These metadata can also be shared per period, where the concept of period can correspond, for example, to a picture, an Intra period, a Group of Pictures (GOP), a number of pictures, or a time duration.
- the GMDA message is then typically inserted per period. It is transmitted at the start of an upcoming period. The next message containing the metadata will be transmitted at the start of the next upcoming period. Therefore, when the upcoming period is a picture, a message will be transmitted for each picture. However, when the upcoming period is a specified time interval or a specified number of pictures, the associated message will be transmitted with the first picture in the time interval or with the first picture in the specified number of pictures.
- part of the information related to the auxiliary picture can be sent globally for the full bitstream (e.g., information related to the display models compatible with the use of the attenuation maps) and other information can be sent with a different periodicity, e.g., for each picture.
- When the information is sent for more than one picture (e.g., at slice level or GOP level), motion vectors corresponding to the decoded picture and included in the bitstream can be applied to the upsampled Attenuation Map before applying it to the decoded picture.
- The metadata of Table 1 related to energy reduction may be described as follows (the other parameters of the first part are the same as those currently defined in ISO/IEC 23001-11).
- The ami_display_model parameter indicates on which type of display technology the attenuation map should be applied at the receiver side.
- This metadata is a bit field mask which indicates the display models on which the attenuation map sample values of the auxiliary picture should be applied, as shown in Table 2.
- This flag indicates that only ami_attenuation_use_idc[ 0 ], ami_attenuation_comp_idc[ 0 ], ami_preprocessing_flag[ 0 ], ami_preprocessing_type_idc[ 0 ], and ami_preprocessing_scale_idc[ 0 ] shall be present.
- The ami_map_approximation_model parameter indicates which type of interpolation model should be used to infer an attenuation map with a different reduction rate than the one(s) corresponding to the attenuation map(s) sent in the metadata.
- When this parameter is equal to 0, a linear scaling of the attenuation map sample values of the provided auxiliary picture, given its respective ami_energy_reduction_rate, is used to obtain corresponding attenuation map sample values for another energy reduction rate. The auxiliary picture with the lowest ami_energy_reduction_rate is used for the linear scaling.
- When this parameter is equal to 1, a bilinear interpolation between the Attenuation Map sample values of the provided auxiliary picture(s), given their respective ami_energy_reduction_rate, should be used to obtain corresponding Attenuation Map sample values for another energy reduction rate.
- Other values of this parameter specify the use of other models for the interpolation, as summarized in Table 3 below.
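- As a non-normative illustration, one plausible reading of approximation models 0 and 1 is sketched below in Python; the exact interpolation arithmetic is defined by the standard text, not by this sketch:

```python
import numpy as np

def approximate_map(maps, rates, target_rate, model):
    """Infer an attenuation map for target_rate from transmitted maps.
    maps: list of float arrays; rates: matching ami_energy_reduction_rate
    values; model: ami_map_approximation_model. Only models 0 and 1 are
    sketched, under assumed (non-normative) arithmetic."""
    if model == 0:
        # Linear scaling of the map with the lowest reduction rate.
        i = int(np.argmin(rates))
        return maps[i] * (float(target_rate) / rates[i])
    if model == 1:
        # Interpolate between the two control-point maps enclosing target_rate.
        order = np.argsort(rates)
        for a, b in zip(order[:-1], order[1:]):
            if rates[a] <= target_rate <= rates[b]:
                w = (target_rate - rates[a]) / (rates[b] - rates[a])
                return (1.0 - w) * maps[a] + w * maps[b]
        raise ValueError("target_rate outside the transmitted control points")
    raise NotImplementedError(f"approximation model {model} not sketched")
```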
- The ami_map_number parameter indicates the number of attenuation maps contained in the metadata. It can indeed be useful to send more than one map; the maps then serve as control points in a further interpolation process, in case the end user desires to use an attenuation map with a different reduction rate than the one(s) provided.
- The ami_video_id parameter specifies the identifier of the program elementary stream that contains the video on which the Attenuation Maps should be applied.
- The ami_map_id[i] parameter specifies the identifier of the program elementary stream that contains the auxiliary picture that corresponds to the Attenuation Map of index i.
- The ami_energy_reduction_rate[i] parameter indicates, for a given attenuation map, the corresponding level of energy reduction that can be expected from the use of the attenuation map.
- The value of this parameter specifies the energy reduction rate, expressed as a percentage or as a value in Watts.
- The ami_video_quality[i] parameter indicates, for a given attenuation map, the corresponding quality that can be expected after using the attenuation map to reduce the energy of the decoded image.
- ami_video_quality[i] can be PSNR, VMAF or SSIM values, for example.
- quality metrics can be computed by the decoder but, for the sake of reducing the energy consumption, they could also be inferred at the encoder side. In this case, they could correspond to values of expected minimal quality.
- The ami_max_value[i] parameter indicates the maximum value of the attenuation map of index i. Such a maximal value can optionally be used to further adjust the dynamic range of the encoded attenuation map in the scaling process.
- The ami_attenuation_use_idc[i] parameter indicates which type of processing should be used to apply the transmitted attenuation map of index i to the decoded image to be displayed, as summarized in Table 4 below. For example, the attenuation map can be subtracted from, added to, multiplied with, or divided into the decoded image, so that it reduces the level of the pixel values of the decoded image.
- When this parameter is equal to 0, the attenuation map sample values of the decoded auxiliary picture of index i should be added to one or more associated primary picture decoded sample(s) before being displayed on screen. This case implies that the values of the attenuation map are negative.
- When this parameter is equal to 1, the attenuation map sample values of the decoded auxiliary picture should be subtracted from one or more associated primary picture decoded sample(s).
- When this parameter is equal to 2, the attenuation map sample values of the decoded auxiliary picture should be multiplied by one or more associated primary picture decoded sample(s) before being displayed on screen.
- When this parameter is equal to 3, the decoded sample(s) should be divided by the associated attenuation map sample values of the decoded auxiliary picture before being displayed on screen.
- When this parameter is equal to 4, the attenuation map sample values of the decoded auxiliary picture should be used in the context of a contrast sensitivity function to determine the attenuation to be applied to one or more associated primary picture decoded sample(s) before being displayed on screen.
- When this parameter is equal to 5, the attenuation map sample values of the decoded auxiliary picture should be used according to a proprietary user-defined process to modify the one or more associated primary picture decoded sample(s) before being displayed on screen.
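- A non-normative Python sketch of this dispatch is shown below; the value range noted for the division case is an assumption, and idc values 4 and 5 are intentionally left out:

```python
import numpy as np

def apply_attenuation_use_idc(samples, att_map, use_idc):
    """Apply a decoded attenuation map to primary picture samples according
    to ami_attenuation_use_idc. samples and att_map are float arrays of the
    same shape; idc 4 (contrast sensitivity function) and idc 5
    (proprietary) are beyond this sketch."""
    if use_idc == 0:        # addition: map values are negative
        out = samples + att_map
    elif use_idc == 1:      # subtraction: map values are positive
        out = samples - att_map
    elif use_idc == 2:      # multiplication: map values in [0, 1]
        out = samples * att_map
    elif use_idc == 3:      # division: assumes map values >= 1 so levels decrease
        out = samples / att_map
    else:
        raise NotImplementedError("idc values 4 and 5 are not covered here")
    return np.clip(out, 0.0, None)  # keep sample values non-negative
```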
- The ami_attenuation_comp_idc[i] parameter specifies on which color component(s) of the associated primary picture(s) decoded samples the decoded auxiliary picture of index i should be applied, using the process defined by ami_attenuation_use_idc[ i ]. It also specifies how many components the decoded auxiliary picture of index i should contain. When equal to 0, the decoded auxiliary picture of index i contains only one component, and this component should be applied to the luma component of the associated primary picture(s) decoded samples.
- When equal to 1, the decoded auxiliary picture of index i contains only one component, and this component should be applied to the luma component and the chroma components of the associated primary picture(s) decoded samples.
- When equal to 2, the decoded auxiliary picture of index i contains only one component, and this component should be applied to the RGB components (after YUV-to-RGB conversion) of the associated primary picture(s) decoded samples.
- When equal to 3, the decoded auxiliary picture of index i contains two components; the first component should be applied to the luma component of the associated primary picture(s) decoded samples, and the second component should be applied to both chroma components of the associated primary picture(s) decoded samples.
- When equal to 4, the decoded auxiliary picture of index i contains three components, and these components should be applied respectively to the luma and chroma components of the associated primary picture(s) decoded samples.
- When equal to 5, the decoded auxiliary picture of index i contains three components, and these components should be applied respectively to the RGB components (after YUV-to-RGB conversion) of the associated primary picture(s) decoded samples.
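- The component routing can be pictured with the following non-normative sketch, which reuses apply_attenuation_use_idc from the previous sketch, covers only cases 0, 1 and 3, and assumes each map component matches the resolution of the plane it is applied to:

```python
def apply_to_components(planes, map_components, comp_idc, use_idc):
    """Route attenuation map components onto picture components per
    ami_attenuation_comp_idc. planes is a dict of float arrays
    {"Y": ..., "U": ..., "V": ...}; each map component is assumed to match
    the resolution of the plane it is applied to."""
    out = dict(planes)
    if comp_idc == 0:    # one map component, applied to luma only
        out["Y"] = apply_attenuation_use_idc(planes["Y"], map_components[0], use_idc)
    elif comp_idc == 1:  # one map component, applied to luma and both chromas
        for c in ("Y", "U", "V"):
            out[c] = apply_attenuation_use_idc(planes[c], map_components[0], use_idc)
    elif comp_idc == 3:  # two map components: first -> luma, second -> both chromas
        out["Y"] = apply_attenuation_use_idc(planes["Y"], map_components[0], use_idc)
        for c in ("U", "V"):
            out[c] = apply_attenuation_use_idc(planes[c], map_components[1], use_idc)
    else:                # cases 2, 4 and 5 (RGB / three components) omitted
        raise NotImplementedError("comp_idc value not covered by this sketch")
    return out
```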
- The ami_preprocessing_flag[i] parameter, when true, specifies the use of pre-processing on the attenuation map sample values of the decoded auxiliary picture of index i. In that case, it is assumed that the pre-processing is an up-sampling operation and that the auxiliary coded picture(s) and the primary coded picture have different sizes.
- The ami_preprocessing_type_idc[i] parameter indicates which type of pre-processing is to be applied to the attenuation map sample values of the decoded auxiliary picture of index i.
- When this parameter is equal to 0, the interpolation applied to the attenuation map sample values of the provided auxiliary picture of index i, to obtain the attenuation map sample values to apply to the sample values of the decoded picture, is a bicubic interpolation that retrieves the same resolution as that of the associated decoded picture.
- When this parameter is equal to 1, the interpolation is a bilinear interpolation, and when equal to 2, the interpolation is of type Lanczos.
- When this parameter is equal to 3, a proprietary user-defined process should be used. This is summarized in Table 6 below.
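- For illustration only, the three standard interpolation types map naturally onto common resampling flags, as in this non-normative sketch (the use of OpenCV is an arbitrary choice, not implied by the disclosure):

```python
import cv2  # OpenCV is used here only as a convenient resampling library

# One possible mapping from ami_preprocessing_type_idc to interpolation
# flags; value 3 (proprietary process) has no generic counterpart.
INTERPOLATION = {0: cv2.INTER_CUBIC, 1: cv2.INTER_LINEAR, 2: cv2.INTER_LANCZOS4}

def upsample_attenuation_map(att_map, width, height, type_idc):
    """Up-sample a decoded attenuation map to the primary picture size."""
    return cv2.resize(att_map, (width, height), interpolation=INTERPOLATION[type_idc])
```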
- The ami_preprocessing_scale_idc[i] parameter specifies the scaling that is to be applied to the attenuation map samples of index i to get the attenuation map sample values (float) before applying them to the sample values of the decoded picture.
- When this parameter is equal to 0, a scaling of — should be applied.
- When this parameter is equal to 1, a proprietary user-defined scaling should be used. This is summarized in Table 7 below.
- The new metadata ami_box_xstart, ami_box_ystart, ami_box_width and ami_box_height define the position and the size of a bounding box delimiting the region of the decoded picture on which to apply the attenuation map: respectively, the x coordinate and y coordinate of the top left corner, and the width and height of the bounding box.
- The attenuation map is not applied outside of this bounding box.
- An alternative to this embodiment for region-based attenuation maps is to set the sample values of an attenuation map to 0 (if these samples are added or subtracted) or 1 (if these samples are multiplied) outside the region on which the attenuation map should be applied. In this case, these additional metadata are not needed.
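- A non-normative sketch of the bounding-box variant, reusing apply_attenuation_use_idc from an earlier sketch and assuming the attenuation map is already at full picture resolution:

```python
def apply_in_bounding_box(picture, att_map, x, y, w, h, use_idc):
    """Apply the attenuation map only inside the box signaled by
    ami_box_xstart, ami_box_ystart, ami_box_width and ami_box_height;
    samples outside the box are left untouched."""
    out = picture.copy()
    out[y:y + h, x:x + w] = apply_attenuation_use_idc(
        picture[y:y + h, x:x + w], att_map[y:y + h, x:x + w], use_idc)
    return out
```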
- an additional flag is used to encapsulate all other metadata related to the use of the Attenuation Map in a global structure.
- The syntax is modified as shown in Table 9. In this table, when ami_structure_flag is equal to 1, Attenuation Map Information parameters follow, and when ami_structure_flag is equal to 0, no Attenuation Map Information parameters follow.
- The use of the backlight display adaptation can be disabled by the transmitter by setting the parameters num_constant_backlight_voltage_time_intervals, num_max_variations and num_quality_levels to zero. In this case, no backlight operation will be performed, and the Attenuation Map Information parameters which follow are used instead.
- an additional flag is used to encapsulate all other metadata related to the use of backlight display adaptation in a global structure.
- the syntax is modified as shown in Table 10.
- When backlight_structure_flag is equal to 1, parameters related to backlight adaptation follow, and when backlight_structure_flag is equal to 0, no parameters related to backlight adaptation follow.
- When ami_structure_flag is equal to 1, Attenuation Map Information parameters follow, and when ami_structure_flag is equal to 0, no Attenuation Map Information parameters follow.
- A single bit field named ami_flags is used to convey several information flags.
- The bit 0 indicates whether or not all the following information data in the message have to be redefined for each decoded auxiliary picture of type AUX_ALPHA. It corresponds to the previously defined ami_global_flag.
- The bit 1 indicates whether the following Attenuation Maps in the message can be used for approximating other Attenuation Maps for other reduction rates. It corresponds to a new ami_approximation_flag.
- The bit 2 indicates whether pre-processing is required to use the following Attenuation Maps in the message. It corresponds to a new ami_preprocessing_global_flag.
- The bit 3 indicates that the Attenuation Maps in the message shall be applied to a region of the primary video. This region is defined respectively by the x, y coordinates of the top left corner and the width and height of the region bounding box. It corresponds to a new ami_box_flag.
- The bits 4-7 are reserved for future use.
- The ami_flags bit field is summarized in Table 11.
- The ami_flags bit field makes it possible to reduce the size of the SEI message by not sending metadata information that is not needed to use and apply the Attenuation Maps.
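- A non-normative sketch of unpacking this bit field, assuming bit 0 is the least-significant bit:

```python
def parse_ami_flags(ami_flags):
    """Unpack the ami_flags bit field into named booleans."""
    return {
        "ami_global_flag":               bool(ami_flags & 0x01),  # bit 0
        "ami_approximation_flag":        bool(ami_flags & 0x02),  # bit 1
        "ami_preprocessing_global_flag": bool(ami_flags & 0x04),  # bit 2
        "ami_box_flag":                  bool(ami_flags & 0x08),  # bit 3
        # bits 4-7 are reserved for future use
    }
```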
- The syntax in Table 10 is then modified as illustrated in Table 12 below.
- A single bit field ami_flags is used to convey several information flags.
- The bit 0 indicates that the SEI message contains further information related to the use of Attenuation Maps. It corresponds to the previously defined ami_structure_flag.
- The bit 1 indicates whether or not all the following information data in the message have to be redefined for each decoded auxiliary picture of type AUX_ALPHA. It corresponds to the previously defined ami_global_flag.
- The bit 2 indicates whether the following Attenuation Maps in the message can be used for approximating other Attenuation Maps for other reduction rates. It corresponds to a new ami_approximation_flag.
- The bit 3 indicates whether pre-processing is required to use the following Attenuation Maps in the message. It corresponds to a new ami_preprocessing_global_flag.
- The bit 4 indicates that the Attenuation Maps in the message shall be applied to a region of the primary video. This region is defined respectively by the x, y coordinates of the top left corner and the width and height of the region bounding box. It corresponds to a new ami_box_flag.
- The bits 5-7 are reserved for future use.
- The ami_flags bit field is summarized in Table 13.
- Table 14
- In at least one embodiment, a single bit field named display_flags is used to convey several information flags.
- The bit 0 indicates that the SEI message contains further information related to the use of the backlight to reduce energy consumption. It corresponds to the previously defined backlight_structure_flag.
- The bit 1 indicates that the SEI message contains further information related to the use of Attenuation Maps. It corresponds to the previously defined ami_structure_flag.
- The bit 2 indicates whether or not all the following information data in the message have to be redefined for each decoded auxiliary picture of type AUX_ALPHA. It corresponds to the previously defined ami_global_flag.
- The bit 3 indicates whether the following Attenuation Maps in the message can be used for approximating other Attenuation Maps for other reduction rates. It corresponds to a new ami_approximation_flag.
- The bit 4 indicates whether pre-processing is required to use the following Attenuation Maps in the message. It corresponds to a new ami_preprocessing_global_flag.
- The bit 5 indicates that the Attenuation Maps in the message shall be applied to a region of the primary video. This region is defined respectively by the x, y coordinates of the top left corner and the width and height of the region bounding box. It corresponds to a new ami_box_flag.
- The bits 6-7 are reserved for future use.
- The display_flags bit field is summarized in Table 11.
- The attenuation map is carried by an auxiliary picture of the specific type "AUX_ALPHA" conventionally used for alpha blending operations. This is signaled by using an sdi_aux_id[ i ] equal to 1 (see Table 18 below).
- The auxiliary picture of type AUX_ALPHA is accompanied by a standardized Alpha Channel Information SEI message (ACI-SEI) which contains metadata (as defined in documents ISO/IEC 23002-3, ITU-T H.265 or ITU-T H.274) that determine how to use this auxiliary picture.
- ACI-SEI defines the following parameters.
- The alpha_channel_cancel_flag parameter, when equal to 1, indicates that the SEI message cancels the persistence of any previous ACI-SEI message in output order that applies to the current layer.
- The alpha_channel_use_idc parameter, when equal to 0, indicates that for alpha blending purposes the decoded samples of the associated primary picture should be multiplied by the interpretation sample values of the decoded auxiliary picture in the display process after output from the decoding process.
- the alpha_channel_bit_depth_minus8 plus 8 parameter specifies the bit depth of the samples of the luma sample array of the auxiliary picture. This parameter shall be equal to the bit depth of the associated primary picture.
- The alpha_transparent_value parameter specifies the interpretation sample value of a decoded auxiliary picture luma sample for which the associated luma and chroma samples of the primary coded picture are considered transparent for purposes of alpha blending.
- The number of bits used for the representation of the alpha_transparent_value syntax element is alpha_channel_bit_depth_minus8 + 9.
- The alpha_opaque_value parameter specifies the interpretation sample value of a decoded auxiliary picture luma sample for which the associated luma and chroma samples of the primary coded picture are considered opaque for purposes of alpha blending.
- The number of bits used for the representation of the alpha_opaque_value syntax element is alpha_channel_bit_depth_minus8 + 9.
- A value of alpha_opaque_value that is equal to alpha_transparent_value indicates that the auxiliary coded picture is not intended for alpha blending purposes.
- alpha_opaque_value can be greater than alpha_transparent_value, or it can be less than or equal to alpha_transparent_value.
- the alpha_channel_incr_flag parameter, when equal to 0, indicates that the interpretation sample value for each decoded auxiliary picture luma sample value is equal to the decoded auxiliary picture sample value for purposes of alpha blending. When equal to 1, it indicates that, for purposes of alpha blending, after decoding the auxiliary picture samples, any auxiliary picture luma sample value that is greater than Min( alpha_opaque_value, alpha_transparent_value ) should be increased by one to obtain the interpretation sample value for the auxiliary picture sample, and any auxiliary picture luma sample value that is less than or equal to Min( alpha_opaque_value, alpha_transparent_value ) should be used, without alteration, as the interpretation sample value for the decoded auxiliary picture sample value.
- when alpha_transparent_value is equal to alpha_opaque_value, or Log2( Abs( alpha_opaque_value - alpha_transparent_value ) ) does not have an integer value, alpha_channel_incr_flag shall be equal to 0.
- the alpha_channel_clip_flag parameter, when equal to 0, indicates that no clipping operation is applied to obtain the interpretation sample values of the decoded auxiliary picture. When equal to 1, it indicates that the interpretation sample values of the decoded auxiliary picture are altered according to the clipping process described by the alpha_channel_clip_type_flag syntax element.
- the alpha_channel_clip_type_flag parameter, when equal to 0, indicates that, for purposes of alpha blending, after decoding the auxiliary picture samples, any auxiliary picture luma sample that is greater than ( alpha_opaque_value + alpha_transparent_value ) / 2 is set equal to Max( alpha_transparent_value, alpha_opaque_value ) to obtain the interpretation sample value for the auxiliary picture luma sample, and any auxiliary picture luma sample that is less than or equal to ( alpha_opaque_value + alpha_transparent_value ) / 2 is set equal to Min( alpha_transparent_value, alpha_opaque_value ) to obtain the interpretation sample value for the auxiliary picture luma sample. When equal to 1, it indicates that any auxiliary picture luma sample that is greater than Max( alpha_transparent_value, alpha_opaque_value ) is set equal to Max( alpha_transparent_value, alpha_opaque_value ) to obtain the interpretation sample value for the auxiliary picture luma sample, and any auxiliary picture luma sample that is less than or equal to Min( alpha_transparent_value, alpha_opaque_value ) is set equal to Min( alpha_transparent_value, alpha_opaque_value ) to obtain the interpretation sample value for the auxiliary picture luma sample (a sketch of this derivation is given below).
- the alpha_channel_use_idc parameter of the ACI-SEI is set to 3. This indicates that the usage of the auxiliary picture is unspecified and that the decoder should ignore all subsequent information of the SEI. Note that when alpha_channel_use_idc equals 2, this indicates that the usage of the auxiliary picture is also unspecified, but the decoder does not ignore all subsequent information of the SEI. Values greater than 2 for alpha_channel_use_idc are reserved for future use by ITU-T | ISO/IEC.
- a compliant receiver receiving such an ACI-SEI message will deduce that the alpha map should not be used conventionally for alpha blending purposes. Since the receiver will also receive GMDA information, it will determine that the auxiliary picture of type AUX_ALPHA should be used as an attenuation map for energy reduction purposes and will apply the map according to the GMDA information. In case no GMDA information is received, or the GMDA information does not contain any Attenuation Map related metadata, the decoder does not consider applying any attenuation to the decoded picture.
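For illustration only, the following sketch derives the interpretation sample value of one decoded auxiliary luma sample from the flags described above. The relative order of the increment and clipping operations, and the function name itself, are assumptions made for readability; the normative derivation is the one given in ISO/IEC 23002-3 / ITU-T H.274.

```python
def interpretation_sample(v, alpha_opaque_value, alpha_transparent_value,
                          incr_flag, clip_flag, clip_type_flag):
    """Derive the interpretation sample value of one decoded auxiliary
    picture luma sample v, per the ACI-SEI semantics summarized above."""
    lo = min(alpha_opaque_value, alpha_transparent_value)
    hi = max(alpha_opaque_value, alpha_transparent_value)
    if incr_flag and v > lo:        # alpha_channel_incr_flag equal to 1
        v += 1
    if clip_flag:                   # alpha_channel_clip_flag equal to 1
        if clip_type_flag == 0:     # binarize around the mid-point
            mid = (alpha_opaque_value + alpha_transparent_value) / 2
            v = hi if v > mid else lo
        else:                       # clamp to [lo, hi], in-between kept
            v = min(max(v, lo), hi)
    return v
```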
- instead of an auxiliary picture of type AUX_ALPHA, another type of auxiliary data may be used, as long as that type fulfills the requirement to transmit the attenuation map, or a newly defined type may be used.
- the attenuation map is carried by an auxiliary picture of a new specific type “AUX_ATTENUATION” dedicated to energy reduction of pictures.
- Auxiliary pictures are defined in ISO/IEC 23002-3 Auxiliary Video Data Representations.
- Table 18 illustrates the addition of a new specific type of auxiliary picture named “AUX_ATTENUATION” that will be used in relation with the new metadata defined above. This new type comes in addition to the conventional types of auxiliary pictures related to alpha plane and picture depth information.
- the parameters for applying the attenuation are still carried by GMDA information as described in Table 1.
- several auxiliary pictures may be used to transmit the Attenuation Map.
- for example, three auxiliary pictures of type AUX_ATTENUATION are used, one per component Y, U, V of the Attenuation Map.
- alternatively, only two auxiliary pictures may be used, one for Y and another one for U and V. Any other combination can be envisioned.
- in the GMDA information, several sets of metadata are sent, each corresponding to one Attenuation Map. A total number ami_map_number of maps is considered and each set corresponds to an index i in the metadata.
- when auxiliary pictures represent components of a full Attenuation Map, each component of the full Attenuation Map will also correspond to an index within the ami_map_number metadata.
- additional metadata are then needed that give the information of which auxiliary pictures should be used in combination, as individual components of the Attenuation Map.
- the ami_comp_nb[i] parameter indicates the number of additional auxiliary pictures needed to reconstruct the full Attenuation Map, from the Attenuation Map of index i, corresponding to one component of the full Attenuation Map.
- the ami_comp_idc[i][j] parameter indicates the index of one other set of metadata (i.e., related to the Attenuation Map of index j), corresponding to another component of the full Attenuation Map.
- Table 19. The new definitions in Table 19 with regard to Table 5 are the values above 5.
- ami_attenuation_comp_idc[ i ] equal to 6 indicates that the decoded auxiliary picture of index i contains one component and this component should be applied to the first component of the associated primary picture(s) decoded samples.
- ami_attenuation_comp_idc[ i ] equal to 7 indicates that the decoded auxiliary picture of index i contains one component and this component should be applied to the second component of the associated primary picture(s) decoded samples.
- ami_attenuation_comp_idc[ i ] equal to 8 indicates that the decoded auxiliary picture of index i contains one component and this component should be applied to the third component of the associated primary picture(s) decoded samples.
- ami_attenuation_comp_idc[ i ] equal to 9 indicates that the mapping between the components of the decoded auxiliary picture of index i and the components to which to apply the decoded auxiliary picture of index i corresponds to a proprietary user-defined process (a dispatch sketch is given below).
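A possible decoder-side dispatch for these values is sketched below. The value 8 entry follows the pattern of values 6 and 7 as reconstructed above, and apply_fn stands for whatever attenuation operation ami_attenuation_use_idc selects; both are assumptions of this sketch, not normative definitions.

```python
def apply_single_component_map(primary, att_map, comp_idc, apply_fn):
    """Apply a one-component attenuation map according to
    ami_attenuation_comp_idc values 6 to 9. `primary` is a mutable
    [Y, U, V]-like list of component arrays."""
    if comp_idc in (6, 7, 8):
        target = comp_idc - 6      # 6 -> first, 7 -> second, 8 -> third
        primary[target] = apply_fn(primary[target], att_map)
    elif comp_idc == 9:
        raise NotImplementedError("proprietary user-defined mapping")
    else:
        raise ValueError("values 0-5 are defined in Table 5 (not shown)")
    return primary
```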
- the syntax of Table 1 is modified as shown in Table 20, in order to insert the parameters supporting this embodiment using multiple auxiliary pictures.
- Figure 4 illustrates flowcharts of two examples of video encoding process using attenuation map information carried by an auxiliary picture of type AUX_ALPHA according to at least one embodiment.
- the encoding processes 400 or 401 are implemented for example by an encoder 100 of figure 1 or a processor 1010 in a device 1000 of figure 3.
- the figure describes the metadata generation and the bitstream encapsulation process according to an embodiment, performed for example during the encoding, in compliance with the syntax introduced above.
- the metadata is inserted for a given picture, i.e., the considered period corresponds to one picture.
- in step 410, the device encodes the picture conventionally, resulting in a partial bitstream.
- in step 420, the device performs conventional partial bitstream decoding.
- in step 430, an attenuation map corresponding to the decoded picture is computed for a selected energy reduction rate.
- data corresponding to the parameters described in table 1 are collected, for example about the use of the attenuation map, the expected energy reduction, the corresponding expected quality, the pre-processing operation, etc.
- in step 450, the auxiliary picture corresponding to the attenuation map is generated, before being encoded and inserted in the partial bitstream in step 460.
- in step 470, the ACI-SEI message is generated in a conventional manner with for example values set according to Table 10.
- the ACI-SEI message is then encoded and inserted in the bitstream in step 471.
- the GMDA information is generated based on the collected information, for example according to the syntax defined in Table 1.
- the GMDA information is then inserted in an elementary bitstream of the program bitstream and carried using the file format specified in ISO/IEC 23001-10 or by using MPEG-2 systems as specified in ISO/IEC 13818-1.
- the ordering of some of the steps may be altered, still relying on the same principles.
- all the encoding steps may be performed in the same step.
- the resulting bitstream comprises at least an image of the video and, in relation with this image, an attenuation map that corresponds to a selected energy reduction rate, as well as metadata describing how to use the attenuation map.
- this enables a receiver, such as a display device for example, to determine from this bitstream an image of the video and to apply an attenuation map to this image, which will allow a reduction of the energy consumption when using (e.g., displaying) the video.
- Attenuation maps are computed for different energy reduction rates. For example, two attenuation maps with energy reduction rates R1 and R2 may be computed. This allows a corresponding attenuation map to be interpolated at the decoder side for any other reduction rate R such that R1 ≤ R ≤ R2.
- the process 401 of figure 4 illustrates a flowchart of such an embodiment. Most of the steps are identical to the steps of the process 400. The difference is related to the iteration 425 that is done for a selected set of energy reduction rates. Thus, an attenuation map is generated for each of the energy reduction rates, the corresponding auxiliary pictures are inserted into the bitstream and the corresponding ACI-SEI messages are generated.
- the bitstream comprises at least an image of the video and in relation with this image, a set of different attenuation maps that correspond to a set of selected energy reduction rates.
- this enables a receiver, such as a display device for example, to determine from this bitstream an image of the video and to reduce the energy consumption when displaying the video according to an energy reduction rate not in the list of transmitted reduction rates, thanks to the plurality of attenuation maps and parameters carried by the bitstream.
- all auxiliary pictures may be inserted at once, outside of the loop on energy reduction rates.
- Another example comprises generating and inserting the GMDA information for each auxiliary picture, i.e., inside the loop 425 on energy reduction rates.
- Another example comprises generating and inserting the ACI-SEI messages outside the loop 425, for all auxiliary pictures at once.
- the computation of the attenuation map and the collection of the associated metadata is realized outside of the encoder, for example in a dedicated device, and these data are for example stored in a database accessible by the encoder. These additional data are then provided to the encoder together with the input video in a process similar to the processes 400 and 401.
- the auxiliary data is sent without any accompanying metadata in the GMDA related to the use of the Attenuation Maps and thus without having in the bitstream the parameters to benefit from the attenuation map.
- This use case targets specific decoding devices that have a predetermined behavior.
- An example of such device is an advertisement display panel.
- a default mode is defined with the default values for these parameters. There is persistence of these values for the whole bitstream. The default values are as shown in Table 21.
- the ami_video_id is not present in the table since it is dynamic.
- the attenuation map should apply to all the video elementary streams of the program.
- Figure 5 illustrates a flowchart of an example of video decoding process using attenuation map information carried by an auxiliary picture of type AUX_ALPHA according to at least one embodiment.
- the process 500 can be applied to a picture or to a group of pictures or to a part of a picture (slice, tile), according to the level of signaling of the metadata.
- the following description illustrates the case where the process is applied to a single picture and there is only one attenuation map, but the other cases are similar and based on the same steps.
- in step 510, the picture is decoded from the bitstream, producing a decoded picture 511.
- in step 520, the ACI-SEI message data are retrieved from the bitstream for a picture and decoded to provide the different parameter values as described according to Table 10.
- in step 523, the process verifies that the alpha_channel_use_idc parameter of the ACI-SEI message is set to 3. Indeed, this indicates that the AUX_ALPHA picture should not be used for conventional alpha blending purposes and that the decoder should ignore the rest of the ACI-SEI message. If it is not the case (branch “No”, parameter is not set to 3), then the processor jumps to step 530, and the decoding may be done conventionally without using an attenuation map.
- in step 524, the processor obtains the GMDA information.
- in step 525, the display model 521 of the end device is checked against the decoded ami_display_model parameter of the GMDA information. If the display model 521 of the end device is not compatible with the ami_display_model parameter (branch “No”), then the processor jumps to step 530, and the decoding may be done conventionally without using an attenuation map.
- a mapping between the attenuation map carried by the auxiliary data of type AUX_ALPHA and its corresponding decoded picture is done through ami_video_id, which gives the identifier of the program elementary stream that contains the video to which the attenuation map should be applied, and ami_map_id[i], which gives the identifier of the program elementary stream that contains the auxiliary picture corresponding to the Attenuation Map of index i.
- in step 540, the auxiliary data corresponding to the picture is decoded, producing a decoded attenuation map 541.
- in step 550, the ami_preprocessing_flag parameter is checked to determine whether an upsampling process should be applied to the decoded attenuation map 541.
- in step 560, in case an upsampling process is to be applied (branch “yes” of step 550), the decoded attenuation map 541 is upsampled according to the process given by the ami_preprocessing_idc parameter.
- in step 565, it is further rescaled according to a scaling factor described by the ami_preprocessing_scale_idc and ami_max_value parameters.
- in step 580, the upsampled and rescaled attenuation map is applied on the decoded picture 511 according to the process described by ami_attenuation_use_idc and ami_attenuation_comp_idc to produce an energy-reduced image 581 that is further sent to the display in step 590.
- otherwise (branch “No” of step 550), the attenuation map is first rescaled according to a scaling factor described by ami_preprocessing_scale_idc and ami_max_value in step 555.
- the rescaled attenuation map is then applied on the decoded picture in step 570 according to the process described by ami_attenuation_use_idc and ami_attenuation_comp_idc to produce an energy-reduced image 581 that is further sent to the display in step 590.
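The two branches above (with and without upsampling) can be summarized by the following sketch of a multiplicative attenuation process. The `meta` dictionary, the 2x nearest-neighbour upsampling and the division by ami_max_value are simplifying assumptions of this illustration; the actual rescaling rule depends on ami_preprocessing_scale_idc.

```python
import numpy as np

def reduce_energy(decoded_picture, att_map, meta):
    """Sketch of steps 550-590: optional upsampling, rescaling, then
    pixel-wise application of the attenuation map (shapes are assumed
    to match after upsampling)."""
    if meta["ami_preprocessing_flag"]:              # step 550
        att_map = upsample_2x(att_map)              # step 560 (placeholder)
    # steps 555/565: normalize the map, here simply by ami_max_value
    att = att_map.astype(np.float32) / float(meta["ami_max_value"])
    out = decoded_picture.astype(np.float32) * att  # steps 570/580
    return out.astype(decoded_picture.dtype)        # energy-reduced image 581

def upsample_2x(att_map):
    """Nearest-neighbour 2x upsampling, standing in for the process
    selected by ami_preprocessing_idc."""
    return np.kron(att_map, np.ones((2, 2), dtype=att_map.dtype))
```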
- Figure 6 illustrates flowcharts of two examples of video encoding process using attenuation map information carried by an auxiliary picture of type AUX_ATTENUATION according to at least one embodiment.
- Such encoding processes 600 and 601 are very similar to the processes 400 and 401 of figure 4. The difference lies mainly in the use of an auxiliary picture of type AUX_ATTENUATION generated in step 650 (instead of an auxiliary picture of type AUX_ALPHA), as well as in the encoding and insertion of this auxiliary picture in the bitstream in step 660.
- in addition, no ACI-SEI message is needed, so that steps 470 and 471 of figure 4 are not needed anymore.
- the other steps are identical to the corresponding steps of figure 4.
- Figure 7 illustrates a flowchart of an example of video decoding process using attenuation map information carried by an auxiliary picture of type AUX_ATTENUATION according to at least one embodiment.
- This decoding process 700 is very similar to the process 500 of figure 5. The difference lies mainly in the use of an auxiliary picture of type AUX_ATTENUATION decoded in step 740 (instead of an auxiliary picture of type AUX_ALPHA). No ACI-SEI message is needed, so that steps 520 and 523 of figure 5 are not needed anymore. The other steps are identical to the corresponding steps of figure 5.
- Figure 8 illustrates a flowchart of an example of video decoding process using attenuation map information based on multiple components carried by multiple separate auxiliary pictures of type AUX_ALPHA according to at least one embodiment.
- This decoding process 800 is very similar to the decoding process 500 of figure 5.
- the differences lie mainly in an additional iteration to retrieve the separate attenuation map components according to ami_comp_number, the application of the separate attenuation map components in steps 870 and 880 to corresponding components of the decoded picture, and the combination of the resulting components in step 885.
- This principle of using multiple components carried by separate auxiliary pictures can be adapted to use multiple separate auxiliary pictures of type AUX_ATTENUATION. In such an embodiment, steps 820 and 823 are not present anymore.
- the use of the attenuation map is disabled if the expected quality for the end device is higher than the quality given by ami_video_quality.
- a given energy reduction rate R (i.e., corresponding to an expected energy reduction rate) is checked against the transmitted ami_energy_reduction_rate. If the rates match, the corresponding attenuation map is applied to the decoded picture. If the energy reduction rate R is lower than the transmitted ami_energy_reduction_rate, a new attenuation map corresponding to R is inferred by extrapolating this new attenuation map from the transmitted attenuation map according to the process described by ami_map_approximation_model.
- An example of such a process can be a simple linear scaling of the attenuation map.
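As a non-normative sketch, such a linear scaling could look as follows, under the assumption that the map is normalized so that a value of 1.0 means no attenuation:

```python
def scale_attenuation_map(att_map, r_tx, r_target):
    """Approximate the map for a lower target rate by linearly scaling
    the per-pixel attenuation amount of the transmitted map (values
    assumed normalized to [0, 1], 1.0 meaning no attenuation)."""
    attenuation = 1.0 - att_map                 # per-pixel reduction amount
    return 1.0 - attenuation * (r_target / r_tx)
```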
- a given energy reduction rate R is checked against rates R1 and R2. If this rate corresponds to one of rates R1 and R2, the corresponding processed attenuation map (as described in the previous embodiment) is applied on the decoded picture. If the energy reduction rate R is such that R1 < R < R2, a new attenuation map corresponding to R is inferred by interpolating this new attenuation map from the decoded attenuation maps corresponding to R1 and R2 according to the process described by ami_map_approximation_model.
- An example of such a process can be a pixel-wise linear or a bicubic interpolation between the two attenuation maps corresponding to R1 and R2. If R is larger than both R1 and R2, an extrapolation of the attenuation map by linear scaling from the largest energy reduction rate can also be envisioned, but with no guarantee on the resulting quality of the reduced picture. This embodiment is easily extended to more than two transmitted attenuation maps.
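A pixel-wise linear interpolation of this kind could be sketched as below; the bicubic variant, and the exact behaviour selected by ami_map_approximation_model, are outside the scope of this illustration.

```python
def interpolate_attenuation_map(map_r1, map_r2, r1, r2, r):
    """Pixel-wise linear interpolation between two decoded attenuation
    maps transmitted for rates r1 and r2, for a target rate r1 < r < r2."""
    t = (r - r1) / (r2 - r1)              # interpolation weight in (0, 1)
    return (1.0 - t) * map_r1 + t * map_r2
```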
- the use of the attenuation map is disabled for some content pictures depending on: the image category (e.g., sport images, gaming images, etc.), the display settings (e.g., cinema mode, etc.), etc.
- the method is disabled for specific content for which the energy reduction will not be significant. For example, dark content would lead to very low energy reduction whatever the technique.
- the total amount of luminance per picture is computed and, when lower than a given threshold, the energy reduction method is disabled. Alternatively, this might be disabled per GOP, per shot, per movie.
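A minimal sketch of such a check is given below; the threshold value and the use of the mean luma (rather than a sum or another statistic) are arbitrary choices of this illustration.

```python
import numpy as np

def energy_reduction_enabled(luma_plane, threshold=32.0):
    """Disable the energy reduction method for dark content by comparing
    the picture's mean luma with a hypothetical threshold."""
    return float(np.mean(luma_plane)) >= threshold
```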
- an additional check is added to verify that a pixel spatially belongs to a subset of the image to be processed, for example belonging to a region where the energy reduction is not desired.
- such a region or mask can be based on, for example, a spatiotemporal just noticeable difference (JND) map, a motion field, a saliency map, gaze tracking information or other pixel-wise information.
- in that case, the attenuation map is not applied to this pixel. This check can be done before or after the upsampling of the attenuation map, if any.
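For illustration, applying the attenuation only outside such an exclusion region could be sketched as follows (the boolean mask and the multiplicative application are assumptions of this sketch):

```python
import numpy as np

def apply_attenuation_with_mask(component, att_map, exclude_mask):
    """Attenuate a picture component pixel-wise, except where the
    boolean exclude_mask (e.g., derived from a JND or saliency map)
    protects the pixel from any energy reduction."""
    out = component.astype(np.float32) * att_map
    out[exclude_mask] = component[exclude_mask]  # protected pixels untouched
    return out.astype(component.dtype)
```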
- post-processing operations are taken into account while building the attenuation map. For example, this may be done in the process 400 by introducing an additional step between steps 420 and 430 to add a post-processing operation on the decoded picture before building the attenuation map.
- the attenuation map should be applied after all the post-processing operations.
- in some cases, these post-processing operations intervene independently of the encoding-decoding process; in this case, the attenuation map should be adapted to take these post-processing operations into account, before applying it to the decoded and post-processed picture.
- At least one embodiment further comprises checking that the color gamut of the decoded picture corresponds to the color gamut used while computing the attenuation map. This can be done by sending the color gamut used for the attenuation map's computation together with the metadata.
- on backlight displays, the attenuation map cannot be applied directly. However, it is possible to use the attenuation map to provide guidance to a backlight scaling algorithm. Indeed, for backlight displays, a strong contributor to the energy consumption of the display is the backlight.
- an attenuation map is applied on transmissive pixel displays by determining the minimal value, average value, or any other global value from the attenuation map and using this information to guide the backlight of the transmissive pixel displays.
- the attenuation map could first be split into regions corresponding to the local dimming regions of the display, before determining the minimal value, average value, or any other global value from each region and using this information to guide the backlight.
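A sketch of this per-region guidance follows; it uses the per-region mean of the attenuation map, one of the global values mentioned above, and assumes a regular grid of local dimming zones.

```python
import numpy as np

def backlight_scales(att_map, regions_y, regions_x):
    """Derive one backlight scaling factor per local dimming region
    from the attenuation map (here: the per-region mean; the minimum
    or any other global value could be used instead)."""
    h, w = att_map.shape
    scales = np.empty((regions_y, regions_x), dtype=np.float32)
    for i in range(regions_y):
        for j in range(regions_x):
            block = att_map[i * h // regions_y:(i + 1) * h // regions_y,
                            j * w // regions_x:(j + 1) * w // regions_x]
            scales[i, j] = float(block.mean())
    return scales
```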
- Figure 9 illustrates examples of sequence diagrams representing the information exchange between a transmitter and a receiver according to further embodiments. Indeed, the embodiments described above assume that no signaling mechanism exists from the receiver to the transmitter. In further embodiments, it is considered that a signaling mechanism exists from the receiver to the transmitter, in other words, a return channel is available.
- information such as the display type and/or a given energy reduction rate for example, can be transmitted back, in step 920, from the receiver to the transmitter.
- the transmitter takes such information into account to provide, in step 930, a visual content and corresponding metadata that better matches the information obtained from the receiver. For example, it may compute a specific attenuation map with an energy reduction rate provided by the receiver. It may adapt already precomputed attenuation maps to a specific energy reduction rate. It may compute a specific attenuation map corresponding to a video quality level provided by the receiver. It may compute a specific attenuation map corresponding to a display model provided by the receiver.
- the GMDA information may be adapted to take into account the specific information and constraints obtained from the receiver.
- the receiver provides information related to the energy reduction application to the transmitter using the syntax described in Table 22. As stated above, such information is identified by the ‘ami’ prefix and added to the information related to display power reduction as defined in ISO/IEC 23001-11. In this table, only one of ami_energy_reduction_rate or ami_video_quality is required. The other is optional and may be omitted.
- the transmitter does not need to send back the GMDA information as described in Table 1 but only a subset as shown in Table 23.
- the receiver may select one of the ami_map_id values, in step 921.
- the syntax from the receiver to the transmitter is modified as shown in Table 24, with an ami_map_id corresponding for example to a selected reduction rate.
- the transmitter provides to the receiver the modified GMDA information as shown in table 25.
- the information corresponds to the selected ami_map_id.
- the receiver requests all information related to several ami_map_id (for example, it needs three auxiliary pictures to reconstruct one full Attenuation Map).
- the syntax from the receiver to the transmitter is shown in Table 26.
- the transmitter provides to the receiver the modified GMDA information as shown in table 27.
- the information corresponds to the attenuation maps that correspond to the list of ami_map_id sent by the receiver.
- the receiver display is of type OLED and therefore does not require any information related to backlight adaptation.
- the first part of the table 27 related to conventional display adaptation functions is discarded and the syntax is modified as shown in Table 28.
- the transmitter provides to the receiver the modified GMDA information as shown in Table 29, which no longer comprises the information related to backlight adaptation.
- the decoding processes 500 of figure 5, 700 of figure 7 and 800 of figure 8 are implemented for example by a decoder 200 of figure 2, by a processor 1010 in a device 1000 of figure 3 or by various electronic devices such as smartphones, tablet computers, digital multimedia set top boxes, digital television receivers, personal video recording systems, connected home appliances, personal computers, laptop computers, and servers.
- At least one example of an embodiment can involve a device including an apparatus as described herein and at least one of (i) an antenna configured to receive a signal, the signal including data representative of the image information, (ii) a band limiter configured to limit the received signal to a band of frequencies that includes the data representative of the image information, and (iii) a display configured to display an image from the image information.
- At least one example of an embodiment can involve a device as described herein, wherein the device comprises one of a television, a television signal receiver, a set-top box, a gateway device, a mobile device, a cell phone, a tablet, a computer, a laptop, or other electronic device.
- the device comprises one of a television, a television signal receiver, a set-top box, a gateway device, a mobile device, a cell phone, a tablet, a computer, a laptop, or other electronic device.
- another example of an embodiment can involve a bitstream or signal formatted to include syntax elements and picture information, wherein the syntax elements are produced, and the picture information is encoded by processing based on any one or more of the examples of embodiments of methods in accordance with the present disclosure.
- one or more other examples of embodiments can also provide a computer readable storage medium, e.g., a non-volatile computer readable storage medium, having stored thereon instructions for encoding or decoding picture information such as video data according to the methods or the apparatus described herein.
- a computer readable storage medium having stored thereon a bitstream generated according to methods or apparatus described herein.
- One or more embodiments can also provide methods and apparatus for transmitting or receiving a bitstream or signal generated according to methods or apparatus described herein.
- Decoding can encompass all or part of the processes performed, for example, on a received encoded sequence in order to produce a final output suitable for display.
- such processes include one or more of the processes typically performed by a decoder, for example, entropy decoding, inverse quantization, inverse transformation, and differential decoding.
- such processes also, or alternatively, include processes performed by a decoder of various implementations described in this application.
- in various embodiments, decoding refers only to entropy decoding, in other embodiments only to differential decoding, and in still other embodiments to a combination of entropy decoding and differential decoding.
- encoding can encompass all or part of the processes performed, for example, on an input video sequence in order to produce an encoded bitstream.
- such processes include one or more of the processes typically performed by an encoder, for example, partitioning, differential encoding, transformation, quantization, and entropy encoding.
- in various embodiments, encoding refers only to entropy encoding, in other embodiments only to differential encoding, and in still other embodiments to a combination of differential encoding and entropy encoding.
- syntax elements as used herein are descriptive terms. As such, they do not preclude the use of other syntax element names.
- the examples of embodiments, implementations, features, etc., described herein can be implemented in, for example, a method or a process, an apparatus, a software program, a data stream, or a signal. Even if only discussed in the context of a single form of implementation (for example, discussed only as a method), the implementation of features discussed can also be implemented in other forms (for example, an apparatus or program).
- An apparatus can be implemented in, for example, appropriate hardware, software, and firmware.
- One or more examples of methods can be implemented in, for example, a processor, which refers to processing devices in general, including, for example, a computer, a microprocessor, an integrated circuit, or a programmable logic device.
- Processors also include communication devices, such as, for example, computers, cell phones, portable/personal digital assistants ("PDAs"), and other devices that facilitate communication of information between end-users.
- processors are intended to broadly encompass various configurations of one processor or more than one processor.
- references to “one embodiment” or “an embodiment” or “one implementation” or “an implementation”, as well as other variations thereof, mean that a particular feature, structure, characteristic, and so forth described in connection with the embodiment is included in at least one embodiment.
- the appearances of the phrase “in one embodiment” or “in an embodiment” or “in one implementation” or “in an implementation”, as well any other variations, appearing in various places throughout this application are not necessarily all referring to the same embodiment.
- this application may refer to “determining” various pieces of information. Determining the information can include one or more of, for example, estimating the information, calculating the information, predicting the information, or retrieving the information from memory. Further, this application may refer to “accessing” various pieces of information. Accessing the information can include one or more of, for example, receiving the information, retrieving the information (for example, from memory), storing the information, moving the information, copying the information, calculating the information, determining the information, predicting the information, or estimating the information.
- this application may refer to “receiving” various pieces of information.
- Receiving is, as with “accessing”, intended to be a broad term.
- Receiving the information can include one or more of, for example, accessing the information, or retrieving the information (for example, from memory).
- “receiving” is typically involved, in one way or another, during operations such as, for example, storing the information, processing the information, transmitting the information, moving the information, copying the information, erasing the information, calculating the information, determining the information, predicting the information, or estimating the information.
- phrasing such as “at least one of A, B, and C” is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of the third listed option (C) only, or the selection of the first and the second listed options (A and B) only, or the selection of the first and third listed options (A and C) only, or the selection of the second and third listed options (B and C) only, or the selection of all three options (A and B and C).
- This may be extended, as is clear to one of ordinary skill in this and related arts, for as many items as are listed.
- implementations can produce a variety of signals formatted to carry information that can be, for example, stored or transmitted.
- the information can include, for example, instructions for performing a method, or data produced by one of the described implementations.
- a signal can be formatted to carry the bitstream of a described embodiment.
- Such a signal can be formatted, for example, as an electromagnetic wave (for example, using a radio frequency portion of spectrum) or as a baseband signal.
- the formatting can include, for example, encoding a data stream and modulating a carrier with the encoded data stream.
- the information that the signal carries can be, for example, analog or digital information.
- the signal can be transmitted over a variety of different wired or wireless links, as is known.
- the signal can be stored on a processor- readable medium.
Abstract
A new metadata associated with a visual content is related to the use of a pixel-wise attenuation map dedicated to the reduction of the energy consumption when using the visual content, for example when rendering it on a display. Information is provided about the types of displays compatible with the use of the attenuation map, the type of pre-processing (ex: up-sampling), the type of operation to use for the application of the attenuation map, metrics of the expected energy reduction and the expected quality impact of the use of such an attenuation map. These parameters are carried through MPEG green metadata display adaptation syntax elements. The pixel-wise attenuation map for one image of the video may be carried over as an auxiliary image and encoded conventionally. Encoding and decoding methods and devices are described.
Description
METHOD AND DEVICE FOR ENERGY REDUCTION OF VISUAL CONTENT BASED ON ATTENUATION MAP USING MPEG DISPLAY ADAPTATION
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to European Application N° 23305566.4, filed 14 April 2023, and European Application N° 23306068.0, filed 29 June 2023, which are incorporated herein by reference in their entirety.
TECHNICAL FIELD
The disclosure is in the field of video compression, and at least one embodiment relates more specifically to encoding and decoding a video comprising attenuation map information and corresponding parameters carried through MPEG green metadata display adaptation syntax elements, the application of the attenuation map allowing a reduction of the energy consumption when using the video, for example when rendering it on a display.
BACKGROUND ART
Reducing energy consumption of electronic devices has become a requirement not only for manufacturers of electronic devices but also to limit, as much as possible, the environmental impact and to contribute to the emergence of a sustainable display industry. The increase in display resolution from SD to HD, then to 4K and soon to 8K and beyond, as well as the introduction of high dynamic range imaging, has brought about a corresponding increase in the energy requirements of display devices. This is not consistent with the global need to reduce energy consumption, knowing that a huge number of devices have a display (i.e., TVs, mobile phones, tablets, etc.). Indeed, displays are the most important source of energy consumption for consumer electronic devices, either battery-powered (e.g., smartphones, tablets, head-mounted displays, car display screens) or not (e.g., television sets, advertisement display panels).
Different display technologies have been developed in recent years. Although modern displays consume energy in a more controllable and efficient manner than older displays, they remain the most important source of energy consumption in a video chain.
As far as backlight displays are concerned, their energy consumption is largely determined by the intensity of the backlight.
Organic Light Emitting Diode (OLED) is one example of a display technology that is finding increasingly widespread use because of numerous advantages compared to former technologies such as Thin-Film Transistor Liquid Crystal Displays (TFT-LCDs). Rather than using a uniform backlight, OLED displays, as well as mini-LED displays, are composed of individual directly emissive image pixels. OLED power consumption is therefore highly correlated with the image content, and the power consumption for a given input image can be estimated by considering the values of the displayed image pixels. Although OLED displays consume energy in a more controllable and efficient manner, they are still the most important source of energy consumption in the video chain.
ISO/IEC 23001-11 specifies specific metadata, so-called Green Metadata, that enables the reduction of energy usage during media consumption, specifically at the display side. The metadata for display adaptation as defined in this specification are designed for a specific display and are particularly well tailored to transmissive display technologies embedding backlight illumination such as LCD displays. These metadata are designed to attain display energy reductions by using display adaptation techniques. They are composed of metrics made of RGB-component statistics and quality indicators of the video content. They can be used to perform RGB picture components rescaling to set the best compromise between backlight/voltage reduction and picture quality. Since the ISO/IEC 23001-11 document was published, new emissive technologies have been introduced with the spread of emissive OLED displays, which allow a pixel-wise and more efficient control of their energy consumption, and consequently, a reduction of their energy consumption.
While the metadata already defined in the standard convey information for reduction of the energy consumed by displays, they also have the following drawbacks. First, they are tailored for backlit display technologies and therefore convey global information on statistics derived from the input content, and do not provide any information related to a pixel-wise attenuation map. Second, such global information is far from optimal when applied to more controllable directly emissive displays, for which it would be possible to apply energy reduction at the pixel level, allowing more precise control both on the energy reduction and quality of experience.
SUMMARY
In general, at least one example of an embodiment involves a new metadata associated with a visual content and related to the use of an attenuation map dedicated to the reduction of the energy consumption when using the visual content, for example when rendering it on a display. Information about the types of displays compatible with the use of the attenuation map, the type of pre-processing (ex: up-sampling), the type of operation to use for the application of the attenuation map, metrics of the expected energy reduction and on the expected quality impact of the use of such an attenuation map are provided. These parameters are carried through MPEG green metadata display adaptation syntax elements. The attenuation map for one image of the video may be carried over as an auxiliary image of type AUX_ALPHA (or of a specific type AUX_ATTENUATION) and encoded conventionally. In at least one embodiment, the attenuation map is a pixel-wise attenuation map. Encoding and decoding methods and devices are described.
A first aspect is directed to a method comprising obtaining encoded data comprising at least an image, an attenuation map and a set of parameters, wherein the set of parameters comprises at least a first parameter representative of an operation for applying the attenuation map to an image, and a second parameter representative of a mapping between components of the attenuation map and image components affected by the operation, applying the attenuation map to the image to reduce values of components of the image by performing an operation based on the first parameter on components of the image selected based on the second parameter; and providing an attenuated image, wherein the parameters are carried through MPEG green metadata display adaptation syntax elements.
A second aspect is directed to a method comprising obtaining an input image of a video, determining an attenuation map based on the input image according to a selected energy reduction rate, wherein applying the attenuation map to the input image reduces values of components of the input image, generating an encoded video comprising at least the input image, the attenuation map and a set of parameters, wherein the set of parameters comprises at least a first parameter representative of an operation for applying the attenuation map to an image, and a second parameter representative of a mapping between components of the attenuation map and image components affected by the operation, wherein the parameters are carried through MPEG green metadata display adaptation syntax elements.
A third aspect is directed to a device comprising a processor configured to obtain encoded data comprising at least an image, an attenuation map and a set of parameters, wherein the set of parameters comprises at least a first parameter representative of an operation
for applying the attenuation map to an image, and a second parameter representative of a mapping between components of the attenuation map and image components affected by the operation, apply the attenuation map to the image to reduce values of components of the image by performing an operation based on the first parameter on components of the image selected based on the second parameter; and provide an attenuated image, wherein the parameters are carried through MPEG green metadata display adaptation syntax elements.
A fourth aspect is directed to a device comprising a processor configured to obtain an input image of a video, determine an attenuation map based on the input image according to a selected energy reduction rate, wherein applying the attenuation map to the input image reduces values of components of the input image, generate an encoded video comprising at least the input image, the attenuation map and a set of parameters, wherein the set of parameters comprises at least a first parameter representative of an operation for applying the attenuation map to an image, and a second parameter representative of a mapping between components of the attenuation map and image components affected by the operation, wherein the parameters are carried through MPEG green metadata display adaptation syntax elements.
A fifth aspect is directed to a non-transitory computer readable medium containing data content generated according to the second aspect.
A sixth aspect is directed to non-transitory computer readable medium containing comprising instructions which, when the program is executed by a computer, cause the computer to carry out the described embodiments related to the first and second aspect.
A seventh aspect is directed to a computer program comprising instructions which, when the program is executed by a computer, cause the computer to carry out any of the described embodiments or variants related to the first and second aspect.
The above presents a simplified summary of the subject matter in order to provide a basic understanding of some aspects of the present disclosure. This summary is not an extensive overview of the subject matter. It is not intended to identify key/critical elements of the embodiments or to delineate the scope of the subject matter. Its sole purpose is to present some concepts of the subject matter in a simplified form as a prelude to the more detailed description provided below.
BRIEF SUMMARY OF THE DRAWINGS
The present disclosure may be better understood by consideration of the detailed description below in conjunction with the accompanying figures in which:
Figure 1 illustrates a block diagram of a video encoder according to an embodiment.
Figure 2 illustrates a block diagram of a video decoder according to an embodiment.
Figure 3 illustrates a block diagram of an example of a system in which various aspects and embodiments are implemented.
Figure 4 illustrates flowcharts of two examples of video encoding process using attenuation map information carried by an auxiliary picture of type AUX_ALPHA according to at least one embodiment.
Figure 5 illustrates a flowchart of an example of video decoding process using attenuation map information carried by an auxiliary picture of type AUX_ALPHA according to at least one embodiment.
Figure 6 illustrates flowcharts of two examples of video encoding process using attenuation map information carried by an auxiliary picture of type AUX_ATTENUATION according to at least one embodiment.
Figure 7 illustrates a flowchart of an example of video decoding process using attenuation map information carried by an auxiliary picture of type AUX_ATTENUATION according to at least one embodiment.
Figure 8 illustrates a flowchart of an example of video decoding process using attenuation map information based on multiple components carried by multiple separate auxiliary pictures of type AUX_ALPHA according to at least one embodiment.
Figure 9 illustrates examples of sequence diagrams representing the information exchange between a transmitter and a receiver according to further embodiments.
It should be understood that the drawings are for purposes of illustrating examples of various aspects, features and embodiments in accordance with the present disclosure and are not necessarily the only possible configurations. Throughout the various figures, like reference designators refer to the same or similar features.
DETAILED DESCRIPTION
The present aspects, although describing principles related to particular drafts of VVC (Versatile Video Coding) or to HEVC (High Efficiency Video Coding) specifications, are not limited to VVC or HEVC, and can be applied, for example, to other standards and recommendations, whether pre-existing or future-developed, and extensions of any such standards and recommendations (including VVC and HEVC). Unless indicated otherwise, or
technically precluded, the aspects described in this application can be used individually or in combination.
Figure 1 illustrates a block diagram of a video encoder according to an embodiment. Variations of this encoder 100 are contemplated, but the encoder 100 is described below for purposes of clarity without describing all expected variations. Before being encoded, the video sequence may go through pre-encoding processing (101), for example, applying a color transform to the input color picture (e.g., conversion from RGB 4:4:4 to YCbCr 4:2:0), or performing a remapping of the input picture components in order to get a signal distribution more resilient to compression (for instance using a histogram equalization of one of the color components). Metadata can be associated with the pre-processing and attached to the bitstream.
In the encoder 100, a picture is encoded by the encoder elements as described below. The picture to be encoded is partitioned (102) and processed in units of, for example, CUs. Each unit is encoded using, for example, either an intra or inter mode. When a unit is encoded in an intra mode, it performs intra prediction (160). In an inter mode, motion estimation (175) and compensation (170) are performed. The encoder decides (105) which one of the intra mode or inter mode to use for encoding the unit, and indicates the intra/inter decision by, for example, a prediction mode flag. Prediction residuals are calculated, for example, by subtracting (110) the predicted block from the original image block.
The prediction residuals are then transformed (125) and quantized (130). The quantized transform coefficients, as well as motion vectors and other syntax elements, are entropy coded (145) to output a bitstream. The encoder can skip the transform and apply quantization directly to the non-transformed residual signal. The encoder can bypass both transform and quantization, i.e., the residual is coded directly without the application of the transform or quantization processes.
The encoder decodes an encoded block to provide a reference for further predictions. The quantized transform coefficients are de-quantized (140) and inverse transformed (150) to decode prediction residuals. Combining (155) the decoded prediction residuals and the predicted block, an image block is reconstructed. In-loop filters (165) are applied to the reconstructed picture to perform, for example, deblocking/SAO (Sample Adaptive Offset), Adaptive Loop-Filter (ALF) filtering to reduce encoding artifacts. The filtered image is stored at a reference picture buffer (180).
Figure 2 illustrates a block diagram of a video decoder according to an embodiment. In the decoder 200, a bitstream is decoded by the decoder elements as described below. Video decoder 200 generally performs a decoding pass reciprocal to the encoding pass. The encoder 100 also generally performs video decoding as part of encoding video data. In particular, the input of the decoder includes a video bitstream, which can be generated by video encoder 100. The bitstream is first entropy decoded (230) to obtain transform coefficients, motion vectors, and other coded information. The picture partition information indicates how the picture is partitioned. The decoder may therefore divide (235) the picture according to the decoded picture partitioning information. The transform coefficients are de-quantized (240) and inverse transformed (250) to decode the prediction residuals. Combining (255) the decoded prediction residuals and the predicted block, an image block is reconstructed. The predicted block can be obtained (270) from intra prediction (260) or motion-compensated prediction (i.e., inter prediction) (275). In-loop filters (265) are applied to the reconstructed image. The filtered image is stored at a reference picture buffer (280).
The decoded picture can further go through post-decoding processing (285), for example, an inverse color transform (e.g. conversion from YCbCr 4:2:0 to RGB 4:4:4) or an inverse remapping performing the inverse of the remapping process performed in the pre-encoding processing (101). The post-decoding processing can use metadata derived in the pre-encoding processing and signaled in the bitstream.
Figure 3 illustrates a block diagram of an example of a system in which various aspects and embodiments are implemented. System 1000 can be embodied as a device including the various components described below and is configured to perform one or more of the aspects described in this document. Examples of such devices include, but are not limited to, various electronic devices such as personal computers, laptop computers, smartphones, tablet computers, digital multimedia set top boxes, digital television receivers, personal video recording systems, connected home appliances, and servers. Elements of system 1000, singly or in combination, can be embodied in a single integrated circuit (IC), multiple ICs, and/or discrete components. For example, in at least one embodiment, the processing and encoder/decoder elements of system 1000 are distributed across multiple ICs and/or discrete components. In various embodiments, the system 1000 is communicatively coupled to one or more other systems, or other electronic devices, via, for example, a
communications bus or through dedicated input and/or output ports. In various embodiments, the system 1000 is configured to implement one or more of the aspects described in this document.
The system 1000 includes at least one processor 1010 configured to execute instructions loaded therein for implementing, for example, the various aspects described in this document. The processor 1010 may be a general-purpose processor, a special purpose processor, a conventional processor, a digital signal processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, Application Specific Integrated Circuits (ASICs), Field Programmable Gate Array (FPGAs) circuits, any other type of integrated circuit (IC), a state machine, and the like. The processor 1010 can include embedded memory, input output interface, and various other circuitries as known in the art. The system 1000 includes at least one memory 1020 (e.g., a volatile memory device, and/or a non-volatile memory device). System 1000 includes a storage device 1040, which can include non-volatile memory and/or volatile memory, including, but not limited to, Electrically Erasable Programmable Read-Only Memory (EEPROM), Read-Only Memory (ROM), Programmable Read-Only Memory (PROM), Random Access Memory (RAM), Dynamic Random Access Memory (DRAM), Static Random Access Memory (SRAM), flash, magnetic disk drive, and/or optical disk drive. The storage device 1040 can include an internal storage device, an attached storage device (including detachable and non-detachable storage devices), and/or a network accessible storage device, as non-limiting examples.
System 1000 includes an encoder/decoder module 1030 configured, for example, to process data to provide an encoded video or decoded video, and the encoder/decoder module 1030 can include its own processor and memory. The encoder/decoder module 1030 represents module(s) that can be included in a device to perform the encoding and/or decoding functions. As is known, a device can include one or both of the encoding and decoding modules. Additionally, encoder/decoder module 1030 can be implemented as a separate element of system 1000 or can be incorporated within processor 1010 as a combination of hardware and software as known to those skilled in the art.
Program code to be loaded onto processor 1010 or encoder/decoder 1030 to perform the various aspects described in this document can be stored in storage device 1040 and subsequently loaded onto memory 1020 for execution by processor 1010. In accordance with various embodiments, one or more of processor 1010, memory 1020, storage device 1040,
and encoder/decoder module 1030 can store one or more of various items during the performance of the processes described in this document. Such stored items can include, but are not limited to, the input video, the decoded video or portions of the decoded video, the bitstream, matrices, variables, and intermediate or final results from the processing of equations, formulas, operations, and operational logic.
In embodiments, memory inside of the processor 1010 and/or the encoder/decoder module 1030 is used to store instructions and to provide working memory for processing that is needed during encoding or decoding. In other embodiments, however, a memory external to the processing device (for example, the processing device can be either the processor 1010 or the encoder/decoder module 1030) is used for one or more of these functions. The external memory can be the memory 1020 and/or the storage device 1040, for example, a dynamic volatile memory and/or a non-volatile flash memory. In several embodiments, an external non-volatile flash memory is used to store the operating system of, for example, a television. In at least one embodiment, a fast external dynamic volatile memory such as a RAM is used as working memory for video coding and decoding operations, such as for MPEG-2 (MPEG refers to the Moving Picture Experts Group, MPEG-2 is also referred to as ISO/IEC 13818, and 13818-1 is also known as H.222, and 13818-2 is also known as H.262), HEVC (HEVC refers to High Efficiency Video Coding, also known as H.265 and MPEG-H Part 2), or VVC (Versatile Video Coding, a new standard being developed by JVET, the Joint Video Experts Team).
The input to the elements of system 1000 can be provided through various input devices as indicated in block 1130. Such input devices include, but are not limited to, (i) a radio frequency (RF) portion that receives an RF signal transmitted, for example, over the air by a broadcaster, (ii) a Component (COMP) input terminal (or a set of COMP input terminals), (iii) a Universal Serial Bus (USB) input terminal, and/or (iv) a High-Definition Multimedia Interface (HDMI) input terminal. Other examples, not shown in Figure 3, include composite video.
In various embodiments, the input devices of block 1130 have associated respective input processing elements as known in the art. For example, the RF portion can be associated with elements suitable for (i) selecting a desired frequency (also referred to as selecting a signal, or band-limiting a signal to a band of frequencies), (ii) downconverting the selected signal, (iii) band-limiting again to a narrower band of frequencies to select (for example) a signal frequency band which can be referred to as a channel in certain embodiments, (iv)
demodulating the downconverted and band-limited signal, (v) performing error correction, and (vi) demultiplexing to select the desired stream of data packets. The RF portion of various embodiments includes one or more elements to perform these functions, for example, frequency selectors, signal selectors, band-limiters, channel selectors, filters, downconverters, demodulators, error correctors, and demultiplexers. The RF portion can include a tuner that performs various of these functions, including, for example, downconverting the received signal to a lower frequency (for example, an intermediate frequency or a near-baseband frequency) or to baseband. In one set-top box embodiment, the RF portion and its associated input processing element receives an RF signal transmitted over a wired (for example, cable) medium, and performs frequency selection by filtering, downconverting, and filtering again to a desired frequency band. Various embodiments rearrange the order of the above-described (and other) elements, remove some of these elements, and/or add other elements performing similar or different functions. Adding elements can include inserting elements in between existing elements, such as, for example, inserting amplifiers and an analog-to-digital converter. In various embodiments, the RF portion includes an antenna.
Additionally, the USB and/or HDMI terminals can include respective interface processors for connecting system 1000 to other electronic devices across USB and/or HDMI connections. It is to be understood that various aspects of input processing, for example, Reed-Solomon error correction, can be implemented, for example, within a separate input processing IC or within processor 1010 as necessary. Similarly, aspects of USB or HDMI interface processing can be implemented within separate interface ICs or within processor 1010 as necessary. The demodulated, error corrected, and demultiplexed stream is provided to various processing elements, including, for example, processor 1010, and encoder/decoder 1030 operating in combination with the memory and storage elements to process the datastream as necessary for presentation on an output device.
Various elements of system 1000 can be provided within an integrated housing. Within the integrated housing, the various elements can be interconnected and transmit data therebetween using suitable connection arrangement 1140, for example, an internal bus as known in the art, including the Inter-IC (I2C) bus, wiring, and printed circuit boards.
The system 1000 includes communication interface 1050 that enables communication with other devices via communication channel 1060. The communication interface 1050 can include, but is not limited to, a transceiver configured to transmit and to receive data over communication channel 1060. The communication interface 1050 can include, but is not
limited to, a modem or network card and the communication channel 1060 can be implemented, for example, within a wired and/or a wireless medium.
Data is streamed, or otherwise provided, to the system 1000, in various embodiments, using a wireless network such as a Wi-Fi network, for example IEEE 802.11 (IEEE refers to the Institute of Electrical and Electronics Engineers). The Wi-Fi signal of these embodiments is received over the communications channel 1060 and the communications interface 1050 which are adapted for Wi-Fi communications. The communications channel 1060 of these embodiments is typically connected to an access point or router that provides access to external networks including the Internet for allowing streaming applications and other over-the-top communications. Other embodiments provide streamed data to the system 1000 using a set-top box that delivers the data over the HDMI connection of the input block 1130. Still other embodiments provide streamed data to the system 1000 using the RF connection of the input block 1130. As indicated above, various embodiments provide data in a non-streaming manner. Additionally, various embodiments use wireless networks other than Wi-Fi, for example a cellular network or a Bluetooth network.
The system 1000 can provide an output signal to various output devices, including a display 1100, speakers 1110, and other peripheral devices 1120. The display 1100 of various embodiments includes one or more of, for example, a touchscreen display, an organic light-emitting diode (OLED) display, a curved display, and/or a foldable display. The display 1100 can be for a television, a tablet, a laptop, a cell phone (mobile phone), or other devices. The display 1100 can also be integrated with other components (for example, as in a smart phone), or separate (for example, an external monitor for a laptop). The other peripheral devices 1120 include, in various examples of embodiments, one or more of a stand-alone digital video disc (or digital versatile disc) (DVD, for both terms) player, a disk player, a stereo system, and/or a lighting system. Various embodiments use one or more peripheral devices 1120 that provide a function based on the output of the system 1000. For example, a disk player performs the function of playing the output of the system 1000.
In various embodiments, control signals are communicated between the system 1000 and the display 1100, speakers 1110, or other peripheral devices 1120 using signaling such as AV.Link, Consumer Electronics Control (CEC), or other communications protocols that enable device-to-device control with or without user intervention. The output devices can be communicatively coupled to system 1000 via dedicated connections through respective interfaces 1070, 1080, and 1090. Alternatively, the output devices can be connected to system
1000 using the communications channel 1060 via the communications interface 1050. The display 1100 and speakers 1110 can be integrated in a single unit with the other components of system 1000 in an electronic device such as, for example, a television. In various embodiments, the display interface 1070 includes a display driver, such as, for example, a timing controller (T Con) chip.
The display 1100 and speaker 1110 can alternatively be separate from one or more of the other components, for example, if the RF portion of input 1130 is part of a separate set- top box. In various embodiments in which the display 1100 and speakers 1110 are external components, the output signal can be provided via dedicated output connections, including, for example, HDMI ports, USB ports, or COMP outputs.
The embodiments can be carried out by computer software implemented by the processor 1010 or by hardware, or by a combination of hardware and software. As a non-limiting example, the embodiments can be implemented by one or more integrated circuits. The memory 1020 can be of any type appropriate to the technical environment and can be implemented using any appropriate data storage technology, such as optical memory devices, magnetic memory devices, semiconductor-based memory devices, fixed memory, and removable memory, as non-limiting examples. The processor 1010 can be of any type appropriate to the technical environment, and can encompass one or more of microprocessors, general purpose computers, special purpose computers, and processors based on a multi-core architecture, as non-limiting examples.
Pixel-wise dimming processes that reduce the energy of images are usually implemented either at the receiver (e.g., as a post-processing operation) or at the server side (e.g., as a pre-processing operation), with the consequence of several limitations and drawbacks.
Part of this processing can be done independently of any information that is provided at the decoder or display side. Therefore, to be further in line with the goal of limiting the impact of the video chain on climate change, embodiments described herein propose to factorize the construction of the attenuation map only once, at the encoder side, and to send this map together with the content, in the form of an auxiliary picture, instead of multiplying the processing in each device or server of the transmission chain. In this case, accompanying metadata are also required in the encoded bitstream to provide additional information on its use at the receiver side.
Additionally, methods to compute attenuation maps can be costly. In this case, splitting the full process in two steps (e.g., first the computation of the attenuation map at the encoder side, and second, its use at the decoder side) can advantageously enable its use even for devices with low computational resources such as smartphones. Furthermore, some information, such as the display type, is useful to adapt the application of the attenuation map to the picture to display. This also suggests a process in two steps.
If done fully at the receiver or display side, the resulting images after energy reduction will not necessarily respect the content creator's intent, because some areas might be impacted by the application of a dimming map created without any control from the content creator, and the quality of experience could be degraded. On the other hand, if prepared during the content creation, content creators can verify that the processing is compliant with their quality of experience requirements and indicate areas that should not be affected by the processing.
Embodiments described hereafter have been designed with the foregoing in mind and propose to solve these issues by defining new metadata related to the use of a pixel-wise attenuation map dedicated to the reduction of the energy consumption at the receiver side when using, for example displaying, a visual content. For example, information about the types of displays compatible with the use of the attenuation map, the type of pre-processing (e.g., up-sampling) and operation to use for the application of the attenuation map, and indicative metrics of the expected energy reduction and of the expected quality impact of the use of such an attenuation map are provided using MPEG green metadata display adaptation syntax elements. The pixel-wise attenuation map for one image of the visual content may be carried as an auxiliary image and encoded conventionally. In at least one embodiment, the attenuation map is carried by an auxiliary picture of a specific type “AUX_ALPHA” conventionally used for alpha blending operations. In at least one embodiment, the attenuation map is carried by an auxiliary picture of a new specific type “AUX_ATTENUATION” dedicated to energy reduction of pictures.
The attenuation map is designed so that, when applied to an input image, it produces a modified image that requires less energy for display than the input image. One simple implementation is to scale down the luminance according to a selected energy reduction rate. More complex implementations take other parameters into account, such as the similarity between the modified image and the input image, the contrast sensitivity function of the human vision, or a smoothness characteristic that allows downscaling the attenuation map without introducing heavy artefacts when upscaling it on the decoder side.
Applying an attenuation map to an image is based on combining them together according to a selected type of operation. The values of the samples of the attenuation map and the type of operation are closely related. Indeed, when the operation is an addition, the attenuation map comprises negative sample values; when the operation is a subtraction, it comprises positive sample values; when the operation is a multiplication, it comprises floating point sample values in a range between zero (pixel becomes black) and one (no attenuation).
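As an illustration, the following sketch combines an attenuation map with a decoded image for three of these operation types. This is a minimal sketch, not part of the specification: the function name, the string-valued operation argument, and the 8-bit clipping are illustrative assumptions.

```python
import numpy as np

def apply_attenuation(image, attn_map, operation):
    """Combine an attenuation map with an image according to the
    selected operation type (illustrative sketch, 8-bit samples)."""
    img = image.astype(np.float64)
    if operation == "add":         # map carries negative values
        out = img + attn_map
    elif operation == "subtract":  # map carries positive values
        out = img - attn_map
    elif operation == "multiply":  # map carries floats in [0, 1]
        out = img * attn_map
    else:
        raise ValueError(f"unsupported operation: {operation}")
    return np.clip(out, 0.0, 255.0).astype(image.dtype)

# Example: a 50% multiplicative dimming of a mid-gray image.
image = np.full((4, 4), 128, dtype=np.uint8)
attn = np.full((4, 4), 0.5)
print(apply_attenuation(image, attn, "multiply"))  # every sample becomes 64
```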
The granularity of the signaled metadata can be based on time (i.e., data are signaled per period/duration of the video content), temporal layers (i.e., data are signaled per temporal layer), slice type (intra and inter slices), picture, or parts of the picture (slices, tiles, subpictures). The green metadata parameters related to the attenuation map are generated at the encoder. They are carried by an extension of the MPEG green metadata as specified in ISO/IEC 23001-11, and more particularly of its display adaptation section, by introducing additional syntax elements that define how to utilize the attenuation maps. This section of ISO/IEC 23001-11, hereafter named Green Metadata Display Adaptation (GMDA) information, is thus extended to indicate how to obtain and use at the receiver side a precomputed attenuation map (or dimming map) associated with the visual content. The encoder signals this information in the bitstream. The attenuation map is decoded and the GMDA information is obtained by the decoder, and the decoded image is modified according to the GMDA information such that the energy consumption is reduced.
In addition to the GMDA information, in the case where the attenuation map is carried by an auxiliary picture of a specific type “AUX_ALPHA” conventionally used for alpha blending purposes, a specific Alpha Channel Information SEI message (hereafter named ACI-SEI) related to the use of auxiliary pictures of type AUX_ALPHA indicates that the AUX_ALPHA auxiliary picture should not be used conventionally for alpha blending purposes.
Green metadata can be carried as specified in ISO/IEC 13818-1 or it can be carried in metadata tracks within the ISO base media file format (ISO/IEC 14496-12), as specified in ISO/IEC 23001-10. Using the format as described in the tables below, the transmitter sends a
message to the receiver such as a display device implementing the system 1000 of figure 3. In at least one embodiment, the GMDA information is inserted in the bitstream comprising the encoded video, for example in the program bitstream. The parameters of the GMDA information are applicable until the next GMDA information arrives, carrying new parameters.
Table 1 below gives a basic example of the metadata carried by the GMDA information according to embodiments. This table describes the attenuation map information, in other words, the parameters necessary to use an attenuation map sent as auxiliary picture in the bitstream. These parameters are located in the second part of the table below and start with the prefix ‘ami’. The first part of the table is related to conventional display adaptation functions not described herein.
In this example, it is considered that metadata are sent globally for the full bitstream, i.e., that they are shared for all auxiliary pictures carrying attenuation maps. These metadata can also be shared in a period, where the concept of period can correspond, for example, to a picture, an Intra period, a Group of Pictures (GOP), a number of pictures, a time duration, or the full bitstream. The GMDA message is then typically inserted per period. It is transmitted at the start of an upcoming period. The next message containing the metadata will be transmitted at the start of the next upcoming period. Therefore, when the upcoming period is a picture, a message will be transmitted for each picture. However, when the upcoming period is a specified time interval or a specified number of pictures, the associated message will be transmitted with the first picture in the time interval or with the first picture in the specified number of pictures.
In embodiments, part of the information related to the auxiliary picture can be sent globally for the full bitstream (e.g., information related to the display models compatible with the use of the attenuation maps) and other information can be sent with a different periodicity, e.g., for each picture. In embodiments, if the information is sent for more than one picture (e.g., at slice level, or GOP level), motion vectors corresponding to the decoded picture and included in the bitstream can be applied to the upsampled Attenuation Map before applying it to the decoded picture.
Table 1
The metadata of Table 1 related to energy reduction may be described as follows (the other parameters of the first part are the same as currently defined in ISO/IEC 23001-11). The ami_display_model parameter indicates on which type of display technology the attenuation map should be applied at the receiver side. This metadata is a bit field mask which indicates the display models on which the attenuation map sample values of the auxiliary picture should be applied, as shown in Table 2. For example, ami_display_model=”0011” means the attenuation map information can be used for both “Transmissive pixel” (i.e., backlit) and “Emissive pixel” display models.
Table 2
The ami_global_flag parameter indicates whether all the following information data have to be redefined for each decoded auxiliary picture or not. When this flag is equal to 0, it indicates that ami_attenuation_use_idc[ i ], ami_attenuation_comp_idc[ i ], ami_preprocessing_flag[ i ], ami_preprocessing_type_idc[ i ] and ami_preprocessing_scale_idc[ i ], for i = 0..ami_map_number, shall be present.
When this flag is equal to 1, it indicates that only ami_attenuation_use_idc[ 0 ], ami_attenuation_comp_idc[ 0 ], ami_preprocessing_flag[ 0 ], ami_preprocessing_type_idc[ 0 ] and ami_preprocessing_scale_idc[ 0 ] shall be present.
The ami_map_approximation_model parameter indicates which type of interpolation model should be used to infer an attenuation map for a different reduction rate than the one(s) corresponding to the attenuation map(s) sent in the metadata. When this parameter is equal to 0, a linear scaling of the attenuation map sample values of the provided auxiliary picture, given its respective ami_energy_reduction_rate, is considered to obtain corresponding attenuation map sample values for another energy reduction rate. In case several auxiliary pictures of type attenuation map are provided, the auxiliary picture with the lowest ami_energy_reduction_rate is used for the linear scaling. When this parameter is equal to 1, a bilinear interpolation between the attenuation map sample values of the provided auxiliary picture(s), given their respective ami_energy_reduction_rate, should be considered to obtain corresponding attenuation map sample values for another energy reduction rate. Similarly, other values of this parameter specify the use of other models for the interpolation, as summarized in Table 3 below.
Table 3
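A minimal sketch of the linear-scaling model (ami_map_approximation_model equal to 0) for a multiplicative attenuation map follows. The syntax does not spell out the exact scaling rule, so the proportional re-scaling of the per-pixel dimming amount shown here is an assumption, and the function and argument names are illustrative.

```python
import numpy as np

def approximate_map_linear(attn_map, signaled_rate, target_rate):
    """Derive a multiplicative attenuation map for target_rate from the
    map signaled for signaled_rate by linearly scaling the per-pixel
    dimming amount (1 - map). One plausible reading, not normative."""
    dimming = 1.0 - attn_map
    scaled = dimming * (target_rate / signaled_rate)
    return np.clip(1.0 - scaled, 0.0, 1.0)

# A map signaled for a 20% energy reduction, re-targeted to 10%.
m20 = np.full((2, 2), 0.8)
print(approximate_map_linear(m20, signaled_rate=20.0, target_rate=10.0))
```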
The ami_map_number parameter indicates the number of attenuation maps contained in the metadata. Indeed, it can be useful to send more than one map; the maps then serve as control points in a further interpolation process, in case the end user desires to use an attenuation map with a different reduction rate than the one(s) provided.
The ami_video_id parameter specifies the identifier of the program elementary stream that contains the video on which the Attenuation Maps should be applied.
The ami_map_id[i] parameter specifies the identifier of the program elementary stream that contains the auxiliary picture that corresponds to the Attenuation Map of index i.
The ami_energy_reduction_rate[i] parameter indicates, for a given attenuation map, the corresponding level of energy reduction that can be expected from the use of the attenuation map. The value of this parameter specifies the energy reduction rate expressed as a percentage or as a value in watts.
The ami_video_quality[i] parameter indicates, for a given attenuation map, the corresponding quality that can be expected after using the attenuation map to reduce the energy of the decoded image. ami_video_quality[i] can be PSNR, V-MAF or SSIM values, for example. Such quality metrics can be computed by the decoder but, for the sake of reducing the energy consumption, they could also be inferred at the encoder side. In this case, they could correspond to values of expected minimal quality.
The ami_max_value[i] parameter indicates the maximum value of the attenuation map of index i. Such a maximal value can optionally be used to further adjust the dynamic range of the encoded attenuation map in the scaling process.
The ami_attenuation_use_idc[i] parameter indicates which type of processing should be used to apply the transmitted attenuation map of index i on the decoded image to be displayed, as summarized in Table 4 below. For example, the attenuation map can be added to, subtracted from, multiplied with the decoded image, or used as a divisor of the decoded image, so that it reduces the level of the pixel values of the decoded image. When this parameter is equal to 0, the attenuation map sample values of the decoded auxiliary picture of index i should be added to one or more associated primary picture decoded sample(s) before being displayed on screen. This case implies that the values of the attenuation map are negative. When this parameter is equal to 1, the attenuation map sample values of the decoded auxiliary picture should be subtracted from one or more associated primary picture decoded sample(s). When this parameter is equal to 2, the attenuation map sample values of the decoded auxiliary picture should be multiplied by one or more associated primary picture decoded sample(s) before being displayed on screen. When this parameter is equal to 3, the decoded sample(s) should be divided by the associated attenuation map sample values of the decoded auxiliary picture before being displayed on screen. When this parameter is equal to 4, the attenuation map sample values of the decoded auxiliary picture should be used in the context of a contrast sensitivity function to determine the attenuation to be applied to one or more associated primary picture decoded sample(s) before being displayed on screen. When this parameter is equal to 5, the attenuation map sample values of the decoded auxiliary picture should be used according to a proprietary user-defined process to modify the one or more associated primary picture decoded sample(s) before being displayed on screen.
Table 4
The ami_attenuation_comp_idc[i] parameter specifies on which color component(s) of the associated primary picture(s) decoded samples the decoded auxiliary picture of index i should be applied, using the process defined by ami_attenuation_use_idc[ i ]. It also specifies how many components the decoded auxiliary picture of index i should contain.
When equal to 0, the decoded auxiliary picture of index i contains only one component and this component should be applied to the luma component of the associated primary picture(s) decoded samples.
When equal to 1, the decoded auxiliary picture of index i contains only one component and this component should be applied to the luma component and the chroma components of the associated primary picture(s) decoded samples.
When equal to 2, the decoded auxiliary picture of index i contains only one component and this component should be applied to the RGB components (after YUV to RGB conversion) of the associated primary picture(s) decoded samples.
When equal to 3, the decoded auxiliary picture of index i contains two components; the first component should be applied to the luma component of the associated primary picture(s) decoded samples and the second component should be applied to both chroma components of the associated primary picture(s) decoded samples.
When equal to 4, the decoded auxiliary picture of index i contains three components and these components should be applied respectively to the luma and chroma components of the associated primary picture(s) decoded samples.
When equal to 5, the decoded auxiliary picture of index i contains three components and these components should be applied respectively to the RGB components (after YUV to RGB conversion) of the associated primary picture(s) decoded samples.
When equal to 6, the mapping between the components of the decoded auxiliary picture of index i and the components to which the decoded auxiliary picture of index i should be applied corresponds to a proprietary user-defined process. This is summarized in Table 5 below.
Table 5
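The component mapping can be illustrated by the following sketch, which covers a few values of ami_attenuation_comp_idc for a multiplicative map. It assumes 4:4:4 planes (all planes at the map resolution) stored in a dict; the names and the dict layout are illustrative.

```python
import numpy as np

def apply_to_components(yuv, attn_components, comp_idc):
    """Apply multiplicative attenuation components to picture planes
    following ami_attenuation_comp_idc (partial sketch of Table 5;
    assumes 4:4:4 sampling so every plane matches the map size)."""
    out = {k: v.astype(np.float64) for k, v in yuv.items()}
    if comp_idc == 0:    # one component, applied to luma only
        out["Y"] *= attn_components[0]
    elif comp_idc == 1:  # one component, applied to luma and both chromas
        for k in ("Y", "U", "V"):
            out[k] *= attn_components[0]
    elif comp_idc == 3:  # two components: luma, then both chromas
        out["Y"] *= attn_components[0]
        out["U"] *= attn_components[1]
        out["V"] *= attn_components[1]
    elif comp_idc == 4:  # three components, one per Y/U/V plane
        for k, a in zip(("Y", "U", "V"), attn_components):
            out[k] *= a
    else:
        raise NotImplementedError(f"comp_idc {comp_idc} not sketched")
    return out
```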
The ami_preprocessing_flag[i] parameter, when true, specifies the use of pre-processing on the attenuation map sample values of the decoded auxiliary picture of index i. In that case, it is assumed that the pre-processing is an up-sampling operation and that the auxiliary coded picture(s) and the primary coded picture have different sizes.
The ami_preprocessing_type_idc[i] parameter indicates which type of pre-processing is to be applied to the attenuation map sample values of the decoded auxiliary picture of index i. When this parameter is equal to 0, the interpolation used to obtain, from the attenuation map sample values of the provided auxiliary picture of index i, the attenuation map sample values to apply to the sample values of the decoded picture is a bicubic interpolation that retrieves the same resolution as that of the associated decoded picture. When this parameter is equal to 1, the interpolation is a bilinear interpolation, and when equal to 2, the interpolation is of type Lanczos. When this parameter is equal to 3, a proprietary user-defined process should be used. This is summarized in Table 6 below.
Table 6
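A possible receiver-side up-sampling step is sketched below, using OpenCV's resize filters to stand in for the three signaled interpolation types; the mapping of ami_preprocessing_type_idc values to OpenCV constants is an implementation choice (OpenCV's Lanczos filter is the 4-lobe variant), not mandated by the syntax.

```python
import cv2  # OpenCV, used here only for its interpolation filters

INTERP = {
    0: cv2.INTER_CUBIC,     # ami_preprocessing_type_idc == 0: bicubic
    1: cv2.INTER_LINEAR,    # 1: bilinear
    2: cv2.INTER_LANCZOS4,  # 2: Lanczos
}

def upsample_attenuation_map(attn_map, target_w, target_h, type_idc):
    """Bring a sub-sampled attenuation map back to the resolution of the
    associated decoded picture (value 3, the proprietary process, is
    deliberately not handled in this sketch)."""
    if type_idc not in INTERP:
        raise ValueError("proprietary or unknown pre-processing type")
    return cv2.resize(attn_map, (target_w, target_h),
                      interpolation=INTERP[type_idc])
```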
The ami_preprocessing_scale_idc[i] parameter specifies the scaling that is to be applied to the attenuation map samples of index i to get the attenuation map sample values (float) before applying them to the sample values of the decoded picture. When this parameter is equal to 0, a scaling of — should be applied. When this parameter is equal to 1, a proprietary user-defined scaling should be used. This is summarized in Table 7 below.
Table 7
In the case where the periodicity of the GMDA information corresponds to parts of the pictures, additional metadata are added that define to which region of the decoded picture the attenuation map sample values of the auxiliary picture should be applied. In this case, the syntax is then modified as illustrated in Table 8 below.
Table 8
The new metadata ami_box_xstart, ami_box_ystart, ami_box_width and ami_box_height define the position and the size of the bounding box delimiting the region of the decoded picture to which the attenuation map is applied: respectively the x coordinate and the y coordinate of its top left corner (relative to the video top left corner, for example), and the width and the height of the bounding box. The attenuation map is not applied outside of this bounding box.
An alternative to this embodiment for region-based attenuation maps is to set the sample values of an attenuation map to 0 (if these samples are added or subtracted) or to 1 (if these samples are multiplied) outside of the region on which the attenuation map should be applied. In this case, these additional metadata are not needed.
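A sketch of the bounding-box variant for a multiplicative map follows; it assumes the attenuation map has the full picture resolution and that the box coordinates have already been decoded from the ami_box_* metadata (names are illustrative).

```python
import numpy as np

def apply_in_region(image, attn_map, box):
    """Apply a multiplicative attenuation map only inside the bounding
    box (x, y, w, h) signaled by ami_box_xstart/ystart/width/height.
    Pixels outside the box are left untouched, which is equivalent to
    padding the map with the neutral value 1."""
    x, y, w, h = box
    out = image.astype(np.float64)
    out[y:y + h, x:x + w] *= attn_map[y:y + h, x:x + w]
    return np.clip(out, 0, 255).astype(image.dtype)
```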
In at least one embodiment, an additional flag is used to encapsulate all other metadata related to the use of the Attenuation Map in a global structure. The syntax is modified as shown in Table 9. In this table, when ami_structure_flag is equal to 1, Attenuation Map Information parameters follow, and when ami_structure_flag is equal to 0, no Attenuation Map Information parameters follow.
In this embodiment, the use of the backlight display adaptation can be disabled by the transmitter by setting the parameters num_constant_backlight_voltage_time_intervals, num_max_variations and num_quality_levels to zero. In this case, no backlight operation will be performed and the Attenuation Map Information parameters which follow are used instead.
Table 9
In at least one embodiment, an additional flag is used to encapsulate all other metadata related to the use of backlight display adaptation in a global structure. The syntax is modified as shown in Table 10. In this table, when backlight_structure_flag is equal to 1, parameters related to backlight adaptation follow, and when backlight_structure_flag is equal to 0, no parameters related to backlight adaptation follow. Similarly, when ami_structure_flag is equal to 1, Attenuation Map Information parameters follow, and when ami_structure_flag is equal to 0, no Attenuation Map Information parameters follow. When using those two flags, the encoding is optimal since only the parameters to be used by the decoder are inserted in the bitstream.
Table 10
In at least one embodiment, a single bit field named ami_flags is used to convey several information flags. The bit 0 indicates whether all the following information data in the message have to be redefined for each decoded auxiliary picture of type AUX_ALPHA or not. It corresponds to the previously defined ami_global_flag.
The bit 1 indicates whether the following Attenuation Maps in the message can be used for approximating other Attenuation Maps for other reduction rates. It corresponds to a new ami_approximation_flag.
The bit 2 indicates whether preprocessing is required to use the following Attenuation Maps in the message. It corresponds to a new ami_preprocessing_global_flag.
The bit 3 indicates that the Attenuation Maps in the message shall be applied to a region of the primary video. This region is defined respectively by the x, y coordinates of the top left corner and the width and height of the region bounding box. It corresponds to a new ami_box_flag. The bits 4-7 are reserved for future use.
Table 11
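Assuming bit 0 is the least significant bit of the field, ami_flags can be unpacked as sketched below; the dictionary keys simply reuse the flag names defined above, and the bit ordering is an assumption.

```python
def parse_ami_flags(ami_flags):
    """Decode the ami_flags bit field into named booleans
    (bits 4-7 are reserved and ignored here)."""
    return {
        "ami_global_flag":               bool(ami_flags & 0x01),  # bit 0
        "ami_approximation_flag":        bool(ami_flags & 0x02),  # bit 1
        "ami_preprocessing_global_flag": bool(ami_flags & 0x04),  # bit 2
        "ami_box_flag":                  bool(ami_flags & 0x08),  # bit 3
    }

print(parse_ami_flags(0b0101))  # global and preprocessing flags set
```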
The ami_flags bit field makes it possible to reduce the size of the SEI message by not sending the metadata information that is not needed to use and apply the Attenuation Maps. In this case, the syntax in Table 10 is modified as illustrated in Table 12 below.
Table 12
In at least one embodiment, a single bit field ami_flags is used to convey several information flags. The bit 0 indicates that the SEI message contains further information related to the use of Attenuation Maps. It corresponds to the previously defined ami_structure_flag.
The bit 1 indicates whether all the following information data in the message have to be redefined for each decoded auxiliary picture of type AUX_ALPHA or not. It corresponds to the previously defined ami_global_flag.
The bit 2 indicates whether the following Attenuation Maps in the message can be used for approximating other Attenuation Maps for other reduction rates. It corresponds to a new ami_approximation_flag.
The bit 3 indicates whether preprocessing is required to use the following Attenuation Maps in the message. It corresponds to a new ami_preprocessing_global_flag.
The bit 4 indicates that the Attenuation Maps in the message shall be applied to a region of the primary video. This region is defined respectively by the x, y coordinates of the top left corner and the width and height of the region bounding box. It corresponds to a new ami_box_flag.
The bits 5-7 are reserved for future use.
Table 13
Table 14
In at least one embodiment, a single bit field display_flags is used to convey several information flags.
The bit 0 indicates that the SEI message contains further information related to the use of the backlight to reduce the energy consumption. It corresponds to the previously defined backlight_structure_flag.
The bit 1 indicates that the SEI message contains further information related to the use of Attenuation Maps. It corresponds to the previously defined ami_structure_flag.
The bit 2 indicates whether all the following information data in the message have to be redefined for each decoded auxiliary picture of type AUX_ALPHA or not. It corresponds to the previously defined ami_global_flag.
The bit 3 indicates whether the following Attenuation Maps in the message can be used for approximating other Attenuation Maps for other reduction rates. It corresponds to a new ami_approximation_flag.
The bit 4 indicates whether preprocessing is required to use the following Attenuation Maps in the message. It corresponds to a new ami_preprocessing_global_flag.
The bit 5 indicates that the Attenuation Maps in the message shall be applied to a region of the primary video. This region is defined respectively by the x, y coordinates of the top left corner and the width and height of the region bounding box. It corresponds to a new ami_box_flag.
The bits 6-7 are reserved for future use.
Table 16
In at least one embodiment, the attenuation map is carried by an auxiliary picture of a specific type “AUX_ALPHA” conventionally used for alpha blending operations. This is signaled by using an sdi_aux_id[ i ] equal to 1 (see Table 18 below). The auxiliary picture of type AUX_ALPHA is accompanied by a standardized Alpha Channel Information SEI message (ACI-SEI) which contains metadata (as defined in documents ISO/IEC 23002-3 or ITU-T H.265, ITU-T H.274) that determine how to use this auxiliary picture. Such an ACI-SEI defines the following parameters.
The alpha_channel_cancel_flag parameter, when equal to 1, indicates that the SEI message cancels the persistence of any previous ACI-SEI message in output order that applies to the current layer. When equal to 0, it indicates that alpha channel information follows.
The alpha_channel_use_idc parameter, when equal to 0, indicates that for alpha blending purposes the decoded samples of the associated primary picture should be multiplied by the interpretation sample values of the decoded auxiliary picture in the display process after output from the decoding process. When equal to 1, it indicates that for alpha blending purposes the decoded samples of the associated primary picture should not be multiplied by the interpretation sample values of the decoded auxiliary picture in the display process after output from the decoding process. When equal to 2, it indicates that the usage of the auxiliary picture is unspecified. Values greater than 2 for alpha_channel_use_idc are reserved for future use by ITU-T | ISO/IEC. When not present, the value of alpha_channel_use_idc is inferred to be equal to 2. Decoders shall ignore alpha channel information SEI messages in which alpha_channel_use_idc is greater than 2.
The alpha_channel_bit_depth_minus8 parameter, plus 8, specifies the bit depth of the samples of the luma sample array of the auxiliary picture. This bit depth shall be equal to the bit depth of the associated primary picture.
The alpha_transparent_value parameter specifies the interpretation sample value of a decoded auxiliary picture luma sample for which the associated luma and chroma samples of the primary coded picture are considered transparent for purposes of alpha blending. The number of bits used for the representation of the alpha_transparent_value syntax element is alpha_channel_bit_depth_minus8 + 9.
The alpha_opaque_value parameter specifies the interpretation sample value of a decoded auxiliary picture luma sample for which the associated luma and chroma samples of the primary coded picture are considered opaque for purposes of alpha blending. The number of bits used for the representation of the alpha_opaque_value syntax element is alpha_channel_bit_depth_minus8 + 9. A value of alpha_opaque_value that is equal to alpha_transparent_value indicates that the auxiliary coded picture is not intended for alpha blending purposes. For alpha blending purposes, alpha_opaque_value can be greater than alpha_transparent_value or it can be less than or equal to alpha_transparent_value.
The alpha_channel_incr_flag parameter, when equal to 0, indicates that the interpretation sample value for each decoded auxiliary picture luma sample value is equal to the decoded auxiliary picture sample value for purposes of alpha blending. When equal to 1, it indicates that, for purposes of alpha blending, after decoding the auxiliary picture samples, any auxiliary picture luma sample value that is greater than Min( alpha_opaque_value, alpha_transparent_value ) should be increased by one to obtain the interpretation sample value for the auxiliary picture sample, and any auxiliary picture luma sample value that is less than or equal to Min( alpha_opaque_value, alpha_transparent_value ) should be used, without alteration, as the interpretation sample value for the decoded auxiliary picture sample value. When alpha_transparent_value is equal to alpha_opaque_value or Log2( Abs( alpha_opaque_value − alpha_transparent_value ) ) does not have an integer value, alpha_channel_incr_flag shall be equal to 0.
The alpha_channel_clip_flag parameter, when equal to 0, indicates that no clipping operation is applied to obtain the interpretation sample values of the decoded auxiliary picture. When equal to 1, it indicates that the interpretation sample values of the decoded auxiliary picture are altered according to the clipping process described by the alpha_channel_clip_type_flag syntax element.
The alpha_channel_clip_type_flag parameter, when equal to 0, indicates that, for purposes of alpha blending, after decoding the auxiliary picture samples, any auxiliary picture luma sample that is greater than ( alpha_opaque_value + alpha_transparent_value ) / 2 is set equal to Max( alpha_transparent_value, alpha_opaque_value ) to obtain the interpretation sample value for the auxiliary picture luma sample, and any auxiliary picture luma sample that is less than or equal to ( alpha_opaque_value + alpha_transparent_value ) / 2 is set equal to Min( alpha_transparent_value, alpha_opaque_value ) to obtain the interpretation sample value for the auxiliary picture luma sample. When equal to 1, it indicates that, for purposes of alpha blending, after decoding the auxiliary picture samples, any auxiliary picture luma sample that is greater than Max( alpha_transparent_value, alpha_opaque_value ) is set equal to Max( alpha_transparent_value, alpha_opaque_value ) to obtain the interpretation sample value for the auxiliary picture luma sample, and any auxiliary picture luma sample that is less than or equal to Min( alpha_transparent_value, alpha_opaque_value ) is set equal to Min( alpha_transparent_value, alpha_opaque_value ) to obtain the interpretation sample value for the auxiliary picture luma sample.
In embodiments, the alpha_channel_use_idc parameter of the ACI-SEI is set to 3. This indicates that the usage of the auxiliary picture is unspecified and that the decoder should ignore all subsequent information of the SEI. Note that when alpha_channel_use_idc equals 2, this also indicates that the usage of the auxiliary picture is unspecified, but the decoder does not ignore all subsequent information of the SEI. Values greater than 2 for alpha_channel_use_idc are reserved for future use by ITU-T | ISO/IEC. When not present, the value of alpha_channel_use_idc is inferred to be equal to 2. Therefore, to prevent the decoder from using the alpha channel conventionally, alpha_channel_use_idc is set equal to 3 and the other parameters of the ACI-SEI message can be set, for example, to the values of Table 17.
A compliant receiver receiving such an ACI-SEI message will deduce that the alpha map should not be used conventionally for alpha blending purposes. Since the receiver will also receive GMDA information, it will determine that the auxiliary picture of type AUX_ALPHA should be used as an attenuation map for energy reduction purposes and will apply the map according to the GMDA information. In case no GMDA information is received, or the GMDA information does not contain any Attenuation Map related metadata, the decoder does not consider applying any attenuation to the decoded picture.
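This receiver-side decision can be sketched as follows; gmda_info is a hypothetical parsed representation of the GMDA information, and the function name is illustrative.

```python
def aux_alpha_is_attenuation_map(alpha_channel_use_idc, gmda_info):
    """Decide whether an AUX_ALPHA auxiliary picture should be treated
    as an attenuation map: the ACI-SEI must disable the conventional
    alpha use (use_idc == 3) and GMDA attenuation-map metadata must
    be present."""
    if alpha_channel_use_idc != 3:
        return False  # conventional alpha blending path
    if gmda_info is None or not gmda_info.get("attenuation_maps"):
        return False  # no guidance: apply no attenuation
    return True
```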
Instead of using an auxiliary picture of type AUX_ALPHA, another type of auxiliary data may be used, as long as the type fulfills the requirements to transmit the attenuation map, or a newly defined type may be used. In at least one embodiment, the attenuation map is carried by an auxiliary picture of a new specific type “AUX_ATTENUATION” dedicated to energy reduction of pictures. Auxiliary pictures are defined in ISO/IEC 23002-3 Auxiliary Video Data Representations. Table 18 illustrates the addition of a new specific type of auxiliary picture named “AUX_ATTENUATION” that will be used in relation with the new metadata defined above. This new type comes in addition to the conventional types of auxiliary pictures related to alpha plane and picture depth information. In such an embodiment, the parameters for applying the attenuation are still carried by GMDA information as described in Table 1.
Table 18
In at least one embodiment, several auxiliary pictures (of type AUX_ALPHA or AUX_ATTENUATION) are used to transmit the Attenuation Map. For example, three auxiliary pictures of type AUX_ATTENUATION are used, one per component Y, U, V of the Attenuation Map. In a variant, only two auxiliary pictures may be used, one for Y and another one for U and V. Any other combination can be envisioned. In the GMDA information, several sets of metadata are sent, each corresponding to one Attenuation Map. A total of ami_map_number sets is considered and each set corresponds to an index i in the metadata. In such an embodiment, additional information is needed to link the auxiliary pictures that represent components of a full Attenuation Map, so that the full Attenuation Map can be reconstructed by recombining the components (as described in figure 8) before being applied to the primary image. Therefore, each component of the full Attenuation Map will also correspond to an index within the ami_map_number metadata. Furthermore, additional metadata are needed that give the information of which auxiliary pictures should be used in combination, as individual components of the Attenuation Map. The ami_comp_nb[i] parameter indicates the number of additional auxiliary pictures needed to reconstruct the full Attenuation Map from the Attenuation Map of index i, corresponding to one component of the full Attenuation Map.
The ami_comp_idc[i][j] parameter indicates the index of one other set of metadata (i.e., related to the Attenuation Map of index j), corresponding to another component of the full Attenuation Map.
The list of ami_comp_idc[i][j] corresponds to the other indices of Attenuation Maps that, while linked with the Attenuation Map of index i, will enable the reconstruction of the full Attenuation Map. Additional values are also proposed for the metadata ami_attenuation_comp_idc[i] and Table 5 is modified as shown in Table 19.
Table 19
The new definitions in Table 19 with regard to Table 5 are the values above 5. When ami_attenuation_comp_idc[ i ] is equal to 6, the decoded auxiliary picture of index i contains one component and this component should be applied to the first component of the associated primary picture(s) decoded samples. When ami_attenuation_comp_idc[ i ] is equal to 7, the decoded auxiliary picture of index i contains one component and this component should be applied to the second component of the associated primary picture(s) decoded samples. When ami_attenuation_comp_idc[ i ] is equal to 8, the decoded auxiliary picture of index i contains one component and this component should be applied to the third component of the associated primary picture(s) decoded samples. When ami_attenuation_comp_idc[ i ] is equal to 9, the mapping between the components of the decoded auxiliary picture of index i and the components to which the decoded auxiliary picture of index i should be applied corresponds to a proprietary user-defined process.
In such an embodiment, the syntax of Table 1 is modified as shown in Table 20, in order to insert the parameters supporting this embodiment using multiple auxiliary pictures.
Table 20
Figure 4 illustrates flowcharts of two examples of video encoding process using attenuation map information carried by an auxiliary picture of type AUX_ALPHA according to at least one embodiment. The encoding processes 400 or 401 are implemented for example by an encoder 100 of figure 1 or a processor 1010 in a device 1000 of figure 3. The figure describes the metadata generation and the bitstream encapsulation process according to an embodiment, performed for example during the encoding, in compliance with the syntax introduced above. In this embodiment, the metadata is inserted for a given picture, i.e., the considered period corresponds to one picture. For the encoding process 400, in step 410, the device encodes the picture conventionally, resulting in a partial bitstream. In step 420, the device performs conventional partial bitstream decoding.
In step 430, an attenuation map corresponding to the decoded picture is computed for a selected energy reduction rate. In step 440, data corresponding to the parameters described in table 1 are collected, for example about the use of the attenuation map, the expected energy reduction, the corresponding expected quality, the pre-processing operation, etc. In step 450,
the auxiliary picture corresponding to the attenuation map is generated before being encoded and inserted in the partial bitstream in step 460.
In step 470, the ACI-SEI message is generated in a conventional manner, with, for example, values set according to Table 17. The ACI-SEI message is then encoded and inserted in the bitstream in step 471.
In step 475, the GMDA information is generated based on the collected information, for example according to the syntax defined in Table 1. In step 480, the GMDA information is then inserted in an elementary bitstream of the program bitstream and carried using the file format specified in ISO/IEC 23001-10 or by using MPEG-2 systems as specified in ISO/IEC 13818-1.
In variant embodiments, the ordering of some of the steps may be altered, still relying on the same principles. For example, all the encoding steps may be performed in the same step.
As introduced earlier, this bitstream comprises at least an image of the video and in relation with this image, an attenuation map that corresponds to a selected energy reduction rate as well as metadata describing how to use the attenuation map. This allows a receiver, such as a display device for example, to determine from this bitstream an image of the video and to apply an attenuation map on this image, that will allow a reduction of the energy consumption when using (e.g., displaying) the video.
In at least one embodiment, several attenuation maps are computed for different energy reduction rates. For example, two attenuation maps with energy reduction rates R1 and R2 may be computed. This allows interpolating a corresponding attenuation map at the decoder side for any other reduction rate R such that R1 < R < R2. The process 401 of figure 4 illustrates a flowchart of such an embodiment. Most of the steps are identical to the steps of the process 400. The difference is related to the iteration 425 that is done for a selected set of energy reduction rates. Thus, an attenuation map is generated for each of the energy reduction rates, the set of corresponding auxiliary pictures is inserted into the bitstream, and the corresponding ACI-SEI messages are generated.
As a result of this embodiment, the bitstream comprises at least an image of the video and, in relation with this image, a set of different attenuation maps that correspond to a set of selected energy reduction rates. This allows a receiver, such as a display device for example, to determine from this bitstream an image of the video that will allow a reduction of the energy consumption when displaying the video according to an energy reduction rate not in the list of reduction rates, thanks to the plurality of attenuation maps and parameters carried by the bitstream.
Again, the ordering of some of the steps can be changed while still being based on the same principles. For example, all auxiliary pictures may be inserted at once, outside of the loop on energy reduction rates. Another example comprises generating and inserting the GMDA information for each auxiliary picture, i.e., inside the loop 425 on energy reduction rates. Another example comprises generating and inserting the ACI-SEI messages outside the loop 425, for all auxiliary pictures at once.
In another embodiment, the computation of the attenuation map and the collection of the associated metadata are realized outside of the encoder, for example in a dedicated device, and these data are for example stored in a database accessible by the encoder. These additional data are then provided to the encoder together with the input video in a process similar to the processes 400 and 401.
In another embodiment, the auxiliary data is sent without any accompanying metadata in the GMDA related to the use of the Attenuation Maps, and thus without having in the bitstream the parameters to benefit from the attenuation map. This use case targets specific decoding devices that have a predetermined behavior. An example of such a device is an advertisement display panel. In this case, a default mode is defined with default values for these parameters. These values persist for the whole bitstream. The default values are as shown in Table 21. The ami_video_id is not present in the table since it is dynamic. By default, the attenuation map should apply to all the video elementary streams of the program.
Table 21
Figure 5 illustrates a flowchart of an example of video decoding process using attenuation map information carried by an auxiliary picture of type AUX_ALPHA according to at least one embodiment.
As introduced above, the usage of the attenuation map is guided by the accompanying metadata carried by GMDA information together with an ACI-SEI message. The process 500 can be applied to a picture or to a group of pictures or to a part of a picture (slice, tile), according to the level of signaling of the metadata. The following description illustrates the case where the process is applied to a single picture and there is only one attenuation map, but the other cases are similar and based on the same steps.
In step 510, the picture is decoded from the bitstream, producing a decoded picture 511. In step 520, the ACI-SEI message data are retrieved from the bitstream for a picture and decoded to provide the different parameter values, as described in Table 17. In step 523, the process verifies that the alpha_channel_use_idc parameter of the ACI-SEI message is set to 3. Indeed, this indicates that the AUX_ALPHA picture should not be used for conventional alpha blending purposes and that the decoder should ignore the rest of the ACI-SEI message. If this is not the case (branch “No”, parameter is not set to 3), then the processor jumps to step 530, and the decoding may be done conventionally without using an attenuation map. Else, in step 524, the processor obtains the GMDA information. In step 525, the display model 521 of the end device is checked against the decoded ami_display_model parameter of the GMDA information. If the display model 521 of the end device is not compatible with the ami_display_model parameter (branch “No”), then the processor jumps to step 530, and the decoding may be done conventionally without using an attenuation map. Otherwise, in step 535, a mapping between the attenuation map carried by the auxiliary data of type AUX_ALPHA and its corresponding decoded picture is done by the use of ami_video_id, which gives the identifier of the program elementary stream that contains the video to which the attenuation map should be applied, and ami_map_id[i], which gives the identifier of the program elementary stream that contains the auxiliary picture corresponding to the Attenuation Map of index i. In step 540, the auxiliary data corresponding to the picture is decoded and produces a decoded attenuation map 541. In step 550, the ami_preprocessing_flag parameter is checked
to determine whether an up-sampling process should be applied to the decoded attenuation map 541. In step 560, in case an up-sampling process is to be applied (branch “Yes” of step 550), the decoded attenuation map 541 is upsampled according to the process given by the ami_preprocessing_type_idc parameter. In step 565, it is further rescaled according to a scaling factor described by the ami_preprocessing_scale_idc and ami_max_value parameters. Then, in step 580, the upsampled and rescaled attenuation map is applied to the decoded picture 511 according to the process described by ami_attenuation_use_idc and ami_attenuation_comp_idc to produce an energy-reduced image 581 that is further sent to the display in step 590. In case no upsampling of the attenuation map 541 is required (branch “No” of step 550), the attenuation map is first rescaled according to a scaling factor described by ami_preprocessing_scale_idc and ami_max_value in step 555. Then, the rescaled attenuation map is applied to the decoded picture in step 570 according to the process described by ami_attenuation_use_idc and ami_attenuation_comp_idc to produce an energy-reduced image 581 that is further sent to the display in step 590.
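Putting the earlier sketches together, a simplified multiplicative-only version of this decoding path could look as follows. It reuses the hypothetical apply_attenuation and upsample_attenuation_map helpers sketched above; meta is an assumed dict of decoded GMDA fields, and the float rescaling rule (division by ami_max_value) is an assumption, since the normative scaling is whatever ami_preprocessing_scale_idc designates.

```python
def energy_reduce_decoded_picture(decoded, aux, meta, display_model_bit):
    """Sketch of process 500 for one picture, multiplicative case only;
    `decoded` and `aux` are numpy arrays, `meta` a dict of GMDA fields."""
    # Display compatibility check against the ami_display_model mask.
    if not (meta["ami_display_model"] & display_model_bit):
        return decoded  # incompatible display: no attenuation applied
    attn = aux.astype("float64")
    # Optional up-sampling back to the primary picture resolution.
    if meta["ami_preprocessing_flag"]:
        h, w = decoded.shape[:2]
        attn = upsample_attenuation_map(attn, w, h,
                                        meta["ami_preprocessing_type_idc"])
    # Rescale coded integer samples to floats in [0, 1] (assumed rule).
    attn = attn / meta["ami_max_value"]
    return apply_attenuation(decoded, attn, "multiply")
```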
Figure 6 illustrates flowcharts of two examples of video encoding process using attenuation map information carried by an auxiliary picture of type AUX_ATTENUATION according to at least one embodiment. Such encoding processes 600 and 601 are very similar to the processes 400 and 401 of figure 4. The difference lies mainly in the use of an auxiliary picture of type AUX_ATTENUATION generated in step 650 (instead of an auxiliary picture of type AUX_ALPHA), as well as the encoding and insertion of this auxiliary picture in the bitstream in step 660. In such an embodiment, no ACI-SEI message is needed, so that steps 470 and 471 of figure 4 are no longer needed. The other steps are identical to the corresponding steps of figure 4.
Figure 7 illustrates a flowchart of an example of video decoding process using attenuation map information carried by an auxiliary picture of type AUX_ATTENUATION according to at least one embodiment. This decoding process 700 is very similar to the process 500 of figure 5. The difference lies mainly in the use of an auxiliary picture of type AUX_ATTENUATION decoded in step 740 (instead of an auxiliary picture of type AUX_ALPHA). No ACI-SEI message is needed, so that steps 520 and 523 of figure 5 are no longer needed. The other steps are identical to the corresponding steps of figure 5.
Figure 8 illustrates a flowchart of an example of video decoding process using attenuation map information based on multiple components carried by multiple separate auxiliary pictures of type AUX_ALPHA according to at least one embodiment. This decoding process 800 is very similar to the decoding process 500 of figure 5. In this embodiment, the differences lie mainly in an additional iteration to retrieve the separate attenuation map components according to ami_comp_nb, the application of the separate attenuation map components in steps 870 and 880 to the corresponding components of the decoded picture, and the combination of the resulting components in step 885.
This principle of using multiple components carried by separate auxiliary pictures can be adapted to use multiple separate auxiliary pictures of type AUX_ATTENUATION. In such an embodiment, steps 820 and 823 are no longer present.
In at least one embodiment, the use of the attenuation map is disabled if the expected quality for the end device is higher than the quality given by ami_video_quality.
In at least one embodiment, in case a single attenuation map is provided, a given energy reduction rate R (i.e., corresponding to an expected energy reduction rate) is checked against the ami_energy_reduction_rate provided in the SEI message. If this rate corresponds to the ami_energy_reduction_rate of the transmitted attenuation map, the corresponding attenuation map is applied to the decoded picture. If the energy reduction rate R is lower than the transmitted ami_energy_reduction_rate, a new attenuation map corresponding to R is inferred by extrapolating it from the transmitted attenuation map according to the process described by ami_map_approximation_model. An example of such a process is a simple linear scaling of the attenuation map.
In at least one embodiment, if two attenuation maps are provided for two different ami_energy_reduction_rates R1 and R2, a given energy reduction rate R is checked against rates R1 and R2. If this rate corresponds to one of rates R1 and R2, the corresponding processed attenuation map (as described in the previous embodiment) is applied to the decoded picture. If the energy reduction rate R is such that R1 < R < R2, a new attenuation map corresponding to R is inferred by interpolating between the decoded attenuation maps corresponding to R1 and R2 according to the process described by ami_map_approximation_model. An example of such a process can be a pixel-wise linear or bicubic interpolation between the two attenuation maps corresponding to R1 and R2. If R is larger than both R1 and R2, an extrapolation of the attenuation map by linear scaling from the largest energy reduction rate can also be envisioned, but with no guarantee on the resulting quality of the reduced picture. This embodiment is easily extended to more than two transmitted attenuation maps.
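A minimal sketch of the pixel-wise linear interpolation case, assuming both decoded maps have the same resolution; names are hypothetical.

```python
import numpy as np

def interpolate_attenuation_maps(map1: np.ndarray, r1: float,
                                 map2: np.ndarray, r2: float,
                                 r: float) -> np.ndarray:
    """Pixel-wise linear interpolation between the maps decoded for
    rates r1 and r2, for a requested rate r with r1 < r < r2."""
    if not r1 < r < r2:
        raise ValueError("r must lie strictly between r1 and r2")
    w = (r - r1) / (r2 - r1)   # weight of the higher-rate map
    return (1.0 - w) * map1 + w * map2
```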
In at least one embodiment, the use of the attenuation map is disabled for some content pictures depending on, for example, the image category (e.g., sports images, gaming images) or the display settings (e.g., cinema mode).
In at least one embodiment, the method is disabled for specific content for which the energy reduction will not be significant. For example, dark content would lead to very low energy reduction regardless of the technique used. In this embodiment, the total amount of luminance per picture is computed and, when it is lower than a given threshold, the energy reduction method is disabled. Alternatively, the method might be disabled per GOP, per shot, or per movie.
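A minimal sketch of this gating test, assuming the "total amount of luminance" is the sum of the luma samples of the picture; the threshold value is application-dependent.

```python
import numpy as np

def attenuation_enabled(luma: np.ndarray, threshold: float) -> bool:
    """Disable the energy reduction method for dark content: sum the
    luma samples of the picture and compare against a threshold.

    The same test can be pooled per GOP, per shot, or per movie by
    summing over the corresponding set of pictures.
    """
    return float(luma.sum()) >= threshold
```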
In at least one embodiment, an additional check is added to verify that a pixel spatially belongs to a subset of the image to be processed, for example excluding a region where the energy reduction is not desired. Such a region or mask can be based on, for example, a spatiotemporal just noticeable difference (JND) map, a motion field, a saliency map, gaze tracking information, or other pixel-wise information. When a pixel does not belong to this region or mask, the attenuation map is not applied to this pixel. This check can be done before or after the upsampling of the attenuation map, if any.
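A minimal sketch of the mask-gated application, assuming a boolean mask at the resolution of the picture and a multiplicative operator; how the mask is derived from the JND, saliency, motion, or gaze information is left to the application.

```python
import numpy as np

def apply_attenuation_with_mask(picture: np.ndarray,
                                att_map: np.ndarray,
                                process_mask: np.ndarray) -> np.ndarray:
    """Apply the attenuation map only to pixels whose process_mask
    entry is True; pixels outside the mask (e.g., a protected salient
    region) are left untouched.

    process_mask is a boolean array with the same shape as picture,
    e.g. obtained by thresholding a JND or saliency map.
    """
    out = picture.astype(np.float32).copy()
    out[process_mask] *= att_map[process_mask]
    return out
```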
In the case where a post-processing operation is applied to the decoded picture, such post-processing should be taken into account before applying the attenuation map. Two embodiments are possible. In at least one embodiment, post-processing operations are taken into account while building the attenuation map. For example, this may be done in the process 400 by introducing an additional step between steps 420 and 430 to apply a post-processing operation to the decoded picture before building the attenuation map. In this case, at the receiver side, the attenuation map should be applied after all the post-processing operations. In at least one embodiment, these post-processing operations occur independently of the encoding-decoding process, and in this case the attenuation map should be adapted to take them into account before applying it to the decoded and post-processed picture.
In the case where the attenuation map was computed on an input image in a given color gamut, the decoded picture should be converted to this color gamut before applying the attenuation map. At least one embodiment further comprises checking that the color gamut of the decoded picture corresponds to the color gamut used while computing the attenuation map. This can be done by sending the color gamut used for the attenuation map's computation together with the metadata.
For transmissive pixel displays, such as backlit displays for example, the attenuation map cannot be applied directly. However, it is possible to use the attenuation map to provide guidance to a backlight scaling algorithm. Indeed, for such displays, a strong contributor to the energy consumption is the backlight.
In at least one embodiment, an attenuation map is applied on transmissive pixel displays by determining the minimal value, average value, or any other global value from the attenuation map and using this information to guide the backlight of the transmissive pixel display. In the particular case of local dimming displays, where the backlighting is split into different regions, the attenuation map can first be split into regions corresponding to the local dimming regions of the display before determining the minimal value, average value, or any other global value per region and using this information to guide the backlight.
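A minimal sketch of this backlight guidance, assuming a two-dimensional multiplicative attenuation map in [0, 1] and a rectangular grid of local dimming zones; the function name and the pooling choice are illustrative.

```python
import numpy as np

def backlight_guidance(att_map: np.ndarray,
                       regions_y: int, regions_x: int,
                       pool=np.min) -> np.ndarray:
    """Derive one backlight scaling value per local-dimming zone by
    splitting the attenuation map into a regions_y x regions_x grid
    and pooling each tile to a single value (min, mean, ...).

    For a globally backlit display, call with regions_y = regions_x = 1.
    A multiplicative map in [0, 1] is assumed, so the pooled value can
    directly scale the backlight level of the zone.
    """
    h, w = att_map.shape
    guide = np.empty((regions_y, regions_x), dtype=np.float32)
    for i in range(regions_y):
        for j in range(regions_x):
            tile = att_map[i * h // regions_y:(i + 1) * h // regions_y,
                           j * w // regions_x:(j + 1) * w // regions_x]
            guide[i, j] = pool(tile)
    return guide
```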
Figure 9 illustrates examples of sequence diagrams representing the information exchange between a transmitter and a receiver according to further embodiments. Indeed, the embodiments described above assume that no signaling mechanism exists from the receiver to the transmitter. In further embodiments, it is considered that a signaling mechanism exists from the receiver to the transmitter, in other words, a return channel is available.
In at least one embodiment illustrated by the exchange 900, information, such as the display type and/or a given energy reduction rate for example, can be transmitted back, in step 920, from the receiver to the transmitter. The transmitter takes such information into account to provide, in step 930, a visual content and corresponding metadata that better matches the information obtained from the receiver. For example, it may compute a specific attenuation map with an energy reduction rate provided by the receiver. It may adapt already precomputed attenuation maps to a specific energy reduction rate. It may compute a specific attenuation map corresponding to a video quality level provided by the receiver. It may compute a specific attenuation map corresponding to a display model provided by the receiver. In this case, the GMDA information may be adapted to take into account the specific information and constraints obtained from the receiver.
In such an embodiment, the receiver provides information related to the energy reduction application to the transmitter using the syntax described in Table 22. As stated above, such information is identified by the ‘ami’ prefix and added to the information related to display power reduction as defined in ISO/IEC 23001-11. In this table, only one of ami_energy_reduction_rate or ami_video_quality is required. The other is optional and may be omitted.
Table 22
In response, the transmitter does not need to send back the GMDA information as described in Table 1 but only a subset, as shown in Table 23.
Table 23
In at least one embodiment illustrated by the exchange 901, the receiver first obtains, in step 911, a list of ids corresponding to a set of attenuation maps (for example, ami_map_id=0 for an ami_reduction_rate = 10%, ami_map_id=1 for an ami_reduction_rate = 20%, ami_map_id=2 for an ami_reduction_rate = 40%). In this case, the receiver may select one of the map ids, in step 921. The syntax from the receiver to the transmitter is modified as shown in Table 24, with an ami_map_id corresponding for example to a selected reduction rate.
Table 24
In response, in step 931, the transmitter provides to the receiver the modified GMDA information as shown in Table 25. In this table, the information corresponds to the selected map id.
Table 25
In at least one embodiment, the receiver requests all information related to several ami_map_id values (for example, it needs three auxiliary pictures to reconstruct one full attenuation map). In this case the syntax from the receiver to the transmitter is shown in Table 26.
Table 26
In response, the transmitter provides to the receiver the modified GMDA information as shown in Table 27. In this table, the information corresponds to the attenuation maps that correspond to the list of ami_map_id values sent by the receiver.
Table 27
In at least one embodiment, the receiver display is of type OLED and therefore does not require any information related to backlight adaptation. In this case, the first part of Table 27 related to conventional display adaptation functions is discarded and the syntax is modified as shown in Table 28.
Table 28
In response, the transmitter provides to the receiver the modified GMDA information as shown in Table 29, which no longer comprises the information related to backlight adaptation.
Table 29
All previous embodiments applicable when a signaling mechanism exists from the receiver to the transmitter also apply to this embodiment, as described in Tables 28 and 29.
The decoding processes 500 of figure 5, 700 of figure 7 and 800 of figure 8 are implemented for example by a decoder 200 of figure 2, by a processor 1010 in a device 1000 of figure 3 or by various electronic devices such as smartphones, tablet computers, digital multimedia set top boxes, digital television receivers, personal video recording systems, connected home appliances, personal computers, laptop computers, and servers.
At least one example of an embodiment can involve a device including an apparatus as described herein and at least one of (i) an antenna configured to receive a signal, the signal including data representative of the image information, (ii) a band limiter configured to limit the received signal to a band of frequencies that includes the data representative of the image information, and (iii) a display configured to display an image from the image information.
At least one example of an embodiment can involve a device as described herein, wherein the device comprises one of a television, a television signal receiver, a set-top box, a gateway device, a mobile device, a cell phone, a tablet, a computer, a laptop, or other electronic device.
In general, another example of an embodiment can involve a bitstream or signal formatted to include syntax elements and picture information, wherein the syntax elements are produced, and the picture information is encoded by processing based on any one or more of the examples of embodiments of methods in accordance with the present disclosure.
In general, one or more other examples of embodiments can also provide a computer readable storage medium, e.g., a non-volatile computer readable storage medium, having stored thereon instructions for encoding or decoding picture information such as video data according to the methods or the apparatus described herein. One or more embodiments can also provide a computer readable storage medium having stored thereon a bitstream generated according to methods or apparatus described herein. One or more embodiments can also provide methods and apparatus for transmitting or receiving a bitstream or signal generated according to methods or apparatus described herein.
Many of the examples of embodiments described herein are described with specificity and, at least to show the individual characteristics, are often described in a manner that may sound limiting. However, this is for purposes of clarity in description, and does not limit the application or scope of those aspects. Indeed, all of the different aspects can be combined and interchanged to provide further aspects. Moreover, the embodiments, features, etc. can be combined and interchanged with others described in earlier filings as well.
Various implementations involve decoding. “Decoding”, as used in this application, can encompass all or part of the processes performed, for example, on a received encoded sequence in order to produce a final output suitable for display. In various embodiments, such processes include one or more of the processes typically performed by a decoder, for example, entropy decoding, inverse quantization, inverse transformation, and differential decoding. In various embodiments, such processes also, or alternatively, include processes performed by a decoder of various implementations described in this application.
As further examples, in one embodiment “decoding” refers only to entropy decoding, in another embodiment “decoding” refers only to differential decoding, and in another embodiment “decoding” refers to a combination of entropy decoding and differential decoding. Whether the phrase “decoding process” is intended to refer specifically to a subset of operations or generally to the broader decoding process will be clear based on the context of the specific descriptions and is believed to be well understood by those skilled in the art.
Various implementations involve encoding. In an analogous way to the above discussion about “decoding”, “encoding” as used in this application can encompass all or part of the processes performed, for example, on an input video sequence in order to produce an encoded bitstream. In various embodiments, such processes include one or more of the processes typically performed by an encoder, for example, partitioning, differential encoding, transformation, quantization, and entropy encoding.
As further examples, in one embodiment “encoding” refers only to entropy encoding, in another embodiment “encoding” refers only to differential encoding, and in another embodiment “encoding” refers to a combination of differential encoding and entropy encoding. Whether the phrase “encoding process” is intended to refer specifically to a subset of operations or generally to the broader encoding process will be clear based on the context of the specific descriptions and is believed to be well understood by those skilled in the art.
Note that the syntax elements as used herein are descriptive terms. As such, they do not preclude the use of other syntax element names.
When a figure is presented as a flow diagram, it should be understood that it also provides a block diagram of a corresponding apparatus. Similarly, when a figure is presented as a block diagram, it should be understood that it also provides a flow diagram of a corresponding method/process.
In general, the examples of embodiments, implementations, features, etc., described herein can be implemented in, for example, a method or a process, an apparatus, a software program, a data stream, or a signal. Even if only discussed in the context of a single form of implementation (for example, discussed only as a method), the implementation of features discussed can also be implemented in other forms (for example, an apparatus or program). An apparatus can be implemented in, for example, appropriate hardware, software, and firmware. One or more examples of methods can be implemented in, for example, a processor, which refers to processing devices in general, including, for example, a computer, a microprocessor, an integrated circuit, or a programmable logic device. Processors also include communication devices, such as, for example, computers, cell phones, portable/personal digital assistants ("PDAs"), and other devices that facilitate communication of information between end-users. Also, use of the term "processor" herein is intended to broadly encompass various configurations of one processor or more than one processor.
Reference to “one embodiment” or “an embodiment” or “one implementation” or “an implementation”, as well as other variations thereof, means that a particular feature, structure, characteristic, and so forth described in connection with the embodiment is included in at least one embodiment. Thus, the appearances of the phrase “in one embodiment” or “in an embodiment” or “in one implementation” or “in an implementation”, as well any other variations, appearing in various places throughout this application are not necessarily all referring to the same embodiment.
Additionally, this application may refer to “determining” various pieces of information. Determining the information can include one or more of, for example, estimating the information, calculating the information, predicting the information, or retrieving the information from memory.
Further, this application may refer to “accessing” various pieces of information. Accessing the information can include one or more of, for example, receiving the information, retrieving the information (for example, from memory), storing the information, moving the information, copying the information, calculating the information, determining the information, predicting the information, or estimating the information.
Additionally, this application may refer to “receiving” various pieces of information. Receiving is, as with “accessing”, intended to be a broad term. Receiving the information can include one or more of, for example, accessing the information, or retrieving the information (for example, from memory). Further, “receiving” is typically involved, in one way or another, during operations such as, for example, storing the information, processing the information, transmitting the information, moving the information, copying the information, erasing the information, calculating the information, determining the information, predicting the information, or estimating the information.
It is to be appreciated that the use of any of the following “/”, “and/or”, and “at least one of”, for example, in the cases of “A/B”, “A and/or B” and “at least one of A and B”, is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of both options (A and B). As a further example, in the cases of “A, B, and/or C” and “at least one of A, B, and C”, such phrasing is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of the third listed option (C) only, or the selection of the first and the second listed options (A and B) only, or the selection of the first and third listed options (A and C) only, or the selection of the second and third listed options (B and C) only, or the selection of all three options (A and B and C). This may be extended, as is clear to one of ordinary skill in this and related arts, for as many items as are listed.
As will be evident to one of ordinary skill in the art, implementations can produce a variety of signals formatted to carry information that can be, for example, stored or transmitted. The information can include, for example, instructions for performing a method, or data produced by one of the described implementations. For example, a signal can be formatted to carry the bitstream of a described embodiment. Such a signal can be formatted, for example, as an electromagnetic wave (for example, using a radio frequency portion of spectrum) or as a baseband signal. The formatting can include, for example, encoding a data stream and modulating a carrier with the encoded data stream. The information that the signal carries can be, for example, analog or digital information. The signal can be transmitted over a variety of different wired or wireless links, as is known. The signal can be stored on a processor-readable medium.
Various embodiments are described herein. Features of these embodiments can be provided alone or in any combination, across various claim categories and types.
Claims
1. A method comprising: obtaining encoded data comprising an image, an attenuation map and a set of parameters, wherein the set of parameters comprises at least a first parameter representative of an operation for applying the attenuation map to the image, and a second parameter representative of a mapping between components of the attenuation map and components of the image affected by the operation; obtaining an attenuated image by applying the attenuation map to the image to reduce values of components of the image by performing an operation based on the first parameter on components of the image selected based on the second parameter; and providing the attenuated image, wherein the parameters are carried through MPEG green metadata display adaptation syntax elements.
2. The method of claim 1 wherein the set of parameters comprises at least a third parameter representative of a selected energy reduction rate.
3. The method of claim 1 or 2 wherein the set of parameters comprises at least a fourth parameter representative of the attenuation map.
4. The method of any of claims 1 to 3 wherein the first parameter is selected in a set of operations comprising an addition, a subtraction, a multiplication, a contrast sensitivity function and a division.
5. The method of any of claims 1 to 4 wherein the second parameter is selected in a set comprising applying a component of the attenuation map to luma component, applying a component of the attenuation map to luma and both chroma components, applying a component of the attenuation map to three RGB components, applying a first component of the attenuation map to luma component and a second component of the attenuation map to both chroma components, applying three components of the attenuation map respectively to luma component and both chroma components, applying three components of the attenuation map respectively to three RGB components, applying one component of the attenuation map to the first component, applying one component of the attenuation map to the second component, and applying one component of the attenuation map to the third component.
6. The method of any of claims 1 to 5 wherein a plurality of attenuation maps is used, wherein each of the attenuation maps is associated with a respective energy reduction rate.
7. The method of claim 6 further comprising performing an interpolation between a first attenuation map associated with a first energy reduction rate and a second attenuation map associated with a second energy reduction rate to obtain an attenuation map corresponding to a third attenuation rate comprised in a range comprised between the first and the second energy reduction rates.
8. The method of any of claims 1 to 7 wherein the set of parameters further comprises a fifth parameter representative of a pre-processing operation and wherein the pre-processing operation based on the fifth parameter is performed on the attenuation map prior to its application to the image.
9. The method of claim 8 wherein the pre-processing operation is an up-sampling operation from a resolution of the attenuation map to a resolution of the image and the fifth parameter determines a type of up-sampling function selected among a set of up-sampling functions comprising at least linear scaling, bilinear interpolation, Lanczos, and bicubic.
10. The method of any of claim 1 to 9 further comprising using multiple attenuation maps for multiple components of the image and further comprising parameters representative of the mapping between the components of the multiple attenuation maps and the image components affected by the operation.
11. The method of any of claim 1 to 10, wherein applying the attenuation map to the image reduces the energy required to display the image.
12. The method of any of claims 1 to 11 wherein the auxiliary picture carrying the attenuation map is an auxiliary picture for alpha blending identified as AUX_ALPHA.
13. The method of any of claims 1 to 11 wherein the auxiliary picture carrying the attenuation map is an auxiliary picture of a specific type identified as AUX_ATTENUATION.
14. A method comprising: obtaining an input image of a video; determining an attenuation map based on the input image according to a selected energy reduction rate, wherein applying the attenuation map to the input image reduces values of components of the input image; and generating an encoded video comprising at least the input image, the attenuation map and a set of parameters, wherein the set of parameters comprises at least a first parameter representative of an operation for applying the attenuation map to an image, and a second parameter representative of a mapping between components of the attenuation map and components of the image affected by the operation, wherein the parameters are carried through MPEG green metadata display adaptation syntax elements.
15. The method of claim 14 wherein the set of parameters comprises at least a third parameter representative of a selected energy reduction rate.
16. The method of claim 14 or 15 wherein the set of parameters comprises at least a fourth parameter representative of the attenuation map.
17. The method of any of claims 14 to 16 wherein the first parameter is selected in a set of operations comprising an addition, a subtraction, a multiplication, a contrast sensitivity function and a division.
18. The method of any of claims 14 to 17 wherein the second parameter is selected in a set comprising applying a component of the attenuation map to luma component, applying a component of the attenuation map to luma and both chroma components, applying a component of the attenuation map to three RGB components, applying a first component of the attenuation map to luma component and a second component of the attenuation map to both chroma components, applying three components of the attenuation map respectively to luma component and both chroma components, applying three components of the attenuation map respectively to three RGB components, applying one component of the attenuation map to the first component, applying one component of the attenuation map to the second component, and applying one component of the attenuation map to the third component.
19. The method of any of claims 14 to 18 wherein a plurality of attenuation maps is used and each of the attenuation maps is associated with a respective energy reduction rate.
20. The method of claim 19 further comprising performing an interpolation between a first attenuation map associated with a first energy reduction rate and a second attenuation map associated with a second energy reduction rate to obtain an attenuation map corresponding to a third attenuation rate comprised in a range comprised between the first and the second energy reduction rates.
21. The method of any of claims 14 to 20 wherein the set of parameters further comprises a fifth parameter representative of a pre-processing operation and wherein the pre-processing operation based on the fifth parameter is performed on the attenuation map prior to its application to the image.
22. The method of claim 21 wherein the pre-processing operation is an up-sampling operation from a resolution of the attenuation map to a resolution of the image and the fifth parameter determines a type of up-sampling function selected among a set of up-sampling functions comprising at least linear scaling, bilinear interpolation, Lanczos, and bicubic.
23. The method of any of claim 14 to 22 further comprising using multiple attenuation maps for multiple components of the image and further comprising parameters representative of the mapping between the components of the multiple attenuation maps and the image components affected by the operation.
24. The method of any of claim 14 to 23, wherein applying the attenuation map to the image reduces the energy required to display the image.
25. The method of any of claims 14 to 24 wherein the auxiliary picture carrying the attenuation map is an auxiliary picture for alpha blending identified as AUX_ALPHA.
26. The method of any of claims 14 to 24 wherein the auxiliary picture carrying the attenuation map is an auxiliary picture of specific type identified as AUX_ATTENUATION.
27. An apparatus comprising a processor configured to: obtain encoded data comprising an image, an attenuation map and a set of parameters, wherein the set of parameters comprises at least a first parameter representative of an operation for applying the attenuation map to the image, and a second parameter representative of a mapping between components of the attenuation map and components of the image affected by the operation; obtain an attenuated image by applying the attenuation map to the image to reduce values of components of the image by performing an operation based on the first parameter on components of the image selected based on the second parameter; and provide the attenuated image, wherein the parameters are carried through MPEG green metadata display adaptation syntax elements.
28. The apparatus of claim 27 wherein the set of parameters comprises at least a third parameter representative of a selected energy reduction rate.
29. The apparatus of claim 27 or 28 wherein the set of parameters comprises at least a fourth parameter representative of the attenuation map.
30. The apparatus of any of claims 27 to 29 wherein the first parameter is selected in a set of operations comprising an addition, a subtraction, a multiplication, a contrast sensitivity function and a division.
31. The apparatus of any of claims 27 to 30 wherein the second parameter is selected in a set comprising applying a component of the attenuation map to luma component, applying a component of the attenuation map to luma and both chroma components, applying a component of the attenuation map to three RGB components, applying a first component of the attenuation map to luma component and a second component of the attenuation map to both chroma components, applying three components of the attenuation map respectively to luma component and both chroma components, applying three components of the attenuation map respectively to three RGB components, applying one component of the attenuation map to the first component, applying one component of the attenuation map to the second component, and applying one component of the attenuation map to the third component.
32. The apparatus of any of claims 27 to 31 wherein a plurality of attenuation maps is used, wherein each of the attenuation maps is associated with a respective energy reduction rate.
33. The apparatus of claim 32 further comprising performing an interpolation between a first attenuation map associated with a first energy reduction rate and a second attenuation map associated with a second energy reduction rate to obtain an attenuation map corresponding to a third attenuation rate comprised in a range comprised between the first and the second energy reduction rates.
34. The apparatus of any of claims 27 to 33 wherein the set of parameters further comprises a fifth parameter representative of a pre-processing operation and wherein the pre-processing operation based on the fifth parameter is performed on the attenuation map prior to its application to the image.
35. The apparatus of claim 34 wherein the pre-processing operation is an up-sampling operation from a resolution of the attenuation map to a resolution of the image and the fifth parameter determines a type of up-sampling function selected among a set of up-sampling functions comprising at least linear scaling, bilinear interpolation, Lanczos, and bicubic.
36. The apparatus of any of claim 27 to 35 further comprising using multiple attenuation maps for multiple components of the image and further comprising parameters representative of the mapping between the components of the multiple attenuation maps and the image components affected by the operation.
37. The apparatus of any of claim 27 to 36, wherein applying the attenuation map to the image reduces the energy required to display the image.
38. The apparatus of any of claims 27 to 37 wherein the auxiliary picture carrying the attenuation map is an auxiliary picture for alpha blending identified as AUX_ALPHA.
39. The apparatus of any of claims 27 to 37 wherein the auxiliary picture carrying the attenuation map is an auxiliary picture of specific type identified as AUX_ATTENUATION.
40. An apparatus comprising a processor configured to: obtain an input image of a video; determine an attenuation map based on the input image according to a selected energy reduction rate, wherein applying the attenuation map to the input image reduces values of components of the input image; and generate an encoded video comprising at least the input image, the attenuation map and a set of parameters, wherein the set of parameters comprises at least a first parameter representative of an operation for applying the attenuation map to an image, and a second parameter representative of a mapping between components of the attenuation map and components of the image affected by the operation, wherein the parameters are carried through MPEG green metadata display adaptation syntax elements.
41. The apparatus of claim 40 wherein the set of parameters comprises at least a third parameter representative of the selected energy reduction rate.
42. The apparatus of claim 40 or 41 wherein the set of parameters comprises at least a fourth parameter representative of the attenuation map.
43. The apparatus of any of claims 40 to 42 wherein the first parameter is selected in a set of operations comprising an addition, a subtraction, a multiplication, a contrast sensitivity function and a division.
44. The apparatus of any of claims 40 to 43 wherein the second parameter is selected in a set comprising applying a component of the attenuation map to luma component, applying a component of the attenuation map to luma and both chroma components, applying a component of the attenuation map to three RGB components, applying a first component of the attenuation map to luma component and a second component of the attenuation map to both chroma components, applying three components of the attenuation map respectively to luma component and both chroma components, applying three components of the attenuation map respectively to three RGB components, applying one component of the attenuation map to the first component, applying one component of the attenuation map to the second component, and applying one component of the attenuation map to the third component.
45. The apparatus of any of claims 40 to 44 wherein a plurality of attenuation maps is used and each of the attenuation maps is associated with a respective energy reduction rate.
46. The apparatus of claim 45 further comprising performing an interpolation between a first attenuation map associated with a first energy reduction rate and a second attenuation map associated with a second energy reduction rate to obtain an attenuation map corresponding to a third attenuation rate comprised in a range comprised between the first and the second energy reduction rates.
47. The apparatus of any of claims 40 to 46 wherein the set of parameters further comprises a fifth parameter representative of a pre-processing operation and wherein the pre-processing operation based on the fifth parameter is performed on the attenuation map prior to its application to the image.
48. The apparatus of claim 47 wherein the pre-processing operation is an up-sampling operation from a resolution of the attenuation map to a resolution of the image and the fifth parameter determines a type of up-sampling function selected among a set of up-sampling functions comprising at least linear scaling, bilinear interpolation, Lanczos, and bicubic.
49. The apparatus of any of claim 40 to 48 further comprising using multiple attenuation maps for multiple components of the image and further comprising parameters representative of the mapping between the components of the multiple attenuation maps and the image components affected by the operation.
50. The apparatus of any of claim 40 to 49, wherein applying the attenuation map to the image reduces the energy required to display the image.
51. The apparatus of any of claims 40 to 50 wherein the auxiliary picture carrying the attenuation map is an auxiliary picture for alpha blending identified as AUX_ALPHA.
52. The apparatus of any of claims 40 to 50 wherein the auxiliary picture carrying the attenuation map is an auxiliary picture of specific type identified as AUX_ATTENUATION.
53. The apparatus of any of claims 27 to 52, wherein the apparatus is one of a television, a television signal receiver, a set-top box, a gateway device, a mobile device, a cell phone, a tablet, a computer, a laptop, or other electronic device.
54. A bitstream, formatted to include syntax elements in accordance with the method of any of claims 14 to 26.
55. A signal comprising data generated according to the method of any of claims 14 to 26.
56. A computer program comprising instructions, which, when executed by a computer, cause the computer to carry out the method according to any of claims 1 to 26.
57. A non-transitory computer readable medium storing executable program instructions to cause a computer executing the instructions to perform a method according to any of claims 1 to 26.
Applications Claiming Priority (4)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
EP23305566.4 | 2023-04-14 | |
EP23305566 | | |
EP23306068.0 | 2023-06-29 | |
EP23306068 | | |
Publications (1)

Publication Number | Publication Date
---|---
WO2024213421A1 (en) | 2024-10-17

Family

ID=90718556

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
PCT/EP2024/058768 (WO2024213421A1, en) | Method and device for energy reduction of visual content based on attenuation map using mpeg display adaptation | 2023-04-14 | 2024-03-29

Country Status (1)

Country | Link
---|---
WO (1) | WO2024213421A1 (en)
Non-Patent Citations (6)

- "WD of ISO/IEC 23001-11 AMD 2 Energy-efficient media consumption for new display power reduction metadata", no. n22636, 5 June 2023, XP030310949.
- C.-H. Demarty (InterDigital) et al., "AHG9: Attenuation Map Information SEI for reducing energy consumption of displays", no. JVET-AC0122; m61700, 13 January 2023, XP030306682.
- E. Francois (InterDigital) et al., "[19] BoG Report on Carriage of Green metadata", no. m62358, 20 January 2023, XP030308207.
- C. Herglotz (FAU) et al., "AHG9: Green Metadata SEI message for VVC", no. JVET-W0071; m57185, 1 July 2021, XP030295936.
- J.-R. Ohm, "Meeting Report of the 29th JVET Meeting", no. JVET-AC1000; m62425, 17 February 2023, XP030308361.
- O. Le Meur (InterDigital) et al., "AHG9: Attenuation Map Information SEI for reducing energy consumption of displays", no. JVET-AD0121; m62762, 14 April 2023, XP030308770.