CN118476231A - Film grain parameter adaptation based on viewing environment

Publication number: CN118476231A (application CN202280084924.7A)
Authority: CN (China)
Legal status: Pending
Classifications
- H04N19/46—Embedding additional information in the video signal during the compression process
- H04N19/80—Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
Abstract
Methods, systems, and bitstream syntax are described for metadata signaling and film grain parameter adaptation based on a viewing environment that may be different from a reference environment. An example adaptation model is provided for viewing parameters including: ambient room lighting, viewing distance, and number of pixels per inch of the target display. Example systems include single reference viewing environment models and multi-reference viewing environment models that support adaptation of film grain model parameters by an adaptation function or interpolation.
Description
Cross Reference to Related Applications
The present application claims priority from U.S. provisional patent application 63/292,654, filed on 22 December 2021, and European patent application 22152455.6, filed on 20 January 2022, both of which are incorporated herein by reference in their entirety.
Technical Field
This document relates generally to images. More particularly, embodiments of the present invention relate to the adaptation of film grain parameters for image and video sequences based on a viewing environment.
Background
WO 2021/127628 A1 discloses an apparatus and method providing a software- and hardware-based solution to the problem of digital image noise synthesis. According to one aspect, a probability image is generated and noise blocks are randomly placed in the probability image at locations whose probability values satisfy a threshold criterion, thereby creating a composite noise image. Features include generating a composite film grain image and synthesizing a digital camera noise image.
WO 2021/122367 A1 discloses a decoder which obtains film grain model syntax elements from a set of parameters in a coded data representation. The decoder determines the film grain model value by decoding the film grain model syntax element. The decoder decodes the current image from the encoded data representation. The decoder generates an output picture by applying the generated film grain to the current picture. The decoder outputs the output picture.
Film grain is generally defined as random optical texture in processed photographic film, due to the presence of small metallic silver particles, or dye clouds, developed from silver halide that has received sufficient photons. In the entertainment industry, particularly in movies, film grain is considered part of the authoring process and intent. Thus, although digital cameras do not produce film grain, it is not uncommon for simulated film grain to be added to material captured by a digital video camera to mimic the "film look and feel".
Film grain presents challenges to image and video compression algorithms due to its randomness, because: a) like random noise, it may reduce the compression efficiency of the coding algorithms used for motion picture coding and distribution, and b) the original film grain may be filtered and/or altered by the lossy characteristics of the coding algorithm, thereby altering the director's creative intent. When encoding motion pictures, it is therefore important to preserve the director's intended film look and feel while maintaining coding efficiency in the compression process.
To process film grain more effectively, coding standards such as AVC, HEVC, VVC, and AV1 (references [1-4]) support Film Grain Technology (FGT). FGT in a media workflow consists of two main parts: film grain modeling and film grain synthesis. In the encoder, film grain is removed from the content and modeled according to a film grain model, and the film grain model parameters are transmitted as metadata in the bitstream. This allows for more efficient encoding. At the decoder, the film grain is simulated according to the model parameters and reinserted into the decoded image prior to display, thereby preserving the authoring intent.
The term "metadata" herein relates to any ancillary information transmitted as part of the encoded bitstream and assists the decoder in rendering the decoded image. Such metadata may include, but is not limited to, color space or color gamut information, reference display parameters, and film grain modeling parameters, as described herein.
Film grain technology is not limited to content that contains true film grain. FGT can also be used to conceal compression artifacts at the decoder by adding artificial film grain, which is very useful for very-low-bit-rate applications, especially in mobile media.
The main purpose of FGT is to synthesize film grain that approximates the original film grain look and feel as the colorist would recognize it under a reference viewing environment. For the end user, however, the actual viewing environment may be quite different from the reference viewing environment. Experiments by the inventors have shown that the viewing environment can change the perception of film grain. As recognized by the inventors, it is desirable to maintain a consistent film grain look and feel under various viewing environments, and improved techniques for film grain parameter adaptation based on the viewing environment are therefore described herein.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Thus, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, unless otherwise indicated, it should not be assumed that the problem posed for one or more methods has been recognized in any prior art in light of this section.
Disclosure of Invention
The invention is defined by the independent claims. The dependent claims relate to optional features of some embodiments of the invention.
Drawings
Embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which like reference numerals refer to similar elements and in which:
FIG. 1A illustrates an exemplary end-to-end flow of film grain techniques when film grain is likely to be part of the original input video;
fig. 1B illustrates an exemplary end-to-end flow of film grain techniques when film grain may not be part of the original input video but rather added in the decoder;
FIG. 2 illustrates an example of a process flow for updating film grain metadata parameters based on a viewing environment in accordance with an embodiment of the present invention; and
Fig. 3A, 3B and 3C illustrate an example process flow for updating film grain metadata parameters based on a viewing environment in accordance with an embodiment of the present invention.
Detailed Description
Example embodiments are described herein that relate to film grain parameter adaptation to a viewing environment. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the various embodiments of the invention. It may be evident, however, that the various embodiments of the invention may be practiced without these specific details. In other instances, well-known structures and devices are not described in detail in order to avoid unnecessarily occluding, obscuring, or obfuscating embodiments of the present invention.
Overview
Example embodiments described herein relate to film grain parameter adaptation based on a viewing environment. In one embodiment, a processor receives an input video bitstream and associated input film grain information. The processor:
parsing the input film grain information to generate input film grain parameters (301) for generating film noise for the target display;
Accessing measured viewing parameters of the target display (312);
Accessing a reference viewing parameter of a reference display;
adjusting (315) one or more input film grain parameters based on the measured viewing parameter and the reference viewing parameter to generate an adjusted film grain parameter;
generating output film noise based at least on the adjusted film grain parameters;
decoding an input video bitstream to generate decoded video pictures; and
Output film noise is mixed (320) with the decoded video picture to generate an output video picture on the target display.
In a second embodiment, a processor receives an input video bitstream and two or more sets of associated input film grain information (321), each set of information corresponding to a different target viewing environment, and each set of information including film grain parameters for generating film noise for a target display. The processor:
Accessing measured viewing parameters of the target display (312);
selecting (325) a selected one of the two or more sets of input film grain information having parameters closest to the measured viewing parameter based on the measured viewing parameter;
Parsing the selected set of input film grain information to generate output film grain parameters for generating film noise for a target display;
Generating output film noise based at least on the output film grain parameters;
Decoding an input video bitstream to generate decoded video pictures; and
Output film noise is mixed (320) with the decoded video picture to generate an output video picture on the target display.
In a third embodiment, a processor receives an input video bitstream and two or more sets of associated input film grain information (331), each set of information corresponding to a different target viewing environment, and each set of information including film grain parameters for generating film noise for a target display. The processor:
Accessing measured viewing parameters of the target display (312);
interpolating (340) parameters in two or more sets of input film grain information based on the measured viewing parameters to generate output film grain parameters;
Generating output film noise based at least on the output film grain parameters;
Decoding an input video bitstream to generate decoded video pictures; and
Output film noise is mixed (320) with the decoded video picture to generate an output video picture on the target display.
Film grain data stream process
With reference to existing coding standards, in AVC, HEVC, and VVC (references [1-3] and reference [6]), collectively referred to as the MPEG or MPEG video standards, film grain model parameters are carried in film-grain-specific Supplemental Enhancement Information (SEI) messages. SEI messaging (including film grain SEI messaging) is not normative. SMPTE RDD 5-2006 (reference [5]), the Film Grain Technology decoder specification, specifies a bit-accurate film grain simulation. In AV1 (reference [4]), the film grain model parameters are carried in the "film grain parameters syntax" part of the bitstream. Unlike in the MPEG standards, film grain synthesis in AV1 is normative.
Fig. 1A depicts an exemplary end-to-end flow (100A) of film grain technology when film grain may be part of the original input video. As shown in Fig. 1A, during the encoding process (80), given an input video sequence (102), a film grain removal step (105) analyzes the video and applies denoising or other filtering techniques known in the art to reduce or remove film grain, generating a film-grain-free video (107). The film-grain-free video is then encoded by encoder 110 (e.g., using AVC, HEVC, AV1, etc.). In parallel, an estimate of the film grain noise (109) (e.g., extracted from the input video 102) is processed by a film grain modeling process 120 to generate parameters that can be used by a decoding process (90) to reproduce a close approximation of the original film grain according to the film grain model. These parameters are embedded as metadata (122) in the encoded bitstream (112). The metadata may be part of the bitstream syntax or part of supplemental information (e.g., SEI messages, etc.).
During the decoding process (90), a video decoder (130) (e.g., an AVC, HEVC, AV1, or similar decoder) receives the encoded bitstream (112) and the corresponding film grain metadata (122) to generate a decoded video bitstream (132) and FG parameters (134), typically the same parameters as generated in step 120 of the encoding process. The film grain synthesis process 140 applies these FG parameters to generate synthetic film grain (142), which, when added to the decoded film-grain-free video (132), generates an output video (152) that is a close approximation of the input video (102).
Fig. 1B depicts an exemplary end-to-end flow (100B) of film grain technology when film grain may not be part of the original input video but may be added during the decoding process. As shown in Fig. 1B, during encoding (95), given an input video sequence (102) that may not contain film grain, a content analysis step (160) may consider the input characteristics and the encoding characteristics of the encoder (110) to determine which type of synthetic film grain noise may improve video quality or mimic a "film look and feel" when added to the decoded video. The output of this analysis is a set of film grain model parameters that can be embedded as metadata (122) in the encoded bitstream (112). The metadata may be part of the bitstream syntax or part of supplemental information (e.g., SEI messages, etc.).
The decoding process (90) in flow 100B is the same as the decoding process in flow 100A. After decoding the encoded bitstream (112), a film grain synthesis process (140) applies the extracted FG parameters to generate synthetic film grain (142), which, when added to the decoded film-grain-free video (132), generates an output video (152) that is a close approximation of the input video (102).
MPEG film grain metadata
In AVC, HEVC, and VVC (references [1-3] and reference [6]), collectively referred to as MPEG or MPEG video for ease of discussion, film grain model parameters are part of the syntax related to Film Grain Characteristics (FGC) or Film Grain (FG) SEI messaging. Film grain synthesis (FGS) is characterized primarily by the following parameter sets:
- Film grain model: a frequency filtering model or an autoregressive (AR) model
- Blending mode: an additive or a multiplicative mode
- Intensity intervals: a lower bound and an upper bound for each interval
- Component model values: syntax parameters defining the characteristics of the method used in a particular grain model.
As an example, table 1 lists some key parameters supported in the FGC SEI of AVC.
In the VVC SEI (reference [6]), these parameters may be named slightly differently, e.g., fg_model_id, fg_separate_colour_description_present_flag, fg_blending_mode_id, etc.
Table 1: Film grain characteristics SEI parameters in MPEG
where,
for film_grain_model_id = 0, range_0: [0, limit_model_0];
for film_grain_model_id = 1, range_1: [-limit_model_1, limit_model_1 - 1];
where limit_model_0 = 2^filmGrainBitDepth[c] - 1, and
limit_model_1 = 2^(filmGrainBitDepth[c] - 1).
Note that: for HEVC and VVC, film_grain_characteristics_repetition_period is replaced with fg_characteristics_period_flag encoded with u (1) (1 bit unsigned integer).
In Table 1, component values are used to specify the strength, shape, density, or other characteristics of the film grain. For example, for the frequency model (film_grain_model_id = 0), the parameters in comp_model_value define the following film grain parameters:
comp_model_value[0]: σ (standard deviation of the Gaussian noise generator)
comp_model_value[1]: horizontal high cutoff frequency
comp_model_value[2]: vertical high cutoff frequency
comp_model_value[3]: horizontal low cutoff frequency
comp_model_value[4]: vertical low cutoff frequency
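As an illustration of how these parameters shape the grain, the following Python sketch band-limits Gaussian noise in the DCT domain between the signaled low and high cutoff frequencies. This is in the spirit of, but not identical to, the bit-exact SMPTE RDD-5 process; the block size, orthonormal DCT, and DC suppression are assumptions made for this sketch.

```python
import numpy as np
from scipy.fft import dctn, idctn

def synthesize_grain_block(comp_model_value, block_size=64, rng=None):
    # comp_model_value: [sigma, h_high, v_high, h_low, v_low], as listed above.
    sigma, h_high, v_high, h_low, v_low = comp_model_value[:5]
    rng = rng or np.random.default_rng()
    noise = rng.standard_normal((block_size, block_size))
    coeffs = dctn(noise, norm="ortho")
    mask = np.zeros_like(coeffs)
    # Keep only DCT coefficients inside the signaled frequency band.
    mask[v_low:v_high + 1, h_low:h_high + 1] = 1.0
    mask[0, 0] = 0.0  # suppress the DC term so the grain is zero-mean
    return sigma * idctn(coeffs * mask, norm="ortho")
```

Lowering the cutoff frequencies keeps fewer high-frequency coefficients and therefore yields coarser, larger-looking grain, which is the behavior the adaptation functions described later rely on.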
Similarly, for the autoregressive (AR) model (film_grain_model_id = 1):
comp_model_value[0]: σ (standard deviation of the Gaussian noise generator)
comp_model_value[1]: first-order correlation with adjacent samples (x-1, y) and (x, y-1)
comp_model_value[2]: correlation between successive color components
comp_model_value[3]: first-order correlation with adjacent samples (x-1, y-1) and (x+1, y-1)
comp_model_value[4]: aspect ratio of the modeled grain
comp_model_value[5]: second-order correlation with adjacent samples (x-2, y) and (x, y-2)
The synthetic grain G[c][x][y] can be calculated for each color component [c] at sample position [x][y] as follows:
G[c][x][y] = σ * n[c][x][y] + a_{-1,0} * G[c][x-1][y] + a_{0,-1} * G[c][x][y-1] + a_{-1,-1} * G[c][x-1][y-1] + a_{1,-1} * G[c][x+1][y-1] + a_{-2,0} * G[c][x-2][y] + a_{0,-2} * G[c][x][y-2] + b * G[c-1][x][y],
where n is a random value with a normalized Gaussian distribution. The values of the parameters σ, a_{-1,0}, a_{0,-1}, a_{-1,-1}, a_{1,-1}, a_{-2,0}, a_{0,-2}, and b are determined for each intensity interval from the corresponding model values signaled in the FGC SEI message.
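For illustration, a minimal Python sketch of the AR recursion above is given below, assuming raster-scan processing of a single grain block with zero values outside the block; the block size and the dictionary layout of the coefficients are assumptions of this sketch, not part of any standard.

```python
import numpy as np

def ar_grain_block(sigma, a, b, prev=None, size=64, rng=None):
    # `a` maps the causal neighbor offsets (dx, dy) to AR coefficients, e.g.
    # {(-1, 0): ..., (0, -1): ..., (-1, -1): ..., (1, -1): ...,
    #  (-2, 0): ..., (0, -2): ...}. `prev` is the grain block of the
    # previous color component G[c-1] (None for the first component).
    rng = rng or np.random.default_rng()
    g = np.zeros((size, size))
    n = rng.standard_normal((size, size))  # normalized Gaussian noise n[c]
    for y in range(size):
        for x in range(size):
            v = sigma * n[y, x]
            for (dx, dy), coef in a.items():
                xx, yy = x + dx, y + dy
                if 0 <= xx < size and 0 <= yy < size:
                    v += coef * g[yy, xx]  # causal neighbors already computed
            if prev is not None:
                v += b * prev[y, x]  # cross-component term b * G[c-1][x][y]
            g[y, x] = v
    return g
```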
The addition of film grain can provide a number of benefits, including: providing a film-like look and feel, increasing sharpness, reducing coding artifacts, and reducing banding artifacts. For film grain content, the colorist approves the film grain look and feel in a reference viewing environment (e.g., reference [7]); however, the viewing environment of the viewer may be completely different. It is widely recognized that the surrounding viewing environment in which viewers experience video content can have a significant impact on perceived quality. For example, the experience of watching video in a dark cinema, a typical home (at night or during the day), or outdoors may vary greatly. The viewing experience may also be affected by the type of target display (e.g., television, cell phone, tablet, notebook, etc.) and the distance or viewing angle of the viewer from the display.
Experimental studies by the inventors have shown that as the viewing environment changes, the perception of film grain also changes. The main influencing factors include:
Ambient light (e.g., dark or bright environment)
Display characteristics (e.g. screen size, resolution, display density, contrast, brightness, etc.)
Viewing distance (or viewing angle) of the viewer from the display
For example, in a typical film grain modeling environment, a colorist may define film grain parameters based on viewing content in a dark room at a normal viewing distance (e.g., 2 times the screen height) on a reference high-definition display. In the decoder, film grain synthesis may then need to be adjusted depending on the viewing environment. For example, in a dimly lit room, at a shorter viewing distance, and on a lower-resolution display, the film grain needs to be denser and smaller. On the other hand, when the viewer is in a bright room, at a far viewing distance, and uses a high-resolution display, larger film grain may provide a better user experience.
Another important factor is the number of viewers. For cell phones or small-resolution displays, multi-viewer use cases are unlikely, but special care should be taken in a group viewing environment (e.g., in front of a living-room television). In such a case, one may wish to prohibit changing the film grain synthesis parameters related to viewing distance (e.g., using only default values) and only allow the FG model parameters to be adjusted according to ambient light and display parameters.
In an embodiment, film grain model adaptation may be performed manually through a user interface, or automatically through sensors in the room or on a display device, or by using a combination of both methods. For example, the user may send viewing parameters to the decoder through a decoder interface or a mobile application, etc. Or ambient light and distance sensors (on a display or other device) may automatically capture such information and provide it to the film grain composition model.
Fig. 2 depicts an example embodiment of film grain parameter adaptation. As shown in Fig. 2, given the coded bitstream (210), the receiver can also receive metadata related to the reference environment and the film grain model. In addition, the receiver may receive viewing environment parameters (205) (e.g., ambient light, viewing distance, and display parameters) via sensors and/or user input. Given these inputs, the receiver can apply a film grain model adjustment function (215) to generate updated film grain parameters (220) for film grain synthesis and blending.
Example embodiments of film grain parameter adaptation include three alternative methods, described in further detail later. In a first embodiment, a single FG model is signaled for each frame. The FG model may be associated with a reference viewing environment, either determined by other (optional) SEI messages or based on known standards, such as reference [7] or reference [8]. If the display viewing environment differs from the reference environment, the decoder (or user) may apply the proposed adaptation method to adjust the FG model. In a second embodiment, a list of FG models (for various viewing scenarios) may be signaled, and the decoder (or user) may select the model closest to the actual viewing environment. In a third method, FG models are classified. For each category, one or more models may be specified. The decoder (or user) may select a category according to the viewing parameters and then apply interpolation to generate the FG model that gives the best viewing experience.
To simplify the description, and without loss of generality, the exemplary adaptation functions are described only for the following three viewing environment parameters: ambient light, display pixels/dots per inch (ppi or dpi), and viewing distance; the method can easily be extended to include other factors or parameters, such as the contrast of the display, the brightness of the display, the viewing angle, the display mode (e.g., vivid, movie, normal), etc. It should also be noted that while the examples use MPEG SEI messaging parameters, the proposed embodiments are not limited to any particular FG model and are applicable to existing models (e.g., those described by MPEG and AV1) and future FG models.
Single film grain reference viewing model
In MPEG, FG model parameters may be conveyed to the decoder using film grain SEI messages; however, such messaging does not contain any information about the reference viewing environment. In MPEG, additional SEI messages may be used to describe parameters related to the viewing environment. For example, one such SEI message relates to the "mastering display color volume", as shown in Table 2. This SEI message identifies the color volume (the color primaries, white point, and luminance range) of a display that is considered to be the mastering display for the associated video content, e.g., the color volume of a display that was used for viewing while authoring the video content. As another example, a second SEI message relates to the "ambient viewing environment", as shown in Table 3. It identifies the characteristics of a nominal ambient viewing environment for the display of the associated video content.
Table 2: Mastering display color volume SEI message syntax in VSEI
Table 3: Ambient viewing environment SEI message syntax in VSEI
In a first embodiment, a method is proposed that specifies a reference viewing environment associated with FGC SEI messages and recommends how to adjust FG model parameters when the actual viewing environment differs from the reference viewing environment.
For simplicity and without loss of generality, consider three viewing parameters that need to be specified: the ppi of the display, the ambient light level, and the viewing distance. When a mastering display color volume SEI message is present, it specifies the display information for the FGC SEI. When an ambient viewing environment SEI message is present, it specifies the ambient viewing environment. Metadata may be added to indicate the appropriate viewing distance, or it may be assumed that best practices are used, such as those defined in references [7,8]. For example, for a high-definition display, the distance of the mastering reference display from the viewer should be about 3 to 3.2 times the height of the display image (reference [8]). For UHD-resolution displays, the standard guidelines suggest that the reference display should be located at a distance of 1.6 to 3.2 times the picture height (reference [7]). If no SEI messages related to the mastering environment are present, it can be assumed that best practices are applied according to the spatial resolution of the incoming pictures.
When the actual viewing environment is different from the reference viewing environment, it is proposed to update the original FG model parameters as follows:
- The darker the room, the smaller and dimmer the grain should be; the brighter the room, the larger and brighter the grain can be.
- The smaller the ppi (pixels per inch), the smaller the grain should be; the larger the ppi, the larger the grain can be.
- The shorter the viewing distance, the smaller the grain should be; the larger the viewing distance, the larger the grain can be.
In one embodiment, one functional model may be built for each rule, and the three functional models are then multiplied to form the final FG model. For example, denote the reference ambient light measurement as L_r, the reference pixels per inch as p_r, and the reference viewing distance as d_r. Denoting the corresponding measured values as L_m, p_m, and d_m, the ratios between the measured and reference parameters can be defined as
L = L_m / L_r, p = p_m / p_r, d = d_m / d_r.  (1)
The film grain parameters can then be adjusted by some predefined function. For example, for the frequency model parameters in the MPEG FG SEI model:
σ′ = f_σ(σ, L, p, d).  (2)
In one embodiment, examples of the functional model may include:
f(σ, L, p, d) = σ * (a_L + b_L*L) * (a_p + b_p*p) * (a_d + b_d*d),  (3)
where example values include a_L = 1, b_L = 0.01, a_p = 1, b_p = 0.5, a_d = 1, b_d = 2, and σ is typically in the range [1, 4]; or
f(σ, L, p, d) = σ * (a_L*exp(b_L*L)) * (a_p*exp(b_p*p)) * (a_d*exp(b_d*d)),  (4)
or
f(σ, L, p, d) = σ * (a_L*exp(b_L*L)) * (a_p + b_p*p) * (a_d + b_d*d).  (5)
To make the noise stronger, the σ value should be increased. To increase the film grain size, the low and high cutoff frequencies in both the horizontal and vertical directions should be reduced, so that fewer high-frequency DCT coefficients are retained. Since the cutoff frequency values are integers, rounding and clipping may be applied as follows:
f(σ, L, p, d) = clip3(round(f(σ, L, p, d)), 0, blocksize - 1),  (6)
where clip3(x, min, max) clips x to the range [min, max].
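As an illustration only, the following Python sketch applies equations (1), (3), and (6) to a frequency-model parameter set, using the example constants quoted above. The dictionary keys, the reuse of the same functional form for the cutoff frequencies, and the clip3 convention are assumptions of this sketch; in practice, as noted above, the cutoff constants would be chosen so that the cutoffs decrease when larger grain is desired.

```python
def adapt_frequency_params(fg, measured, reference, block_size=64):
    # Ratios per equation (1): measured over reference viewing parameters.
    L = measured["ambient_lux"] / reference["ambient_lux"]
    p = measured["ppi"] / reference["ppi"]
    d = measured["distance"] / reference["distance"]

    def f(v, aL=1.0, bL=0.01, ap=1.0, bp=0.5, ad=1.0, bd=2.0):
        # Linear functional model of equation (3) with the example constants.
        return v * (aL + bL * L) * (ap + bp * p) * (ad + bd * d)

    def clip3(x, lo, hi):
        # Assumed convention: clip x to the inclusive range [lo, hi].
        return min(max(x, lo), hi)

    out = dict(fg)
    out["sigma"] = f(fg["sigma"])
    for key in ("h_high", "v_high", "h_low", "v_low"):
        # Cutoff frequencies are integers: round and clip per equation (6).
        out[key] = clip3(round(f(fg[key])), 0, block_size - 1)
    return out
```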
Other parameters, such as intensity_interval_lower_bound and intensity_interval_upper_bound, may also be adjusted by similar functional models.
For the AR model, the noise standard deviation σ may be adjusted in the same manner as in the frequency filtering model. To enlarge the film grain, in one embodiment, the AR coefficients of the far pixels, e.g., a_{-2,0} and a_{0,-2}, can be increased, and the coefficients of the near pixels, e.g., a_{-1,0} and a_{0,-1}, can be decreased. The coefficients a_{-1,-1} and a_{1,-1} may remain unchanged or be only slightly adjusted:
σ′ = f_σ(σ, L, p, d),
a′_{i,j} = f_a(a_{i,j}, L, p, d),  (7)
b′ = f_b(b, L, p, d).
Note that the sum of the final coefficients should be equal to 1. Therefore, the coefficients need to be normalized by the sum m of all the AR filter coefficients:
m = a′_{-2,0} + a′_{0,-2} + a′_{-1,0} + a′_{0,-1} + a′_{-1,-1} + a′_{1,-1} + b′.
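A minimal sketch of this AR adjustment and normalization is shown below. The specific gain factors applied to the far and near coefficients, and the reuse of the linear form of equation (3) for σ, are illustrative assumptions standing in for f_σ, f_a, and f_b of equation (7).

```python
def adapt_ar_params(sigma, a, b, L, p, d, far_gain=1.1, near_gain=0.9):
    # sigma is scaled like equation (3); far/near gains stand in for f_a.
    sigma_adj = sigma * (1 + 0.01 * L) * (1 + 0.5 * p) * (1 + 2.0 * d)
    a_adj = dict(a)
    for off in ((-2, 0), (0, -2)):      # far-pixel coefficients: increase
        if off in a_adj:
            a_adj[off] *= far_gain
    for off in ((-1, 0), (0, -1)):      # near-pixel coefficients: decrease
        if off in a_adj:
            a_adj[off] *= near_gain
    # Normalize by m so the final coefficients (including b) sum to 1.
    m = sum(a_adj.values()) + b
    a_adj = {k: v / m for k, v in a_adj.items()}
    return sigma_adj, a_adj, b / m
```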
Fig. 3A depicts an example data-processing flow for film grain parameter adaptation based on this adaptation method. As shown in Fig. 3A, a receiver may receive film grain related metadata as part of a bitstream (301). The receiver may then search for additional metadata related to the preferred viewing environment. If such metadata is found, it is read in step 310; otherwise, the receiver may employ best practices (305) or predefined default parameters to generate reference viewing parameters (307). Next, the receiver determines the actual viewing parameters (312) (e.g., via user input, sensors, etc.), and in step 315 the input FG model parameters (301) may be adjusted according to the viewing environment (312) and the reference parameters (from the input metadata or 307). Finally, in step 320, the adapted FG parameters are used to generate synthetic film grain (or film noise), which is mixed with the decoded bitstream data for display on the target display.
Multi-film grain reference viewing model
In another embodiment, the bitstream may include multiple sets of film grain related metadata for an encoded frame, where each FG model provides FG model parameters for a target viewing environment. The decoder may then select the model best suited to the viewer's environment.
For example, consider the MPEG SEI parameter set shown in Table 4 (reference [6]). Currently, Table 4 does not have any syntax elements to specify the viewing environment. To address this limitation, in one embodiment, parameters in the existing syntax may be reinterpreted to specify a different viewing environment. In one example, for display information, the syntax under fg_separate_colour_description_present_flag (see Table 4) may be reused to support providing target display information for a viewing environment. The benefit of this approach is that the current FGC SEI syntax table can be reused and backward compatibility can be maintained. One way to maintain backward compatibility is to always set fg_separate_colour_description_present_flag of the FGC SEI equal to 0. Then, when fg_separate_colour_description_present_flag is equal to 1, the available bits may be used to specify new parameters.
The relevant syntax elements under fg_separate_colour_description_present_flag include:
- fg_colour_primaries, for specifying the color primaries of the target display
- fg_transfer_characteristics, for specifying the transfer characteristics of the target display. For example, it can be used to identify whether the target display is SDR or HDR (under PQ or HLG encoding)
In one embodiment, the other four syntax parameters (fg_bit_depth_luma_minus8, fg_bit_depth_chroma_minus8, fg_full_range_flag, and fg_matrix_coeffs) may be used to represent other viewing-environment-related information. The first three syntax elements can provide 7 bits of information and the last syntax element can provide 8 bits of information. These bits may be used to provide other viewing environment parameters. For example, the first 2 bits may be used to signal information related to the viewing distance from the display. The following 2 bits may be used to signal display density (ppi) information. Test results indicate that what matters most is the ratio of pixel density to viewing distance, so the first 4 bits can alternatively be used to directly signal such a ratio. The next 3 bits may then be used to signal the maximum brightness of the target display. In total, 7 bits are required. The last 8 bits may be used to signal the surrounding environment. An example is shown in Table 5, where the newly proposed syntax elements are shown in italics.
Table 4: Film grain characteristics SEI message syntax in VSEI
Table 5: Example FG viewing environment syntax
fg_pixel_view_distance_ratio_idc represents the ratio of the pixel density (ppi: pixels per inch) to the viewing distance, in units of 20.
Note that: as the inventors have appreciated, one important parameter of FG adaptation is the ratio of ppi to viewing distance, i.e. the ratio of pixels per inch in the display to viewing distance, which indicates how many pixels the viewer can see on the screen.
fg_display_max_luminance_idc represents the maximum luminance of the display. An example mapping via a look-up table is given in Table 6.

Table 6: fg_display_max_luminance_idc mapping example

fg_display_max_luminance_idc | Display maximum luminance (nits)
0 | 100
1 | 300
2 | 600
3 | 800
4 | 1000
5 | 2000
6 | 4000
7 | 10000
fg_ambient_illuminance specifies the ambient illuminance of the surrounding viewing environment, in units of 7 lux.

Note: in one embodiment, it is desirable to cover ambient light levels from about 10 lux to about 400 lux. With the proposed precision, the range from 0 to 63 * 7 = 441 lux can be covered. The syntax can be adapted to cover alternative ranges.
fg_ambient_chromaticity_idc represents the background chromaticity of the surrounding viewing environment. Table 7 shows one mapping example.

Table 7: fg_ambient_chromaticity_idc mapping example

fg_ambient_chromaticity_idc | Background chromaticity
0 | D65
1 | D93
2 | D50
3 | Reserved
In another embodiment, additional syntax elements may be added to specify the viewing environment. Table 8 shows one example.
fg_target_display_primaries_x[c], fg_target_display_primaries_y[c], fg_target_display_white_point_x, fg_target_display_white_point_y, fg_target_display_max_luminance, and fg_target_display_min_luminance have the same semantics as the corresponding elements specified in the mastering display color volume SEI message.
fg_target_display_density specifies the physical pixel density, in pixels per inch (ppi), of the target display.
fg_ambient_illuminance, fg_ambient_light_x, and fg_ambient_light_y have the same semantics as the corresponding elements specified in the ambient viewing environment SEI message.
fg_view_distance specifies the distance from the viewer to the display, in units of 0.001 feet.
In such an adaptation scenario, the decoder needs to choose which FGC SEI to use from among the multiple sets based on its own viewing information. In one embodiment, assume that for set i the viewing-environment-related parameters are collected into a vector m_i, and that there are K such sets.
For the user's environment m, the optimal setting can be generated as
i_opt = arg min_i ‖w^T (m - m_i)‖,  (8)
where w is the vector of weighting factors for the FG viewing parameters. The value of w may be trained based on experimental data or determined otherwise. In its simplest form, it may be a unit vector for the three viewing parameters (e.g., [1, 1, 1]^T).
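A minimal sketch of this selection rule is shown below; the viewing-parameter vector layout and the all-ones default for w are assumptions of this sketch.

```python
import numpy as np

def select_fg_model(m, models, w=None):
    # `models` is a list of (m_i, fg_params) pairs, where m_i is the
    # viewing-parameter vector of set i, e.g., [ambient_lux, ppi, distance].
    m = np.asarray(m, dtype=float)
    w = np.ones_like(m) if w is None else np.asarray(w, dtype=float)
    costs = [abs(w @ (m - np.asarray(mi, dtype=float))) for mi, _ in models]
    i_opt = int(np.argmin(costs))  # equation (8)
    return models[i_opt][1]
```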
In another embodiment, the best model may be selected based only on the most critical viewing environment parameters (e.g., room illuminance, ratio of display density to viewing distance, etc.), and then the value closest to the user's environment may be selected.
In another embodiment, the decoder may have pre-stored a set of film grain models, each model for a different set of viewing parameters. In this case, the encoder may simply signal to the decoder an index pointing into this set. For example, in Table 9, if fg_view_environment_description_present_flag = 1, the viewing environment and the corresponding FG model are signaled; otherwise, only the index fg_target_view_model_idx of the pre-stored model is signaled.
Table 9: Example view_env_film_grain_characteristics SEI syntax table with a pointer to pre-stored FG models
Fg_target_view_model_idx specifies an index of a predetermined set of film grain models.
Fig. 3B shows an example procedure for FG parameter adaptation according to an embodiment of this method. Given input FG-related metadata (321) for K viewing environments (transmitted or pre-stored) and the parameters (312) of the current environment, the receiver identifies the best match (e.g., i_opt) among the K environments in step 325 and applies the selected parameters for film grain synthesis and blending in step 320.
Classified film grain reference viewing model
In another embodiment, FG models are classified. For each category, one or more FG models may be specified with metadata. The user/decoder may select a category and apply interpolation techniques to generate FG models that best fit their actual viewing conditions.
In an example embodiment, the categories may be based on room illuminance. Given a fixed room illuminance value, the FG model may be signaled for the two extreme pixel_view_distance_ratio cases. For any value between the two extreme pixel_view_distance_ratio values, the parameters may be generated by simple interpolation. For example, given boundary values A and B, the parameters for A < x < B can be interpolated as:
z = δB + (1 - δ)A,  (9)
where
δ = (x - A) / (B - A).
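A minimal sketch of this interpolation, applied to every parameter of the two boundary FG models, is given below; the dictionary representation of the model parameters is an assumption of this sketch.

```python
def interpolate_fg_params(x, A, B, params_A, params_B):
    # params_A / params_B: FG model parameters signaled for the boundary
    # pixel_view_distance_ratio values A and B, with A < x < B.
    delta = (x - A) / (B - A)  # per equation (9), so x = B yields z = B
    return {k: delta * params_B[k] + (1 - delta) * params_A[k]
            for k in params_A}
```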
Table 10 shows an example of signaling the maximum and minimum pixel_view_distance_ratio values. The function fg_model_parameters() signals FG model parameters based on the value of pixel_view_distance_ratio. These model parameters may belong to any FG model, such as a frequency model, an autoregressive (AR) model as in AVC/HEVC/VVC or AV1, or any other model. The model may be explicitly described in terms of variance/energy/brightness, shape, and related parameters, or in terms of curves.
Table 10: Example view_env_film_grain_characteristics SEI syntax table
fg_max_pixel_view_distance_ratio specifies the maximum ratio of the pixel density (ppi: pixels per inch) to the viewing distance, in units of 0.000001.
fg_min_pixel_view_distance_ratio specifies the minimum ratio of the pixel density (ppi: pixels per inch) to the viewing distance, in units of 0.000001.
In another embodiment, the categories may be based on both room illuminance and pixel_view_distance_ratio. Given a fixed room illuminance and pixel_view_distance_ratio, several target display maximum luminance values may be signaled for the model. For a given display with a particular maximum luminance value, the FG model may be interpolated between the models with the two closest maximum luminance values.
Table 11 shows an example of signaling a set of target display maximum luminance values. The function fg_model_parameters() signals FG model parameters based on the target display maximum luminance value. The model parameters may belong to any FG model, such as a frequency model, an AR model as in AVC/HEVC/VVC or AV1, or any other model. The model may be explicitly described in terms of variance, shape, and related parameters, or in terms of curves.
Table 11: Example view_env_film_grain_characteristics SEI syntax table
fg_num_target_display_minus1 plus 1 specifies the number of target displays described in the FG model.
fg_target_display_max_brightness[i] specifies the maximum brightness of the i-th target display. It has the same semantics as specified for the mdcv_max_display_mastering_luminance element.
Weighted combination example of film grain models
Consider receiving metadata defining a plurality of reference viewing environments with film grain parameters. In one embodiment, for the k-th environment, consider reference values defined by the ambient light measurement L_{rk}, pixels per inch p_{rk}, and viewing distance d_{rk}. The corresponding measured parameters are denoted L_m, p_m, and d_m. If all three parameters are considered, the three-dimensional distance between the measured parameters and the reference parameters can be calculated as:
D_k = (w_L(L_m - L_{rk})^2 + w_p(p_m - p_{rk})^2 + w_d(d_m - d_{rk})^2)^0.5,  (10)
where w_L, w_p, and w_d are optional weighting factors for assigning weighted importance to the different measurements. For example, depending on the viewing environment, some weights may be set to 0 (e.g., w_p = 0 and/or w_d = 0), or all weights may be set to 1. The two nearest viewing environments can be found by identifying the two values of k with the smallest D_k. This can be done simply by sorting {D_k} in ascending order and selecting the first two.
Without loss of generality, let the first and second viewing environments be the two nearest ones. Denote the first and second reference ambient light measurements as L_{r1} and L_{r2}, the pixels per inch as p_{r1} and p_{r2}, and the viewing distances as d_{r1} and d_{r2}. The corresponding measured parameters are denoted L_m, p_m, and d_m. The film grain parameters of the two associated environments are denoted σ_1, etc., and σ_2, etc.
The distances between the measured parameters and the two sets of reference parameters are:
D_1 = (w_L(L_m - L_{r1})^2 + w_p(p_m - p_{r1})^2 + w_d(d_m - d_{r1})^2)^0.5,
D_2 = (w_L(L_m - L_{r2})^2 + w_p(p_m - p_{r2})^2 + w_d(d_m - d_{r2})^2)^0.5.  (11)
The film grain parameters can then be adjusted by some predefined function:
σ′ = h_σ(σ_1, σ_2, L_{r1}, p_{r1}, d_{r1}, L_{r2}, p_{r2}, d_{r2}, L_m, p_m, d_m).  (12)
For example, considering a model for σ′, an example function is
σ′ = (D_2 * σ_1 + D_1 * σ_2) / (D_1 + D_2).
When the measured parameters (L_m, p_m, d_m) are the same as those of reference #1 (L_{r1}, p_{r1}, d_{r1}), D_1 = 0 and thus σ′ = σ_1. Similarly, σ′ = σ_2 when the measured parameters are the same as those of reference #2. When the measured values do not coincide with either reference, σ′ can be calculated using the weighting equation above. The same method can be used to calculate the other four parameters in equation (12).
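A minimal sketch of equations (10)-(12) is given below: compute the weighted distance to every reference environment, keep the two nearest, and blend each film grain parameter in inverse proportion to its distance. The dictionary keys are illustrative assumptions.

```python
import math

def blend_two_nearest(measured, environments, wL=1.0, wp=1.0, wd=1.0):
    # `environments` is a list (length >= 2) of dicts with the reference
    # values "L", "p", "d" and a "params" dict of film grain parameters
    # (e.g., {"sigma": ...}); `measured` uses the same "L", "p", "d" keys.
    def dist(env):  # weighted distance of equation (10)
        return math.sqrt(wL * (measured["L"] - env["L"]) ** 2
                         + wp * (measured["p"] - env["p"]) ** 2
                         + wd * (measured["d"] - env["d"]) ** 2)

    e1, e2 = sorted(environments, key=dist)[:2]  # two nearest environments
    D1, D2 = dist(e1), dist(e2)
    if D1 + D2 == 0:  # measured environment matches the references exactly
        return dict(e1["params"])
    # Inverse-distance weighting, matching the sigma' example above.
    return {k: (D2 * e1["params"][k] + D1 * e2["params"][k]) / (D1 + D2)
            for k in e1["params"]}
```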
Fig. 3C depicts an example data flow for FG parameter adaptation according to an embodiment of this method. The receiver accesses the parameters of one or more classified FG models (either via metadata or pre-stored). Then, given the parameters (312) of the viewing environment, interpolation techniques are applied in step 340 to generate the FG parameters that best match the viewing environment. Finally, in step 320, the interpolated parameters are applied to perform film grain synthesis and blending.
Various embodiments of the present disclosure have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the disclosure. Thus, the present invention may be embodied in any of the forms described herein, including but not limited to the following Enumerated Example Embodiments (EEEs), which describe the structure, features, and functions of certain portions of the present invention:
EEE 1. A method of processing film grain metadata, the method comprising:
receiving an input video bitstream and associated input film grain information;
parsing the input film grain information to generate input film grain parameters (301) for generating film noise for the target display;
Accessing measured viewing parameters of the target display (312);
Accessing a reference viewing parameter of a reference display;
adjusting (315) one or more input film grain parameters based on the measured viewing parameter and the reference viewing parameter to generate an adjusted film grain parameter;
generating output film noise based at least on the adjusted film grain parameters;
decoding an input video bitstream to generate decoded video pictures; and
Output film noise is mixed (320) with the decoded video picture to generate an output video picture on the target display.
EEE 2. The method of EEE 1, wherein the reference viewing parameters for the reference display are generated (310) by parsing metadata associated with the input video bitstream.
EEE 3. The method of EEE 1, wherein the reference viewing parameters of the reference display are generated based on predetermined values or values known from recommended practices in color grading.
EEE 4. The method of EEE 3, wherein the recommended practices in color grading include the standards BT.2035 or SMPTE 2080.
EEE 5. The method of any of EEEs 1-4, wherein the viewing parameters comprise an ambient light value, a pixels-per-inch value, and a viewing distance value.
EEE 6. The method of EEE 5, wherein adjusting the one or more input film grain parameters comprises:
Generating a ratio value between corresponding ones of the measured viewing parameters and the reference viewing parameters; and
The one or more input film grain parameters are adjusted based on the ratio values.
EEE 7. The method of EEE 6, wherein the adjusting comprises:
Adjusting film grain size and/or intensity in proportion to the ratio of measured ambient brightness to reference ambient brightness;
adjusting film grain size and/or intensity in proportion to the ratio of the measured viewing distance to the reference viewing distance; and
The film grain size and/or intensity is adjusted in proportion to the ratio of the measured pixels per inch to the reference pixels per inch.
EEE 8. The method of EEE 7, wherein the adjustment function for film grain noise comprises the calculation of:
f(σ, L, p, d) = σ * (a_L + b_L*L) * (a_p + b_p*p) * (a_d + b_d*d),
where σ represents the noise standard deviation determined in the input film grain parameters; L, p, and d represent the ratio values between the measured viewing parameters and the corresponding reference viewing parameters for ambient brightness, pixels per inch, and viewing distance; and a_L, b_L, a_p, b_p, a_d, and b_d represent film grain model adaptation constants.
EEE 9. A method of processing film grain metadata, the method comprising:
Receiving an input video bitstream and accessing two or more sets of associated input film grain information (321), each set of information corresponding to a different target viewing environment, and each set of information including film grain parameters for generating film noise for a target display;
Accessing measured viewing parameters of the target display (312);
selecting (325) a selected one of the two or more sets of input film grain information having parameters closest to the measured viewing parameter based on the measured viewing parameter;
Parsing the selected set of input film grain information to generate output film grain parameters for generating film noise for a target display;
Generating output film noise based at least on the output film grain parameters;
Decoding an input video bitstream to generate decoded video pictures; and
Output film noise is mixed (320) with the decoded video picture to generate an output video picture on the target display.
EEE 10. The method of EEE 9, wherein the viewing environment may be determined by parameters comprising one or more of: an ambient light value in the viewing environment; a viewing distance from the target display in the viewing environment; a maximum luminance value of the target display; the number of pixels per inch of the target display; a ratio of pixels per inch of the target display to the viewing distance from the target display; an ambient chromaticity value in the viewing environment; and the x and y primaries of the target display.
EEE 11. The method of EEE 9 or EEE 10, wherein selecting the selected set of input film grain information is based on minimizing an error function between the measured viewing parameters and the corresponding parameters in the two or more sets of input film grain information.
An EEE 12. A method of processing film grain metadata, the method comprising:
receiving an input video bitstream and accessing two or more sets of associated input film grain information (331), each set of information corresponding to a different target viewing environment, and each set of information including film grain parameters for generating film noise for a target display;
Accessing measured viewing parameters of the target display (312);
interpolating (340) parameters from the two or more sets of input film grain information based on the measured viewing parameters to generate output film grain parameters;
Generating output film noise based at least on the output film grain parameters;
Decoding an input video bitstream to generate decoded video pictures; and
Output film noise is mixed (320) with the decoded video picture to generate an output video picture on the target display.
EEE 13. The method of EEE 12, wherein generating the output film grain parameters comprises:
Selecting, based on the measured viewing parameter, a first set and a second set of input film grain information having parameters closest to the measured viewing parameter from the two or more sets of input film grain information; and
The output film grain parameters are generated by applying interpolation functions based on corresponding parameters in the first and second sets of input film grain information.
EEE 14. The method of EEE 13, wherein the first and second sets of input film grain information are selected as the two sets, among the two or more sets of input film grain information, for which the distance measure between the measured viewing parameters and the corresponding parameters has the two smallest values.
EEE 15. The method of EEE 14, wherein calculating the distance measure comprises computing:
D(i) = (Σ_{k=1..M} w_k * (P(k)_m - P(k)_{r(i)})^2)^0.5,
where w_k, k = 1, ..., M, represent normalized weighting factors in [0, 1] for the M viewing environment parameters (M > 0), P(k)_m represents the measured value of the k-th viewing parameter, and P(k)_{r(i)} represents the corresponding viewing parameter value in the i-th set of film grain information.
EEE 16. The method of EEE 15, wherein calculating the interpolation function for the P-th film grain parameter comprises calculating:
f(P) = (D2 * P(P_1) + D1 * P(P_2)) / (D1 + D2),
where D1 and D2 represent distortion values calculated for the selected first and second sets of input film grain information, and P(P_1) and P(P_2) represent the values of the P-th film grain parameter defined in the first and second sets of input film grain information.
EEE 17. The method of any of EEEs 14-16, wherein the M viewing environment parameters comprise one or more of:
viewing an ambient light value in the environment;
viewing distance from the target display in the viewing environment;
maximum luminance value of the target display;
The number of pixels per inch of the target display; and
A ratio of pixels per inch of the target display to a viewing distance from the target display.
EEE 18. The method of EEE 12, wherein only two sets of input film grain information are received, one set corresponding to a lower limit of an environmental viewing parameter and one set corresponding to an upper limit of the environmental viewing parameter, and wherein calculating the interpolation function for the P-th film grain parameter comprises calculating
f(P) = δ * P(P_U) + (1 - δ) * P(P_L),
where
δ = (P_m - P_L) / (P_U - P_L),
and where P_m represents the measured value of the environmental viewing parameter, between P_L and P_U; P_L and P_U represent the lower and upper limits of the environmental viewing parameter; and P(P_L) and P(P_U) represent the corresponding film grain parameters in the two received sets of input film grain information.
EEE 19. The method of any of EEEs 1-18, wherein the input film grain information comprises film grain Supplemental Enhancement Information (SEI).
EEE 20. The method of any of EEEs 9-18, wherein the two or more sets of input film grain information are received with the input video bitstream via metadata, or are pre-stored in a decoder decoding the input video bitstream.
EEE 21. The method of any of EEEs 9-18, wherein at least one of the two or more sets of input film grain information is pre-stored in a decoder decoding the input video bitstream and is identified by an index parameter in metadata in the input video bitstream.
EEE 22. A non-transitory computer-readable storage medium having stored thereon computer-executable instructions for performing a method according to any one of EEEs 1-21 with one or more processors.
EEE 23. An apparatus comprising a processor and configured to perform any of the methods described in EEEs 1-21.
Reference to the literature
Each of the references listed herein is incorporated by reference in its entirety.
[1] Advanced Video Coding, Rec. ITU-T H.264, May 2019, ITU.
[2] High Efficiency Video Coding, Rec. ITU-T H.265, November 2019, ITU.
[3] Versatile Video Coding, Rec. ITU-T H.266, August 2020, ITU.
[4] AV1 Bitstream and Decoding Process Specification, P. de Rivaz et al., Version 1.0.0 with Errata, 2019-01-08.
[5] RDD 5-2006, SMPTE Registered Disclosure Document, "Film Grain Technology - Specifications for H.264 | MPEG-4 AVC Bitstreams," March 2006, SMPTE.
[6] Versatile supplemental enhancement information messages for coded video bitstreams, Rec. ITU-T H.274, August 2020, ITU.
[7] A reference viewing environment for evaluation of HDTV program material or completed programmes, Rec. ITU-R BT.2035 (07/2013), ITU.
[8] SMPTE 2080-3:2017, "Reference viewing environment for evaluation of HDTV images," SMPTE.
Example computer System implementation
Embodiments of the invention may be implemented with a computer system, systems configured in electronic circuitry and components, an integrated circuit (IC) device such as a microcontroller, a field programmable gate array (FPGA) or another configurable or programmable logic device (PLD), a discrete-time or digital signal processor (DSP), an application specific integrated circuit (ASIC), and/or an apparatus that includes one or more of such systems, devices, or components. The computer and/or IC may execute, control, or carry out instructions related to film grain parameter adaptation for a viewing environment, such as those described herein. The computer and/or IC may compute any of the various parameters or values related to film grain parameter adaptation for a viewing environment as described herein. The image and video embodiments may be implemented in hardware, software, firmware, and various combinations thereof.
Some implementations of the invention include a computer processor executing software instructions that cause the processor to perform a method of the invention. For example, one or more processors in a display, encoder, set-top box, transcoder, or the like may implement the methods described above in connection with film grain parameter adaptation for a viewing environment by executing software instructions in a program accessible to the processor. Embodiments of the invention may also be provided in the form of a program product. The program product may comprise any non-transitory and tangible medium carrying a set of computer-readable signals comprising instructions that, when executed by a data processor, cause the data processor to perform a method of the invention. Program products according to the invention may take a variety of non-transitory and tangible forms, including, for example, physical media such as magnetic data storage media (floppy disks, hard disk drives), optical data storage media (CD-ROMs, DVDs), and electronic data storage media (ROMs, flash RAM, and the like). The computer-readable signals on the program product may optionally be compressed or encrypted.

Where a component (e.g., a software module, processor, assembly, device, circuit, etc.) is referred to above, unless otherwise indicated, a reference to that component (including a reference to "a means") should be interpreted as including as equivalents of that component any component which performs the function of the described component (e.g., that is functionally equivalent), including components which are not structurally equivalent to the disclosed structure that performs the function in the illustrated example embodiments of the invention.
Equivalents, extensions, alternatives and miscellaneous
Example embodiments relating to film grain parameter adaptation for a viewing environment are thus described. In the foregoing specification, embodiments of the invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what is the invention, and is intended by the applicants to be the invention, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any definitions expressly set forth herein for terms contained in such claims shall govern the meaning of such terms as used in the claims. Hence, no limitation, element, property, feature, advantage or attribute that is not expressly recited in a claim should limit the scope of such claim in any way. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.
Claims (15)
1. A method of processing film grain metadata, the method comprising:
receiving an input video bitstream and associated input film grain information;
parsing the input film grain information to generate input film grain parameters (301) for generating film noise for a target display;
accessing measured viewing parameters (312) of the target display, wherein the viewing parameters include an ambient light value, a pixels-per-inch value, and a viewing distance value;
accessing reference viewing parameters of a reference display;
adjusting (315) one or more input film grain parameters based on the measured viewing parameters and the reference viewing parameters to generate adjusted film grain parameters, wherein the adjusting comprises:
generating ratio values between corresponding ones of the measured viewing parameters and the reference viewing parameters;
adjusting film grain size and/or intensity in proportion to the ratio of the measured ambient light value to the reference ambient light value;
adjusting film grain size and/or intensity in proportion to the ratio of the measured viewing distance to the reference viewing distance; and
adjusting film grain size and/or intensity in proportion to the ratio of the measured pixels per inch to the reference pixels per inch;
generating output film noise based at least on the adjusted film grain parameters;
decoding the input video bitstream to generate decoded video pictures; and
mixing (320) the output film noise with the decoded video pictures to generate output video pictures on the target display.
2. The method of claim 1, wherein the reference viewing parameters of the reference display are generated (310) by parsing metadata associated with the input video bitstream.
3. The method of claim 1, wherein the reference viewing parameters of the reference display are generated based on predetermined values or on values recommended by a standard, such as BT.2035 or SMPTE 2080.
4. The method of any of claims 1-3, wherein the adjustment function of film grain noise comprises calculating:
f(σ, L, p, d) = σ * (a_L + b_L * L) * (a_p + b_p * p) * (a_d + b_d * d),
where σ represents the noise standard deviation determined from the input film grain parameters; L, p, and d represent the ratio values between the measured viewing parameters and the corresponding reference viewing parameters for ambient light, pixels per inch, and viewing distance; and a_L, b_L, a_p, b_p, a_d, and b_d represent film grain model adaptation constants.
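Illustrative sketch (not part of the claims): the adjustment of claim 4 is a separable product of affine terms in the three viewing-parameter ratios. A minimal sketch follows; the function name and the sample constants are hypothetical.

```python
def adjust_grain_sigma(sigma, L, p, d, consts):
    """Scale the signaled grain standard deviation by independent affine
    factors in the ambient-light (L), pixels-per-inch (p) and
    viewing-distance (d) ratios."""
    a_L, b_L, a_p, b_p, a_d, b_d = consts  # film grain model adaptation constants
    return sigma * (a_L + b_L * L) * (a_p + b_p * p) * (a_d + b_d * d)


# With constants (1, 0, 1, 0, 1, 0) each factor is 1, so the signaled sigma
# passes through unchanged; a nonzero b_* term enables adaptation on that axis.
unchanged = adjust_grain_sigma(2.5, L=1.8, p=0.9, d=1.2, consts=(1, 0, 1, 0, 1, 0))
```

A convenient property of this form is that setting a_* = 1 and b_* = 0 disables a given axis of adaptation, for instance when a viewing parameter cannot be measured.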
5. A method of processing film grain metadata, the method comprising:
receiving an input video bitstream and accessing two or more sets of associated input film grain information (321), each set corresponding to a different target viewing environment, and each set including film grain parameters for generating film noise for a target display;
accessing measured viewing parameters of the target display (312);
selecting (325), based on the measured viewing parameters, the one of the two or more sets of input film grain information whose parameters are closest to the measured viewing parameters;
parsing the selected set of input film grain information to generate output film grain parameters for generating film noise for the target display;
generating output film noise based at least on the output film grain parameters;
decoding the input video bitstream to generate decoded video pictures; and
mixing (320) the output film noise with the decoded video pictures to generate output video pictures on the target display.
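Illustrative sketch (not part of the claims): the selection step of claims 5 and 7 amounts to a nearest-neighbor search over the signaled viewing environments. The sketch below uses a plain sum of squared differences as a stand-in for the unspecified error function; the data layout and all names are assumptions.

```python
def select_grain_set(measured, candidate_sets):
    """Return the set of film grain information whose stated viewing
    parameters minimize an error function against the measured ones."""
    def error(env):
        # Sum of squared differences over parameters present in both dicts.
        return sum((measured[k] - env[k]) ** 2 for k in measured if k in env)
    return min(candidate_sets, key=lambda s: error(s["env"]))


# Example: two signaled environments, a dim room and a bright room.
grain_sets = [
    {"env": {"ambient": 5.0, "distance": 3.0}, "grain": {"sigma": 1.0}},
    {"env": {"ambient": 100.0, "distance": 1.5}, "grain": {"sigma": 0.4}},
]
chosen = select_grain_set({"ambient": 20.0, "distance": 2.8}, grain_sets)  # dim-room set
```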
6. The method of claim 5, wherein the viewing environment is determined by parameters comprising one or more of:
an ambient light value in the viewing environment;
a viewing distance from the target display in the viewing environment;
a maximum luminance value of the target display;
a number of pixels per inch of the target display;
a ratio of pixels per inch of the target display to the viewing distance from the target display;
an ambient chromaticity value in the viewing environment; and
the x and y primary colors of the target display.
7. The method of claim 5 or 6, wherein selecting the selected set of input film grain information is based on minimizing an error function between the measured viewing parameters and the corresponding parameters in the two or more sets of input film grain information.
8. A method of processing film grain metadata, the method comprising:
receiving an input video bitstream and accessing two or more sets of associated input film grain information (331), each set corresponding to a different target viewing environment, and each set including film grain parameters for generating film noise for a target display;
accessing measured viewing parameters of the target display (312);
interpolating (325) parameters from the two or more sets of input film grain information based on the measured viewing parameters to generate output film grain parameters;
generating output film noise based at least on the output film grain parameters;
decoding the input video bitstream to generate decoded video pictures; and
mixing (320) the output film noise with the decoded video pictures to generate output video pictures on the target display.
9. The method of claim 8, wherein generating output film grain parameters comprises:
selecting, based on the measured viewing parameters, from the two or more sets of input film grain information a first set and a second set having parameters closest to the measured viewing parameters; and
generating the output film grain parameters by applying an interpolation function to corresponding parameters in the first and second sets of input film grain information.
10. The method of claim 9, wherein the first and second sets of input film grain information are selected as the two of the two or more sets for which the distance measure between the measured viewing parameters and the corresponding parameters in the sets has the two smallest values.
11. The method of claim 10, wherein calculating a distance measure comprises calculating:
D(i) = Σ_{k=1..M} w_k * (P(k)_m - P(k)_r(i))²,
wherein w_k, k = 1, …, M, represent normalized weighting factors in [0, 1] for the M viewing environment parameters, M > 0, P(k)_m represents a measured value of the k-th viewing parameter, and P(k)_r(i) represents the corresponding viewing parameter value in the i-th set of film grain information.
12. The method of claim 11, wherein calculating an interpolation function for the P-th film grain parameter comprises calculating:
f(P) = (D2 * P(P_1) + D1 * P(P_2)) / (D1 + D2),
where D1 and D2 represent distortion values calculated for the selected first and second sets of input film grain information, and P(P_1) and P(P_2) represent the values of the P-th film grain parameter defined in the first and second sets of input film grain information.
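Illustrative sketch (not part of the claims): the formulas in claims 11 and 12 did not survive extraction intact, so the weighted squared-difference distance and inverse-distortion blend below are reconstructions consistent with the surrounding definitions, not authoritative forms; all names are hypothetical.

```python
def weighted_distance(measured, reference, weights):
    """Distance D(i) between measured viewing parameters and those stated
    in the i-th grain set (the squared-difference form is an assumption)."""
    return sum(w * (m - r) ** 2
               for w, m, r in zip(weights, measured, reference))


def blend_parameter(d1, d2, p1, p2):
    """Blend the P-th grain parameter from the two closest sets so that the
    set with the smaller distortion receives the larger weight."""
    if d1 + d2 == 0.0:
        return p1  # both sets match the measured environment exactly
    return (d2 * p1 + d1 * p2) / (d1 + d2)
```

Note that blend_parameter returns p1 unmodified when d1 = 0, i.e. a set that exactly matches the measured environment is used as-is.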
13. The method of any of claims 10-12, wherein the M viewing environment parameters comprise one or more of:
an ambient light value in the viewing environment;
a viewing distance from the target display in the viewing environment;
a maximum luminance value of the target display;
a number of pixels per inch of the target display; and
a ratio of pixels per inch of the target display to the viewing distance from the target display.
14. The method of any of claims 8-13, wherein only two sets of input film grain information are received, one set corresponding to a lower limit of the environmental viewing parameter and one set corresponding to an upper limit of the environmental viewing parameter, wherein calculating the interpolation function for the P-th film grain parameter comprises calculating
f(P) = δ * P(P_U) + (1 - δ) * P(P_L),
wherein
δ = (P_m - P_L) / (P_U - P_L),
where P_m represents a measured value of the environmental viewing parameter between P_L and P_U, P_L and P_U represent the lower and upper limits of the environmental viewing parameter, and P(P_L) and P(P_U) represent the corresponding film grain parameters in the two sets of input film grain information received.
15. A non-transitory computer-readable storage medium having stored thereon computer-executable instructions for performing the method of any of claims 1-14 with one or more processors.
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202163292654P | 2021-12-22 | 2021-12-22 | |
US63/292,654 | 2021-12-22 | | |
EP22152455.6 | 2022-01-20 | | |
PCT/US2022/053410 WO2023122039A1 (en) | 2021-12-22 | 2022-12-19 | Film grain parameters adaptation based on viewing environment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN118476231A (en) | 2024-08-09 |
Family
ID=79831047
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202280084924.7A (CN118476231A, pending) | Film grain parameter adaptation based on viewing environment | | 2022-12-19 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118476231A (en) |
Similar Documents
Publication | Title |
---|---|
US11800151B2 (en) | Signal reshaping and coding for HDR and wide color gamut signals |
JP7275345B2 (en) | Source color volume information messaging |
US10419762B2 (en) | Content-adaptive perceptual quantizer for high dynamic range images |
CN108885783B (en) | Encoding and decoding reversible production quality single layer video signals |
CN109068139B (en) | Method, apparatus and computer-readable storage medium for in-loop reshaping |
US10701359B2 (en) | Real-time content-adaptive perceptual quantizer for high dynamic range images |
US10382735B2 (en) | Targeted display color volume specification via color remapping information (CRI) messaging |
US11895416B2 (en) | Electro-optical transfer function conversion and signal legalization |
CN118476231A (en) | Film grain parameter adaptation based on viewing environment |
WO2023122039A1 (en) | Film grain parameters adaptation based on viewing environment |
KR102717738B1 (en) | Source color volume information messaging |
CN117121491A (en) | Metadata signaling and conversion for film grain coding |
KR20240154085A (en) | Source color volume information messaging |
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |