METHODS ON SCALING IN VIDEO CODING
Under the applicable patent law and/or rules pursuant to the Paris Convention, this application is made to timely claim the priority to and benefits of International Patent Application No. PCT/CN2019/086789, filed on May 14, 2019. The entire disclosures thereof are incorporated by reference as part of the disclosure of this application.
TECHNICAL FIELD
This patent document relates to video coding techniques, devices and systems.
BACKGROUND
In spite of the advances in video compression, digital video still accounts for the largest bandwidth use on the internet and other digital communication networks. As the number of connected user devices capable of receiving and displaying video increases, it is expected that the bandwidth demand for digital video usage will continue to grow.
SUMMARY
Devices, systems and methods related to digital video coding, and specifically, to scaling and division operations in video coding, are described. The described methods may be applied to both the existing video coding standards (e.g., High Efficiency Video Coding (HEVC)) and future video coding standards or video codecs.
In one representative aspect, the disclosed technology may be used to provide a method for video processing, comprising: performing a conversion between a current video block of a video and a coded representation of the video, the current video block comprising a luma component and at least one chroma component, wherein the luma component is converted from an original domain to a reshaped domain with a luma mapping with chroma scaling (LMCS) scheme, and chroma samples of the at least one chroma component are predicted based on reconstructed luma samples of the luma component in the original domain.
In yet another representative aspect, the above-described method is embodied in the form of processor-executable code and stored in a computer-readable program medium.
In yet another representative aspect, a device that is configured or operable to perform the above-described method is disclosed. The device may include a processor that is programmed to implement this method.
In yet another representative aspect, a video decoder apparatus may implement a method as described herein.
The above and other aspects and features of the disclosed technology are described in greater detail in the drawings, the description and the claims.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 shows a flowchart of an example of a decoding flow with reshaping.
FIG. 2 shows an example of sample locations used for the derivation of parameters in a cross-component linear model (CCLM) prediction mode.
FIG. 3 shows an example of neighboring samples used for deriving illumination compensation (IC) parameters.
FIG. 4 shows a flowchart of an example method for video processing.
FIG. 5 is a block diagram of an example of a hardware platform for implementing a visual media decoding or a visual media encoding technique described in the present document.
DETAILED DESCRIPTION
Embodiments of the disclosed technology may be applied to existing video coding standards (e.g., HEVC, H. 265) and future standards to improve compression performance. Section headings are used in the present document to improve readability of the description and do not in any way limit the discussion or the embodiments (and/or implementations) to the respective sections only.
2 Video coding introduction
Due to the increasing demand of higher resolution video, video coding methods and techniques are ubiquitous in modern technology. Video codecs typically include an electronic circuit or software that compresses or decompresses digital video, and are continually being improved to provide higher coding efficiency. A video codec converts uncompressed video to a compressed format or vice versa. There are complex relationships between the video quality, the amount of data used to represent the video (determined by the bit rate) , the complexity of the encoding and decoding algorithms, sensitivity to data losses and errors, ease of editing, random access, and end-to-end delay (latency) . The compressed format usually conforms to a standard video compression specification, e.g., the High Efficiency Video Coding (HEVC) standard (also known as H. 265 or MPEG-H Part 2) , the Versatile Video Coding standard to be finalized, or other current and/or future video coding standards.
Video coding standards have evolved primarily through the development of the well-known ITU-T and ISO/IEC standards. The ITU-T produced H.261 and H.263, ISO/IEC produced MPEG-1 and MPEG-4 Visual, and the two organizations jointly produced the H.262/MPEG-2 Video, H.264/MPEG-4 Advanced Video Coding (AVC) and H.265/HEVC standards. Since H.262, the video coding standards have been based on the hybrid video coding structure, wherein temporal prediction plus transform coding are utilized. To explore future video coding technologies beyond HEVC, the Joint Video Exploration Team (JVET) was founded jointly by VCEG and MPEG in 2015. Since then, many new methods have been adopted by JVET and put into the reference software named Joint Exploration Model (JEM). In April 2018, the Joint Video Expert Team (JVET) between VCEG (Q6/16) and ISO/IEC JTC1 SC29/WG11 (MPEG) was created to work on the VVC standard, targeting a 50% bitrate reduction compared to HEVC.
2.1 In-loop reshaping (ILR) in JVET-M0427
The basic idea of in-loop reshaping (ILR) is to convert the original (in the first domain) signal (prediction/reconstruction signal) to a second domain (reshaped domain) .
The in-loop luma reshaper is implemented as a pair of look-up tables (LUTs), but only one of the two LUTs needs to be signaled, as the other one can be computed from the signaled LUT. Each LUT is a one-dimensional, 10-bit, 1024-entry mapping table (1D-LUT). One LUT is a forward LUT, FwdLUT, that maps input luma code values $Y_i$ to altered values $Y_r$: $Y_r = \mathrm{FwdLUT}[Y_i]$. The other LUT is an inverse LUT, InvLUT, that maps altered code values $Y_r$ to $\hat{Y}_i$: $\hat{Y}_i = \mathrm{InvLUT}[Y_r]$ ($\hat{Y}_i$ represents the reconstruction values of $Y_i$).
ILR is also known as Luma Mapping with Chroma Scaling (LMCS) in VVC.
2.1.1 PWL model
Conceptually, piece-wise linear (PWL) is implemented in the following way:
Let x1, x2 be two input pivot points, and y1, y2 be their corresponding output pivot points for one piece. The output value y for any input value x between x1 and x2 can be interpolated by the following equation:
y = ( (y2-y1) / (x2-x1) ) * (x-x1) + y1
In fixed point implementation, the equation can be rewritten as:
y = ((m * x + $2^{FP\_PREC-1}$) >> FP_PREC) + c
Herein, m is a scalar, c is an offset, and FP_PREC is a constant value that specifies the precision.
Note that in CE-12 software, the PWL model is used to precompute the 1024-entry FwdLUT and InvLUT mapping tables; but the PWL model also allows implementations to calculate identical mapping values on-the-fly without pre-computing the LUTs.
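The fixed-point form above can be illustrated with a minimal sketch. The function name, the choice FP_PREC = 11, and the convention of measuring x relative to the pivot x1 (so that the offset c equals y1) are assumptions made for illustration only and are not taken from the CE-12 software.

```python
FP_PREC = 11  # assumed precision; the text leaves the value of FP_PREC open


def pwl_fixed_point(x, x1, x2, y1, y2, fp_prec=FP_PREC):
    """Evaluate one piece of a piece-wise linear model with integer arithmetic.

    Assumption: x is taken relative to the input pivot x1, so the offset c is y1.
    """
    # Slope scaled by 2**fp_prec; it would normally be computed once per piece.
    m = ((y2 - y1) << fp_prec) // (x2 - x1)
    # Rounding term 2**(fp_prec - 1), then shift back down to integer precision.
    return ((m * (x - x1) + (1 << (fp_prec - 1))) >> fp_prec) + y1


# A piece mapping inputs [0, 64] onto outputs [0, 96] (slope 1.5).
print(pwl_fixed_point(32, 0, 64, 0, 96))
```

Only a multiplication, an addition and a shift are needed per sample once m is known, which is why the same mapping values can be produced either from a precomputed LUT or on the fly.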
2.1.2 Test CE12-2
2.1.2.1 Luma reshaping
Test 2 of the in-loop luma reshaping (i.e., CE12-2 in the proposal) provides a lower complexity pipeline that also eliminates decoding latency for block-wise intra prediction in inter slice reconstruction.
Intra prediction is always performed in the reshaped domain, regardless of slice type (i.e., for both inter and intra slices). With such an arrangement, intra prediction can start immediately after the previous TU reconstruction is done. Such an arrangement can also provide a unified process for intra mode instead of being slice dependent. FIG. 1 shows the block diagram of the CE12-2 decoding process based on mode.
CE12-2 also tests 16-piece piece-wise linear (PWL) models for luma and chroma residue scaling instead of the 32-piece PWL models of CE12-1.
Inter slice reconstruction with in-loop luma reshaper in CE12-2 (lighter shaded blocks indicate signal in reshaped domain: luma residue; intra luma predicted; and intra luma reconstructed)
2.1.2.2 Luma-dependent chroma residue scaling
Luma-dependent chroma residue scaling is a multiplicative process implemented with fixed-point integer operation. Chroma residue scaling compensates for luma signal interaction with the chroma signal. Chroma residue scaling is applied at the TU level. More specifically, the average value of the corresponding luma prediction block is utilized.
The average is used to identify an index in a PWL model. The index identifies a scaling factor cScaleInv. The chroma residual is multiplied by that number.
It is noted that the chroma scaling factor is calculated from forward-mapped predicted luma values rather than reconstructed luma values.
2.1.2.3 Usage of ILR
At the encoder side, each picture (or tile group) is first converted to the reshaped domain, and the entire coding process is performed in the reshaped domain. For intra prediction, the neighboring block is in the reshaped domain; for inter prediction, the reference blocks (generated in the original domain from the decoded picture buffer) are first converted to the reshaped domain. Then the residuals are generated and coded into the bitstream.
After the whole picture (or tile group) finishes encoding/decoding, samples in the reshaped domain are converted to the original domain, then deblocking filter and other filters are applied.
Forward reshaping to the prediction signal is disabled for the following cases:
○ Current block is intra-coded
○ Current block is coded as CPR (current picture referencing, aka intra block copy, IBC)
○ Current block is coded as combined inter-intra mode (CIIP) and the forward reshaping is disabled for the intra prediction block
2.1.2.4 Syntax and semantics in VVC Working Draft 5
7.3.2.3 Sequence parameter set RBSP syntax
7.3.2.5 Adaptation parameter set syntax
7.3.5 Slice header syntax
7.3.5.1 General slice header syntax
7.3.5.4 Luma mapping with chroma scaling data syntax
LMCS APS: An APS that has aps_params_type equal to LMCS_APS.
sps_lmcs_enabled_flag equal to 1 specifies that luma mapping with chroma scaling is used in the CVS. sps_lmcs_enabled_flag equal to 0 specifies that luma mapping with chroma scaling is not used in the CVS.
7.4.3.5 Adaptation parameter set semantics
adaptation_parameter_set_id provides an identifier for the APS for reference by other syntax elements.
NOTE –APSs can be shared across pictures and can be different in different slices within a picture.
aps_params_type specifies the type of APS parameters carried in the APS as specified in Table 7-2.
Table 7-2–APS parameters type codes and types of APS parameters
slice_lmcs_enabled_flag equal to 1 specifies that luma mapping with chroma scaling is enabled for the current slice. slice_lmcs_enabled_flag equal to 0 specifies that luma mapping with chroma scaling is not enabled for the current slice. When slice_lmcs_enabled_flag is not present, it is inferred to be equal to 0.
slice_lmcs_aps_id specifies the adaptation_parameter_set_id of the LMCS APS that the slice refers to. The TemporalId of the LMCS APS NAL unit having adaptation_parameter_set_id equal to slice_lmcs_aps_id shall be less than or equal to the TemporalId of the coded slice NAL unit.
When multiple LMCS APSs with the same value of adaptation_parameter_set_id are referred to by two or more slices of the same picture, the multiple LMCS APSs with the same value of adaptation_parameter_set_id shall have the same content.
7.4.6.4 Luma mapping with chroma scaling data semantics
lmcs_min_bin_idx specifies the minimum bin index used in the luma mapping with chroma scaling construction process. The value of lmcs_min_bin_idx shall be in the range of 0 to 15, inclusive.
lmcs_delta_max_bin_idx specifies the delta value between 15 and the maximum bin index LmcsMaxBinIdx used in the luma mapping with chroma scaling construction process. The value of lmcs_delta_max_bin_idx shall be in the range of 0 to 15, inclusive. The value of LmcsMaxBinIdx is set equal to 15 -lmcs_delta_max_bin_idx. The value of LmcsMaxBinIdx shall be larger than or equal to lmcs_min_bin_idx.
lmcs_delta_cw_prec_minus1 plus 1 specifies the number of bits used for the representation of the syntax lmcs_delta_abs_cw [i] . The value of lmcs_delta_cw_prec_minus1 shall be in the range of 0 to BitDepthY -2, inclusive.
lmcs_delta_abs_cw [i] specifies the absolute delta codeword value for the ith bin.
lmcs_delta_sign_cw_flag [i] specifies the sign of the variable lmcsDeltaCW [i] as follows:
– If lmcs_delta_sign_cw_flag [i] is equal to 0, lmcsDeltaCW [i] is a positive value.
– Otherwise (lmcs_delta_sign_cw_flag [i] is not equal to 0) , lmcsDeltaCW [i] is a negative value.
When lmcs_delta_sign_cw_flag [i] is not present, it is inferred to be equal to 0.
The variable OrgCW is derived as follows:
OrgCW = (1 << BitDepthY) / 16 (7-77)
The variable lmcsDeltaCW [i] , with i = lmcs_min_bin_idx.. LmcsMaxBinIdx, is derived as follows:
lmcsDeltaCW [i] = (1 -2 *lmcs_delta_sign_cw_flag [i] ) *lmcs_delta_abs_cw [i] (7-78)
The variable lmcsCW [i] is derived as follows:
– For i = 0.. lmcs_min_bin_idx - 1, lmcsCW [i] is set equal to 0.
– For i = lmcs_min_bin_idx.. LmcsMaxBinIdx, the following applies:
lmcsCW [i] = OrgCW + lmcsDeltaCW [i] (7-79)
The value of lmcsCW [i] shall be in the range of (OrgCW>>3) to (OrgCW<<3 -1) , inclusive.
– For i = LmcsMaxBinIdx + 1.. 15, lmcsCW [i] is set equal to 0.
It is a requirement of bitstream conformance that the following condition is true:
The variable InputPivot [i] , with i = 0.. 16, is derived as follows:
InputPivot [i] = i *OrgCW (7-81)
The variable LmcsPivot [i] with i = 0.. 16, the variables ScaleCoeff [i] and InvScaleCoeff [i] with i = 0.. 15, are derived as follows:
The variable ChromaScaleCoeff [i] , with i = 0…15, is derived as follows:
chromaResidualScaleLut [] = {16384, 16384, 16384, 16384, 16384, 16384, 16384, 8192, 8192, 8192, 8192, 5461, 5461, 5461, 5461, 4096, 4096, 4096, 4096, 3277, 3277, 3277, 3277, 2731, 2731, 2731, 2731, 2341, 2341, 2341, 2048, 2048, 2048, 1820, 1820, 1820, 1638, 1638, 1638, 1638, 1489, 1489, 1489, 1489, 1365, 1365, 1365, 1365, 1260, 1260, 1260, 1260, 1170, 1170, 1170, 1170, 1092, 1092, 1092, 1092, 1024, 1024, 1024, 1024}
The variables ClipRange, LmcsMinVal, and LmcsMaxVal are derived as follows:
ClipRange = ((lmcs_min_bin_idx > 0) && (LmcsMaxBinIdx < 15)) (7-84)
LmcsMinVal = 16 << (BitDepthY - 8) (7-85)
LmcsMaxVal = 235 << (BitDepthY - 8) (7-86)
NOTE –Arrays InputPivot [i] and LmcsPivot [i] , ScaleCoeff [i] , and InvScaleCoeff [i] , ChromaScaleCoeff [i] , ClipRange, LmcsMinVal and LmcsMaxVal, are updated only when slice_lmcs_model_present_flag is equal to 1. Thus, the lmcs model may be sent with an IRAP picture, for example, but lmcs is disabled for that IRAP picture.
2.1.3 JVET-N0220
In JVET-N0220, four aspects are proposed:
1. reduction of the pipeline delay for computing average luma for chroma residue scaling
2. reduction of the size of the local buffer needed to store chroma residue samples
3. unification of the fixed-point precision used in luma mapping and chroma residue scaling
4. unification of the method of calculating chroma residue scale with the method of calculating the luma inverse scale, and removal of the pre-computed LUT.
Items 2 to 4 have been adopted into VTM-5.
With the adopted parts, the working draft is revised as below.
7.4.5.4 Luma mapping with chroma scaling data semantics
lmcs_delta_sign_cw_flag [i] specifies the sign of the variable lmcsDeltaCW [i] as follows:
– If lmcs_delta_sign_cw_flag [i] is equal to 0, lmcsDeltaCW [i] is a positive value.
– Otherwise (lmcs_delta_sign_cw_flag [i] is not equal to 0) , lmcsDeltaCW [i] is a negative value.
When lmcs_delta_sign_cw_flag [i] is not present, it is inferred to be equal to 0.
The variable OrgCW is derived as follows:
OrgCW = (1 << BitDepthY) / 16 (7-70)
The variable lmcsDeltaCW [i] , with i = lmcs_min_bin_idx.. LmcsMaxBinIdx, is derived as follows:
lmcsDeltaCW [i] = (1-2 *lmcs_delta_sign_cw_flag [i] ) *lmcs_delta_abs_cw [i] (7-71)
The variable lmcsCW [i] is derived as follows:
– For i = 0.. lmcs_min_bin_idx - 1, lmcsCW [i] is set equal to 0.
– For i = lmcs_min_bin_idx.. LmcsMaxBinIdx, the following applies:
lmcsCW [i] = OrgCW + lmcsDeltaCW [i] (7-72)
The value of lmcsCW [i] shall be in the range of (OrgCW>>3) to (OrgCW<<3 -1) , inclusive.
– For i = LmcsMaxBinIdx + 1.. 15, lmcsCW [i] is set equal to 0.
It is a requirement of bitstream conformance that the following condition is true:
The variable InputPivot [i] , with i = 0.. 16, is derived as follows:
InputPivot [i] = i *OrgCW (7-74)
The variable LmcsPivot [i] with i = 0.. 16, the variables ScaleCoeff [i] and InvScaleCoeff [i] with i = 0.. 15, are derived as follows:
LmcsPivot [0] = 0;
for (i = 0; i <= 15; i++) {
    LmcsPivot [i + 1] = LmcsPivot [i] + lmcsCW [i]
    ScaleCoeff [i] = (lmcsCW [i] * (1 << 11) + (1 << (Log2 (OrgCW) - 1))) >> (Log2 (OrgCW)) (7-75)
    if (lmcsCW [i] == 0)
        InvScaleCoeff [i] = 0
    else
        InvScaleCoeff [i] = OrgCW * (1 << 11) / lmcsCW [i]
}
The variable ChromaScaleCoeff [i], with i = 0.. 15, is derived as follows:
if (lmcsCW [i] == 0)
    ChromaScaleCoeff [i] = (1 << 11)
else {
    ChromaScaleCoeff [i] = InvScaleCoeff [i]
}
The variables ClipRange, LmcsMinVal, and LmcsMaxVal are derived as follows:
ClipRange = ((lmcs_min_bin_idx > 0) && (LmcsMaxBinIdx < 15)) (7-77)
LmcsMinVal = 16 << (BitDepthY - 8) (7-78)
LmcsMaxVal = 235 << (BitDepthY - 8)
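The derivation of the pivot and scale tables in the revised semantics above can be sketched as follows. The function name and the dictionary layout of the result are hypothetical, a 10-bit luma bit depth is assumed by default, and the integer division for InvScaleCoeff is kept exactly as in the draft text.

```python
def derive_lmcs_tables(lmcs_cw, bit_depth_y=10):
    """Derive LMCS tables from the 16 codeword counts lmcsCW[0..15]."""
    org_cw = (1 << bit_depth_y) // 16          # OrgCW
    log2_org_cw = org_cw.bit_length() - 1      # Log2(OrgCW); OrgCW is a power of two
    input_pivot = [i * org_cw for i in range(17)]
    lmcs_pivot = [0] * 17
    scale = [0] * 16
    inv_scale = [0] * 16
    chroma_scale = [0] * 16
    for i in range(16):
        lmcs_pivot[i + 1] = lmcs_pivot[i] + lmcs_cw[i]
        # Forward scale with 11 fractional bits and rounding.
        scale[i] = (lmcs_cw[i] * (1 << 11) + (1 << (log2_org_cw - 1))) >> log2_org_cw
        if lmcs_cw[i] == 0:
            inv_scale[i] = 0
            chroma_scale[i] = 1 << 11          # unity scale for unused bins
        else:
            # This is the division that the present document proposes to replace.
            inv_scale[i] = org_cw * (1 << 11) // lmcs_cw[i]
            chroma_scale[i] = inv_scale[i]
    return {
        "InputPivot": input_pivot,
        "LmcsPivot": lmcs_pivot,
        "ScaleCoeff": scale,
        "InvScaleCoeff": inv_scale,
        "ChromaScaleCoeff": chroma_scale,
    }


# An identity model: every bin keeps OrgCW = 64 codewords.
tables = derive_lmcs_tables([64] * 16)
print(tables["ScaleCoeff"][0], tables["InvScaleCoeff"][0])
```

Note that the only true division in the loop is the one producing InvScaleCoeff; everything else is a multiplication, addition or shift.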
8.5.7.6 Weighted sample prediction process for combined merge and intra prediction
…
8.7.4.2 Picture reconstruction with mapping process for luma samples
…
idxY = predSamples [i] [j] >> Log2 (OrgCW)
PredMapSamples [i] [j] = LmcsPivot [idxY]+ (ScaleCoeff [idxY] * (predSamples [i] [j] -InputPivot [idxY] ) + (1 << 10 ) ) >> 11 (8-1058)
with i = 0.. nCurrSw -1, j = 0.. nCurrSh –1
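As an illustration of the forward mapping in (8-1058), the sketch below maps single luma prediction samples through a toy model. The codeword allocation LMCS_CW is invented for the example, a 10-bit bit depth is assumed, and the grouping of the expression (pivot plus the rounded, right-shifted product) is an assumed reading of the equation rather than something the text states explicitly.

```python
BIT_DEPTH_Y = 10
ORG_CW = (1 << BIT_DEPTH_Y) // 16              # 64 codewords per input bin
LOG2_ORG_CW = ORG_CW.bit_length() - 1

# Hypothetical model: the lower half of the range is expanded, the upper half compressed.
LMCS_CW = [80] * 8 + [40] * 8

INPUT_PIVOT = [i * ORG_CW for i in range(17)]
LMCS_PIVOT = [0] * 17
SCALE_COEFF = [0] * 16
for i in range(16):
    LMCS_PIVOT[i + 1] = LMCS_PIVOT[i] + LMCS_CW[i]
    SCALE_COEFF[i] = (LMCS_CW[i] * (1 << 11) + (1 << (LOG2_ORG_CW - 1))) >> LOG2_ORG_CW


def forward_map_luma(pred_sample):
    """Map one luma prediction sample from the original to the reshaped domain."""
    idx_y = pred_sample >> LOG2_ORG_CW         # bin index of the input sample
    delta = pred_sample - INPUT_PIVOT[idx_y]   # distance from the bin's input pivot
    # Assumed grouping: only the scaled delta is rounded and shifted by 11 bits.
    return LMCS_PIVOT[idx_y] + ((SCALE_COEFF[idx_y] * delta + (1 << 10)) >> 11)


print(forward_map_luma(100), forward_map_luma(600))
```

Because the bin index is obtained with a shift by Log2 (OrgCW), the forward direction needs no search and no division.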
8.7.4.3.1 Picture inverse mapping process of luma samples
…
The variable invSample is derived as follows:
8.7.4.4 Picture reconstruction with luma dependent chroma residual scaling process for chroma samples
…
The variable idxYInv is derived by invoking the identification of piece-wise function index as specified in clause 8.7.5.3.2 with invAvgLuma as the input and idxYInv as the output.
1. The variable varScale is derived as follows:
varScale = ChromaScaleCoeff [idxYInv] (8-1065)
– The recSamples is derived as follows:
– If tu_cbf_cIdx [xCurr] [yCurr] is equal to 1, the following applies:
resSamples [i] [j] = Clip3 (- (1 << BitDepthC), 1 << BitDepthC - 1, resSamples [i] [j])
– Otherwise (tu_cbf_cIdx [xCurr] [yCurr] is equal to 0), the following applies:
recSamples [xCurr + i] [yCurr + j] = ClipCidx1 (predSamples [i] [j]) (8-1067)
2.2 Adaptation Parameter Set (APS)
An Adaptation Parameter Set (APS) is adopted in VVC to carry ALF parameters. The tile group header contains an aps_id which is conditionally present when ALF is enabled. The APS contains an aps_id and the ALF parameters. A new NUT (NAL unit type, as in AVC and HEVC) value is assigned for APS (from JVET-M0132). For the common test conditions in VTM-4.0 (to appear), it is suggested to use aps_id = 0 and to send the APS with each picture. For now, the range of APS ID values will be 0.. 31, and APSs can be shared across pictures (and can be different in different tile groups within a picture). The ID value should be fixed-length coded when present. ID values cannot be re-used with different content within the same picture.
7.3.5.3 Adaptive loop filter data syntax
7.3.7.2 Coding tree unit syntax
slice_alf_enabled_flag equal to 1 specifies that adaptive loop filter is enabled and may be applied to Y, Cb, or Cr colour component in a slice. slice_alf_enabled_flag equal to 0 specifies that adaptive loop filter is disabled for all colour components in a slice.
num_alf_aps_ids_minus1 plus 1 specifies the number of ALF APSs that the slice refers to. The value of num_alf_aps_ids_minus1 shall be in the range of 0 to 7, inclusive.
slice_alf_aps_id [i] specifies the adaptation_parameter_set_id of the i-th ALF APS that the slice refers to. The TemporalId of the ALF APS NAL unit having adaptation_parameter_set_id equal to slice_alf_aps_id [i] shall be less than or equal to the TemporalId of the coded slice NAL unit.
When multiple ALF APSs with the same value of adaptation_parameter_set_id are referred to by two or more slices of the same picture, the multiple ALF APSs with the same value of adaptation_parameter_set_id shall have the same content.
alf_ctb_flag [cIdx] [xCtb >> Log2CtbSize] [yCtb >> Log2CtbSize] equal to 1 specifies that the adaptive loop filter is applied to the coding tree block of the colour component indicated by cIdx of the coding tree unit at luma location (xCtb, yCtb). alf_ctb_flag [cIdx] [xCtb >> Log2CtbSize] [yCtb >> Log2CtbSize] equal to 0 specifies that the adaptive loop filter is not applied to the coding tree block of the colour component indicated by cIdx of the coding tree unit at luma location (xCtb, yCtb).
When alf_ctb_flag [cIdx] [xCtb >> Log2CtbSize] [yCtb >> Log2CtbSize] is not present, it is inferred to be equal to 0.
2.3 Cross-Component Linear Model (CCLM) intra prediction in VVC
To reduce the cross-component redundancy, a cross-component linear model (CCLM) prediction mode is used in the VTM4, for which the chroma samples are predicted based on the reconstructed luma samples of the same CU by using a linear model as follows:
$$\mathrm{pred}_C(i,j) = \alpha \cdot \mathrm{rec}_L'(i,j) + \beta \qquad (3\text{-}1)$$
Herein, $\mathrm{pred}_C(i,j)$ represents the predicted chroma samples in a CU and $\mathrm{rec}_L'(i,j)$ represents the downsampled reconstructed luma samples of the same CU. The linear model parameters α and β are derived from the relation between the luma values and chroma values of two samples, namely the luma samples with the minimum and the maximum sample value inside the set of downsampled neighboring luma samples, and their corresponding chroma samples. The linear model parameters α and β are obtained according to the following equations.
$$\alpha = \frac{Y_a - Y_b}{X_a - X_b} \qquad (3\text{-}2)$$
$$\beta = Y_b - \alpha \cdot X_b \qquad (3\text{-}3)$$
Herein, $X_a$ and $Y_a$ represent the luma value and the chroma value of the luma sample with the maximum luma sample value, and $X_b$ and $Y_b$ represent the luma value and the chroma value of the luma sample with the minimum luma sample value, respectively. FIG. 2 shows an example of the location of the left and above samples and the sample of the current block involved in the CCLM mode.
The division operation to calculate the parameter α is implemented with a look-up table. To reduce the memory required for storing the table, the diff value (difference between maximum and minimum values) and the parameter α are expressed by an exponential notation. For example, diff is approximated with a 4-bit significant part and an exponent. Consequently, the table for 1/diff is reduced into 16 elements for 16 values of the significand as follows:
DivTable [] = {0, 7, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 1, 1, 0 } .
This has the benefit of reducing both the complexity of the calculation and the memory size required for storing the needed tables.
Besides being used together to calculate the linear model coefficients, the above template and the left template can also be used alternatively in the other two LM modes, called LM_A and LM_L.
In LM_A mode, only the above template is used to calculate the linear model coefficients. To get more samples, the above template is extended to (W+H). In LM_L mode, only the left template is used to calculate the linear model coefficients. To get more samples, the left template is extended to (H+W).
For a non-square block, the above template is extended to W+W, and the left template is extended to H+H.
To match the chroma sample locations for 4:2:0 video sequences, two types of downsampling filter are applied to luma samples to achieve a 2-to-1 downsampling ratio in both horizontal and vertical directions. The selection of the downsampling filter is specified by an SPS-level flag. The two downsampling filters are as follows, and correspond to “type-0” and “type-2” content, respectively.
Note that only one luma line (general line buffer in intra prediction) is used to make the downsampled luma samples when the upper reference line is at the CTU boundary.
This parameter computation is performed as part of the decoding process, not just as an encoder search operation. As a result, no syntax is used to convey the α and β values to the decoder.
For chroma intra mode coding, a total of 8 intra modes are allowed. Those modes include five traditional intra modes and three cross-component linear model modes (CCLM, LM_A, and LM_L). The chroma mode signaling and derivation process are shown in the table below. Chroma mode coding directly depends on the intra prediction mode of the corresponding luma block. Since a separate block partitioning structure for luma and chroma components is enabled in I slices, one chroma block may correspond to multiple luma blocks. Therefore, for Chroma DM mode, the intra prediction mode of the corresponding luma block covering the center position of the current chroma block is directly inherited.
The decoding process specified in JVET-N1001-v2 is demonstrated as below.
8.4.4.2.8 Specification of INTRA_LT_CCLM, INTRA_L_CCLM and INTRA_T_CCLM intra prediction mode
…
8. The variables a, b, and k are derived as follows:
– If numSampL is equal to 0, and numSampT is equal to 0, the following applies:
k = 0 (8-208)
a = 0 (8-209)
b = 1 << (BitDepthC - 1) (8-210)
– Otherwise, the following applies:
diff = maxY - minY (8-211)
– If diff is not equal to 0, the following applies:
diffC = maxC - minC (8-212)
x = Floor (Log2 (diff)) (8-213)
normDiff = ((diff << 4) >> x) & 15 (8-214)
x += (normDiff != 0) ? 1 : 0 (8-215)
y = Floor (Log2 (Abs (diffC))) + 1 (8-216)
a = (diffC * (divSigTable [normDiff] | 8) + $2^{y-1}$) >> y (8-217)
k = ((3 + x - y) < 1) ? 1 : 3 + x - y (8-218)
a = ((3 + x - y) < 1) ? Sign (a) * 15 : a (8-219)
b = minC - ((a * minY) >> k) (8-220)
where divSigTable [] is specified as follows:
divSigTable [] = {0, 7, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 1, 1, 0 } (8-221)
–Otherwise (diff is equal to 0) , the following applies:
k = 0 (8-222)
a = 0 (8-223)
b = minC (8-224)
9. The prediction samples predSamples [x] [y] with x = 0.. nTbW -1, y = 0.. nTbH -1 are derived as follows:
predSamples [x] [y] = Clip1C ( ( (pDsY [x] [y] *a) >> k) + b) (8-225)
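Steps 8 and 9 above can be sketched as follows for the case in which neighbouring samples are available. The function names are hypothetical, a 10-bit chroma bit depth is assumed for Clip1C, and the handling of diffC equal to 0 (taking y = 0, since Log2 (0) is not defined by the quoted text) is an assumption of this sketch.

```python
DIV_SIG_TABLE = [0, 7, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 1, 1, 0]


def sign(v):
    return (v > 0) - (v < 0)


def derive_cclm_params(min_y, max_y, min_c, max_c):
    """Return (a, b, k) as in (8-211) to (8-224); min_c/max_c belong to min_y/max_y."""
    diff = max_y - min_y
    if diff == 0:
        return 0, min_c, 0
    diff_c = max_c - min_c
    x = diff.bit_length() - 1                  # Floor(Log2(diff))
    norm_diff = ((diff << 4) >> x) & 15        # 4-bit significand of diff
    x += 1 if norm_diff != 0 else 0
    # Floor(Log2(Abs(diffC))) + 1; evaluates to 0 when diffC is 0 (assumption).
    y = abs(diff_c).bit_length()
    rounding = (1 << y) >> 1                   # 2**(y - 1), or 0 when y is 0
    a = (diff_c * (DIV_SIG_TABLE[norm_diff] | 8) + rounding) >> y
    shift = 3 + x - y
    k = 1 if shift < 1 else shift
    if shift < 1:
        a = sign(a) * 15
    b = min_c - ((a * min_y) >> k)
    return a, b, k


def predict_chroma(ds_luma, a, b, k, bit_depth_c=10):
    """Apply (8-225) to one downsampled luma sample, with Clip1C for the given bit depth."""
    value = ((ds_luma * a) >> k) + b
    return max(0, min((1 << bit_depth_c) - 1, value))


# Luma range 100..228 paired with chroma range 50..114, i.e. a slope of about 0.5.
a, b, k = derive_cclm_params(100, 228, 50, 114)
print(a, b, k, predict_chroma(164, a, b, k))
```

The slope is represented as a / 2**k, so the table lookup plus shifts take the place of the division by diff.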
2.4 MV Scaling in VVC
MV scaling is applied for Temporal Motion Vector Prediction (TMVP) and AMVP.
In JVET-N1001, MV scaling is described as:
8.5.2.12 Derivation process for collocated motion vectors
…
– Otherwise, mvLXCol is derived as a scaled version of the motion vector mvCol as follows:
tx = (16384 + (Abs (td) >> 1) ) /td (8-421)
distScaleFactor = Clip3 (-4096, 4095, (tb *tx + 32) >> 6) (8-422)
mvLXCol = Clip3 (-131072, 131071, Sign (distScaleFactor *mvCol) * ( (Abs (distScaleFactor *mvCol) + 127) >> 8) ) (8-423)
where td and tb are derived as follows:
td = Clip3 (-128, 127, colPocDiff) (8-424)
tb = Clip3 (-128, 127, currPocDiff) (8-425)
The division operator to derive tx may be implemented by a lookup table. The lookup table MV_SCALE_T [poc_diff_idx] may have a size of 128, with MV_SCALE_T [poc_diff_idx] = (16384 + (Abs (poc_diff_idx) >> 1) ) /poc_diff_idx, for poc_diff_idx from 1 to 128.
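A minimal sketch of the table-based MV scaling described above is given below. The function names are hypothetical, and applying the table to negative td by indexing with Abs (td) and negating the result is an assumption of this sketch (it matches a division that truncates toward zero); the text only defines the table for indices 1 to 128.

```python
def clip3(lo, hi, v):
    return max(lo, min(hi, v))


# Entry 0 is an unused placeholder so that the table can be indexed by 1..128.
MV_SCALE_T = [0] + [(16384 + (d >> 1)) // d for d in range(1, 129)]


def tx_from_table(td):
    """Table-based replacement for tx = (16384 + (Abs(td) >> 1)) / td."""
    if td == 0:
        raise ValueError("td must be non-zero")
    t = MV_SCALE_T[abs(td)]
    return t if td > 0 else -t


def scale_mv(mv_col, col_poc_diff, curr_poc_diff):
    """Scale one motion vector component following (8-421) to (8-425)."""
    td = clip3(-128, 127, col_poc_diff)
    tb = clip3(-128, 127, curr_poc_diff)
    tx = tx_from_table(td)
    dist_scale_factor = clip3(-4096, 4095, (tb * tx + 32) >> 6)
    prod = dist_scale_factor * mv_col
    sgn = (prod > 0) - (prod < 0)
    return clip3(-131072, 131071, sgn * ((abs(prod) + 127) >> 8))


# Halving the POC distance halves the motion vector.
print(scale_mv(100, 4, 2))
```

Python's `>>` on negative integers is an arithmetic shift, which is the behaviour assumed for the `>>` operator in the quoted equations.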
2.5 Localized Illumination Compensation
Local Illumination Compensation (LIC) is based on a linear model for illumination changes, using a scaling factor a and an offset b. And it is enabled or disabled adaptively for each inter-mode coded coding unit (CU) .
When LIC applies for a CU, a least square error method is employed to derive the parameters a and b by using the neighbouring samples of the current CU and their corresponding reference samples (also known as the reference neighbouring samples) . More specifically, as illustrated in FIG. 3, the subsampled (2: 1 subsampling) neighbouring samples of the CU and the corresponding samples (identified by motion information of the current CU or sub-CU) in the reference picture are used. The IC parameters are derived and applied for each prediction direction separately.
When a CU is coded with merge mode, the LIC flag is copied from neighbouring blocks, in a way similar to motion information copy in merge mode; otherwise, an LIC flag is signalled for the CU to indicate whether LIC applies or not.
When LIC is enabled for a picture, an additional CU-level RD check is needed to determine whether LIC is applied or not for a CU. When LIC is enabled for a CU, the mean-removed sum of absolute difference (MR-SAD) and the mean-removed sum of absolute Hadamard-transformed difference (MR-SATD) are used, instead of SAD and SATD, for integer pel motion search and fractional pel motion search, respectively.
To reduce the encoding complexity, the following encoding scheme is applied in the JEM.
LIC is disabled for the entire picture when there is no obvious illumination change between a current picture and its reference pictures. To identify this situation, histograms of a current picture and every reference picture of the current picture are calculated at the encoder. If the histogram difference between the current picture and every reference picture of the current picture is smaller than a given threshold, LIC is disabled for the current picture; otherwise, LIC is enabled for the current picture.
3 Drawbacks of existing implementations
The current design of ILR (aka LMCS) and CCLM may have the following problems:
(1) There is a division operation in the process of LMCS, which is undesirable in a hardware design.
(2) In CCLM mode, the chroma samples are in the original domain but the luma samples are in the reshaped domain with LMCS, which may be inefficient.
(3) The signaling method of the LMCS model may be inefficient.
4 Example methods for lossless coding for visual media coding
Embodiments of the presently disclosed technology overcome the drawbacks of existing implementations, thereby providing video coding with higher coding efficiencies. The methods for scaling and division operations in video coding, based on the disclosed technology, which may enhance both existing and future video coding standards, are elucidated in the following examples described for various implementations. The examples of the disclosed technology provided below explain general concepts, and are not meant to be interpreted as limiting. In an example, unless explicitly indicated to the contrary, the various features described in these examples may be combined.
In the following discussion, SatShift (x, n) is defined as
Shift (x, n) is defined as Shift (x, n) = (x+ offset0) >>n.
In one example, offset0 and/or offset1 are set to (1<<n) >>1 or (1<< (n-1) ) . In another example, offset0 and/or offset1 are set to 0.
In another example, offset0=offset1= ( (1<<n) >>1) -1 or ( (1<< (n-1) ) ) -1.
Clip3 (min, max, x) is defined as min if x < min, as max if x > max, and as x otherwise.
Floor (x) is defined as the largest integer less than or equal to x.
Ceil (x) is defined as the smallest integer greater than or equal to x.
Log2 (x) is defined as the base-2 logarithm of x.
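The helper operations that have a complete definition above can be sketched as follows. The function names are hypothetical; the default rounding offset (1 << n) >> 1 is one of the example choices listed for offset0, and clip3 uses the conventional clamping behaviour of video coding specifications, which is an assumption here.

```python
def shift(x, n, offset0=None):
    """Shift(x, n) = (x + offset0) >> n, with offset0 defaulting to (1 << n) >> 1."""
    if offset0 is None:
        offset0 = (1 << n) >> 1
    return (x + offset0) >> n


def clip3(lo, hi, x):
    """Clamp x to the range [lo, hi] (conventional Clip3 behaviour, assumed here)."""
    return lo if x < lo else hi if x > hi else x


def floor_log2(x):
    """Floor(Log2(x)) for a positive integer x, without floating point."""
    return x.bit_length() - 1


print(shift(5, 1), shift(5, 1, 0), clip3(0, 255, 300), floor_log2(96))
```

With offset0 set to 0 the shift truncates toward negative infinity, while the default offset rounds to the nearest integer.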
Division replacement in LMCS
1. It is proposed that a division operation in the video/image coding/decoding process, such as InvScaleCoeff [i] = OrgCW * (1 << 11) /lmcsCW [i] in the LMCS approach, may be replaced or approximated by an operation or a procedure of multiple operations.
a. In one example, the operation or the procedure of multiple operations may comprise the operation of inquiring an entry of a table with an index.
i. In an alternative example, it may comprise operations of inquiring multiple entries of one or multiple tables with an index.
b. In one example, the operation or the procedure of multiple operations may comprise an operation which is not the division operation.
i. In one example, it may comprise the operation of multiplication.
ii. In one example, it may comprise the operation of addition.
iii. In one example, it may comprise the operation of SatShift (x, n) .
iv. In one example, it may comprise the operation of Shift (x, n) .
v. In one example, it may comprise the operation of left shift.
vi. In one example, it may comprise the operation of Floor (x) .
vii. In one example, it may comprise the operation of Log2 (x) .
viii. In one example, it may comprise the operation of bitwise OR (| in C language) .
ix. In one example, it may comprise the operation of bitwise AND (& in C language) .
2. In one example, a table denoted as T [idx] may be used to replace or approximate the division operation in bullet 1.
a. In one example, the table size may be equal to 2^M and idx may be in a range of [0, 2^M-1] , inclusively.
b. In one example, T [idx] = Rounding (2^P/ (idx+offset0) ) -offset1, where offset0 and offset1 are integers and P is an integer, e.g., P=8 or P=9 or P=10 or P=11 or P=12 or P=13 or P=14.
i. In one example, Rounding (x/y) = Floor ( (x+y/2) /y) ;
ii. In one example, Rounding (x/y) is set equal to an integer Q, so that |Q*y-x|<=|Q’*y-x| for any integers Q’ in a set. For example, the set may be {Floor (x/y) -1, Floor (x/y) , Floor (x/y) +1} .
iii. In one example, offset0 may be equal to 2^W, e.g., W=0 or W=1 or W=4, or W=5, or W=6, or W=7.
1) Alternatively, offset0 may be 0.
iv. In one example, offset1 may be equal to 2^Z, e.g., Z=3, or Z=4, or Z=5, or Z=6, or Z=7.
1) Alternatively, offset1 may be 0.
c. In one example, M defined in bullet 2. a may be equal to W defined in bullet 2. b.
d. In one example, Z may be equal to P-W-1, where Z, P and W are defined in bullet 2. b.
e. In one example, T [idx] should be smaller than 2^Z.
i. In one example, if T [idx] is equal to 2^Z, it may be set equal to 0.
ii. For example, T [idx] = (Rounding (2^P/ (idx+offset0) ) -offset1) %offset1.
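The table construction in bullet 2 can be sketched as below. This is a hedged illustration under one particular combination of the listed options: offset0 = 2^W, offset1 = 2^Z with Z = P-W-1, M = W, Rounding (x/y) = Floor ( (x+y/2) /y) , and the modulo form of bullet 2. e. ii. The function names are hypothetical.

```python
def rounding_div(x, y):
    """Floor((x + y/2) / y), evaluated exactly with integer arithmetic."""
    return (2 * x + y) // (2 * y)


def build_div_table(W, P):
    """Build T[idx] = (Rounding(2^P / (idx + 2^W)) - 2^Z) % 2^Z for idx in [0, 2^W - 1]."""
    Z = P - W - 1          # assumed relation Z = P - W - 1 (bullet 2.d)
    offset0 = 1 << W       # assumed offset0 = 2^W
    offset1 = 1 << Z       # assumed offset1 = 2^Z
    size = 1 << W          # assumed M = W, so the table has 2^M entries
    return [(rounding_div(1 << P, idx + offset0) - offset1) % offset1
            for idx in range(size)]


if __name__ == "__main__":
    # First entries for W = M = 6, P = 13.
    print(build_div_table(6, 13)[:8])  # [0, 62, 60, 58, 56, 55, 53, 51]
```

Each entry stores only the Z low-order bits of the rounded reciprocal; the implicit leading bit is restored when the entry is used.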
3. Some examples of tables are demonstrated as below.
a. W=M=7, P=15,
T= {0, 126, 124, 122, 120, 118, 117, 115, 113, 111, 109, 108, 106, 104, 103, 101, 100, 98, 96, 95, 93, 92, 90, 89, 88, 86, 85, 83, 82, 81, 79, 78, 77, 76, 74, 73, 72, 71, 69, 68, 67, 66, 65, 64, 63, 61, 60, 59, 58, 57, 56, 55, 54, 53, 52, 51, 50, 49, 48, 47, 46, 45, 44, 44, 43, 42, 41, 40, 39, 38, 37, 37, 36, 35, 34, 33, 33, 32, 31, 30, 30, 29, 28, 27, 27, 26, 25, 24, 24, 23, 22, 22, 21, 20, 20, 19, 18, 18, 17, 16, 16, 15, 14, 14, 13, 13, 12, 11, 11, 10, 10, 9, 9, 8, 7, 7, 6, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1} .
b. W=M=7, P=14,
T= {0, 63, 62, 61, 60, 59, 58, 57, 56, 56, 55, 54, 53, 52, 51, 51, 50, 49, 48, 47, 47, 46, 45, 45, 44, 43, 42, 42, 41, 40, 40, 39, 38, 38, 37, 37, 36, 35, 35, 34, 34, 33, 32, 32, 31, 31, 30, 30, 29, 29, 28, 28, 27, 27, 26, 26, 25, 25, 24, 24, 23, 23, 22, 22, 21, 21, 20, 20, 20, 19, 19, 18, 18, 18, 17, 17, 16, 16, 16, 15, 15, 14, 14, 14, 13, 13, 13, 12, 12, 12, 11, 11, 10, 10, 10, 9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 6, 6, 6, 5, 5, 5, 5, 4, 4, 4, 3, 3, 3, 3, 2, 2, 2, 2, 1, 1, 1, 1, 0} .
c. W=M=7, P=13,
T= {0, 0, 31, 31, 30, 30, 29, 29, 28, 28, 27, 27, 27, 26, 26, 25, 25, 24, 24, 24, 23, 23, 23, 22, 22, 22, 21, 21, 21, 20, 20, 20, 19, 19, 19, 18, 18, 18, 17, 17, 17, 16, 16, 16, 16, 15, 15, 15, 15, 14, 14, 14, 14, 13, 13, 13, 13, 12, 12, 12, 12, 11, 11, 11, 11, 10, 10, 10, 10, 10, 9, 9, 9, 9, 9, 8, 8, 8, 8, 8, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6, 5, 5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0} .
d. W=M=6, P=14,
T= {0, 124, 120, 117, 113, 109, 106, 103, 100, 96, 93, 90, 88, 85, 82, 79, 77, 74, 72, 69, 67, 65, 63, 60, 58, 56, 54, 52, 50, 48, 46, 44, 43, 41, 39, 37, 36, 34, 33, 31, 30, 28, 27, 25, 24, 22, 21, 20, 18, 17, 16, 14, 13, 12, 11, 10, 9, 7, 6, 5, 4, 3, 2, 1} .
e. W=M=6, P=13,
T= {0, 62, 60, 58, 56, 55, 53, 51, 50, 48, 47, 45, 44, 42, 41, 40, 38, 37, 36, 35, 34, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 20, 19, 18, 17, 16, 16, 15, 14, 13, 13, 12, 11, 10, 10, 9, 8, 8, 7, 7, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1} .
f. W=M=6, P=12,
T= {0, 31, 30, 29, 28, 27, 27, 26, 25, 24, 23, 23, 22, 21, 21, 20, 19, 19, 18, 17, 17, 16, 16, 15, 15, 14, 14, 13, 13, 12, 12, 11, 11, 10, 10, 9, 9, 9, 8, 8, 7, 7, 7, 6, 6, 6, 5, 5, 5, 4, 4, 4, 3, 3, 3, 2, 2, 2, 2, 1, 1, 1, 1, 0} .
g. W=M=6, P=11,
T= {0, 0, 15, 15, 14, 14, 13, 13, 12, 12, 12, 11, 11, 11, 10, 10, 10, 9, 9, 9, 8, 8, 8, 8, 7, 7, 7, 7, 6, 6, 6, 6, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0} .
h. W=M=6, P=10,
T= { 0, 0, 0, 7, 7, 7, 7, 6, 6, 6, 6, 6, 5, 5, 5, 5, 5, 5, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0} .
i. W=M=5, P=13,
T= {0, 120, 113, 106, 100, 93, 88, 82, 77, 72, 67, 63, 58, 54, 50, 46, 43, 39, 36, 33, 30, 27, 24, 21, 18, 16, 13, 11, 9, 6, 4, 2} .
j. W=M=5, P=12,
T= {0, 60, 56, 53, 50, 47, 44, 41, 38, 36, 34, 31, 29, 27, 25, 23, 21, 20, 18, 16, 15, 13, 12, 10, 9, 8, 7, 5, 4, 3, 2, 1} .
k. W=M=5, P=11,
T= {0, 30, 28, 27, 25, 23, 22, 21, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 7, 6, 5, 5, 4, 3, 3, 2, 2, 1, 1} .
l. W=M=5, P=10,
T= {0, 15, 14, 13, 12, 12, 11, 10, 10, 9, 8, 8, 7, 7, 6, 6, 5, 5, 4, 4, 4, 3, 3, 3, 2, 2, 2, 1, 1, 1, 1, 0} .
m. W=M=5, P=9,
T= {0, 0, 7, 7, 6, 6, 5, 5, 5, 4, 4, 4, 4, 3, 3, 3, 3, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0} .
n. W=M=4, P=10,
T= {0, 28, 25, 22, 19, 17, 15, 13, 11, 9, 7, 6, 5, 3, 2, 1} .
o. W=M=4, P=9,
T= {0, 14, 12, 11, 10, 8, 7, 6, 5, 4, 4, 3, 2, 2, 1, 1} .
p. offset0= 0, M=8, P=14
T[k] = (2^14+k/2) /k, for k from 1 to 256;
q. offset0= 0, M=8, P=11
T[k] = (2^11+k/2) /k, for k from 1 to 256;
r. offset0= 0, M=9, P=14
T[k] = (2^14+k/2) /k, for k from 1 to 512;
s. offset0= 0, M=9, P=11
T[k] = (2^11+k/2) /k, for k from 1 to 512;
t. offset0= 0, M=10, P=14
T[k] = (2^14+k/2) /k, for k from 1 to 1024;
u. offset0= 0, M=10, P=11
T[k] = (2^11+k/2) /k, for k from 1 to 1024;
v. T = {0, 248, 240, 233, 226, 219, 212, 206, 199, 193, 187, 181, 175, 170, 164, 159, 154, 149, 144, 139, 134, 130, 125, 121, 116, 112, 108, 104, 100, 96, 93, 89, 85, 82, 78, 75, 72, 68, 65, 62, 59, 56, 53, 50, 47, 45, 42, 39, 37, 34, 31, 29, 26, 24, 22, 19, 17, 15, 13, 10, 8, 6, 4, 2} .
4. When T [idx] is used in the procedure of multiple operations to replace or approximate the division operation in bullet 2, idx may be derived with a procedure of multiple operations.
a. In one example, the procedure of multiple operations may comprise addition, Log2 (x) , Floor (x) , left shift, Shift (x, n) or SatShift (x, n) , bitwise OR and bitwise AND.
b. In one example, the procedure of multiple operations may depend on M or W defined in bullet 2.
c. In one example, idx may be set equal to Shift (D<<M, Floor (Log2 (D) ) ) & (2^M-1) , where D is the denominator, such as lmcsCW [i] in the LMCS process.
i. idx may be further modified by adding an offset as idx = (Shift (D<<M, Floor (Log2 (D) ) ) & (2^M-1) ) + Off.
1) The modified idx may be clipped to the range of [0, 2^M-1] , inclusively.
2) For example, idx = Min (2^M-1, idx0 + Off) , where idx0 = Shift (D<<M, Floor (Log2 (D) ) ) & (2^M-1) and Off= Shift (D<< (M+1) , Floor (Log2 (D) ) ) &1.
d. In one example, the variable M in bullet 4. c may be replaced by W defined in bullet 2.
e. Alternatively, furthermore, clipping may be further applied to the resulting value.
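The index derivation of bullet 4. c can be sketched as follows. The sketch assumes Shift is used with offset0 = 0 (a plain right shift) and that D is a positive integer; the function name and the rounded flag are hypothetical. The index is then the M bits that follow the leading 1 of D, optionally rounded with the next bit and clipped as in bullet 4. c. i.

```python
def table_index(D, M, rounded=False):
    """Derive idx = Shift(D << M, Floor(Log2(D))) & (2^M - 1) for a positive integer D."""
    k = D.bit_length() - 1                  # Floor(Log2(D))
    mask = (1 << M) - 1
    idx0 = ((D << M) >> k) & mask           # M bits following the leading 1 of D
    if not rounded:
        return idx0
    off = ((D << (M + 1)) >> k) & 1         # next lower bit, used as a rounding offset
    return min(mask, idx0 + off)            # clip to [0, 2^M - 1]


if __name__ == "__main__":
    print(table_index(100, 6))              # 36, since 100 = 64 * (1 + 36/64)
    print(table_index(201, 6, rounded=True))
```

For D = 24 and M = 6 the index is 32, which reflects 24 = 16 * (1 + 32/64) .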
5. It is proposed that T [idx] may be modified to an intermediate value V, which will be used in the procedure of multiple operations to replace or approximate the division operation in bullet 2.
a. In one example, V=T [idx] .
b. In one example, V=T [idx] <<m, where m is an integer.
c. In one example, V=T [idx] *m, where m is an integer.
d. In one example, V=T [idx] +m, where m is an integer.
e. In one example, V=Shift (T [idx] , n) or V=SatShift (T [idx] , n) , where n is an integer such as 1 or 2.
1) In one example, the value n may depend on the value of D defined in bullet 4.
a) For example, n = 0 if D <= 2^M, where M is defined in bullet 2.
b) For example, n = Ceil (Log2 (D) ) -M if D > 2^M.
c) For example, n = Floor (Log2 (D) ) -M+1 if D > 2^M.
f. In one example, the modification method may depend on the value of T [idx] and/or the value of idx.
i. In one example, the modification method may depend on whether the value of T [idx] is equal to a fixed value such as 0.
ii. In one example, the modification method may depend on whether the value of idx is smaller than a value TT, e.g. TT=1 or 2 or 2^(M-1) . In another example, the modification method may depend on whether the value of idx is equal to a fixed value such as 0.
iii. In one example, the modification method may depend on whether the value of T [idx] is equal to 0 and the value of idx is smaller than a value TT, e.g. TT=1 or 2 or 2^(M-1) . In another example, the modification method may depend on whether the value of T [idx] is equal to a fixed value such as 0, and the value of idx is equal to a fixed value such as 0.
g. In one example, V= T [idx] |2^Z, where Z is defined in bullet 2.
i. Alternatively, V= 2^(Z+1) if the value of T [idx] is equal to 0 and the value of idx is smaller than a threshold TT, e.g. TT=1 or 2 or 2^(M-1) ; otherwise, V= T [idx] |2^Z.
ii. Alternatively, V= 2^(Z+1) if the value of T [idx] is equal to 0 and the value of idx is equal to 0; otherwise, V= T [idx] |2^Z.
h. In one example, V= T [idx] +2^Z, where Z is defined in bullet 2.
i. Alternatively, V= 2^(Z+1) if the value of T [idx] is equal to 0 and the value of idx is smaller than a threshold TT, e.g. TT=1 or 2 or 2^(M-1) ; otherwise, V= T [idx] +2^Z.
ii. Alternatively, V= 2^(Z+1) if the value of idx is equal to 0; otherwise, V= T [idx] +2^Z.
6. It is proposed that the multiplication of the numerator in the division operation to be replaced or approximated, such as OrgCW in the LMCS approach, denoted as N, and the intermediate value V defined in bullet 5, may be used as the replacement or approximation of the division result. In a formulated way, R = N *V may be used as the replacement or approximation of the division result.
a. Alternatively, a modified R, denoted as R’ may be used as the replacement or approximation of the division result.
i. In one example, R’ = R<<m, where m is an integer such as 11.
1) Alternatively, R’ =Shift (R, m) or R’ =SatShift (R, m) where m is an integer.
2) In one example, m may depend on the value n defined in bullet 5. e.
a) For example, m=offset-n, where offset is an integer such as 11.
ii. In one example, the modification method may depend on P as defined in bullet 2 and/or W (or M) defined in bullet 2, and/or D as defined in bullet 4, and/or T and/or idx.
1) In one example, the modification method may depend on the relationship between a fixed number S and a function f of P, W (or M) , T, idx and D.
a) For example, f (P, W, D, T, idx) =P-W-1+Log2 (D) , or f (P, M, D, T, idx) =P-M-1+Log2 (D) .
b) For example, f (P, W, D, T, idx) =P-W-1+Log2 (D) +off, or f (P, M, D, T, idx) =P-M-1+Log2 (D) +off.
i. In one example, off is equal to 0 if the value of T [idx] is equal to 0 and the value of idx is smaller than a threshold TT, e.g. TT=1 or 2 or 2^(M-1) ; otherwise, off= 1.
ii. In one example, off is equal to 0 if the value of idx is equal to 0; otherwise, off= 1.
c) For example, f (P, W, D, T, idx) =P-W+Log2 (D) , or f (P, M, D, T, idx) =P-M+Log2 (D) .
d) For example, f (P, W, D, T, idx) =P-W+1+Log2 (D) , or f (P, M, D, T, idx) =P-M+1+Log2 (D) .
e) For example, S=11 or 14.
f) For example, if f (P, W, D, T, idx) <=S, R’ =R<< (S-f (P, W, D, T, idx) ) ; otherwise, R’ =Shift (R, f (P, W, D, T, idx) -S) .
g) For example, if f (P, M, D, T, idx) <=S, R’ =R<< (S-f (P, M, D, T, idx) ) ; otherwise, R’ =Shift (R, f (P, M, D, T, idx) -S) .
b. Alternatively, a modified N, denoted as N’ may be used to derive R, as R=N’ *V.
i. In one example, N’ = N<<m, where m is an integer such as 11.
1) Alternatively, N’ =Shift (N, m) or N’ =SatShift (N, m) where m is an integer.
2) In one example, m may depend on the value n defined in bullet 5. e.
a) For example, m=offset-n, where offset is an integer such as 11.
ii. In one example, the modification method may depend on P as defined in bullet 2 and/or W (or M) defined in bullet 2, and/or D as defined in bullet 4, and/or T and/or idx.
1) In one example, the modification method may depend on the relationship between a fixed number S and a function f of P, W (or M) , T, idx and D.
a) Function f may be defined as in bullet 6. a.
b) For example, S=11.
c) For example, if f (P, W, D) <=S, N’ =N<< (S-f (P, W, D) ) ; otherwise, N’ =Shift (N, f (P, W, D) -S) .
d) For example, if f (P, M, D) <=S, N’ =N<< (S-f (P, M, D) ) ; otherwise, N’ =Shift (N, f (P, M, D) -S) .
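Bullets 2 through 6 can be combined into one division-free approximation of N * (1 << S) /D, for example InvScaleCoeff [i] = OrgCW * (1 << 11) /lmcsCW [i] . The sketch below is one possible combination of the listed options, not the only one: it assumes the W=M=6, P=13 table of bullet 3. e, a plain right shift in the index derivation, V = T [idx] +2^Z with V = 2^(Z+1) when idx is 0 (reading the flattened exponent in bullet 5. h. ii as Z+1) , f = P-M+Floor (Log2 (D) ) and S = 11. The function and constant names are hypothetical.

```python
# Table of bullet 3.e (W = M = 6, P = 13).
DIV_TABLE = [0, 62, 60, 58, 56, 55, 53, 51, 50, 48, 47, 45, 44, 42, 41, 40,
             38, 37, 36, 35, 34, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22,
             21, 20, 20, 19, 18, 17, 16, 16, 15, 14, 13, 13, 12, 11, 10, 10,
             9, 8, 8, 7, 7, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1]
M, P, S = 6, 13, 11
Z = P - M - 1


def approx_scaled_division(N, D):
    """Approximate N * (1 << S) / D for positive integers N and D without dividing."""
    k = D.bit_length() - 1                        # Floor(Log2(D))
    idx = ((D << M) >> k) & ((1 << M) - 1)        # bullet 4.c with offset0 = 0
    # Restore the implicit leading bit of the stored reciprocal (assumed bullet 5.h.ii).
    V = (1 << (Z + 1)) if idx == 0 else DIV_TABLE[idx] + (1 << Z)
    R = N * V                                     # bullet 6: R = N * V
    f = P - M + k                                 # assumed choice of f (bullet 6.a.ii.1.c)
    if f <= S:
        return R << (S - f)
    n = f - S
    return (R + ((1 << n) >> 1)) >> n             # Shift(R, f - S) with rounding offset


if __name__ == "__main__":
    print(approx_scaled_division(64, 100))        # exact value is 1310.72
```

For N = 64 and D = 100 the sketch yields 1312, within about 0.1% of the exact quotient; for power-of-two denominators the result is exact.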
7. It is proposed that the result of the operation or the procedure of multiple operations used to replace or approximate the division operation, denoted as R, may be associated with a precision value, denoted as Q. When R is used to calculate a variable B, B may be calculated as B=Shift (g (R) , Q) or B= SatShift (g (R) , Q) , where g (R) is any function on R.
a. In one example in the approach of LMCS, R may be InvScaleCoeff [idxYInv] , and the inverted-back luma reconstructed sample may be derived as below:
invSample = InputPivot [idxYInv] +difference,
where
difference= Shift (g (InvScaleCoeff [idxYInv] ) , Q) , and
g (InvScaleCoeff [idxYInv] ) = InvScaleCoeff [idxYInv] * (lumaSample [xP] [yP] -LmcsPivot [idxYInv] ) ,
Q=11.
b. In one example in the approach of LMCS, R may be ChromaScaleCoeff [idxYInv] , and the inverted-back chroma residue sample may be derived as below:
invResSample [i] [j] =
Sign (resSamples [i] [j] ) * (Shift (g (ChromaScaleCoeff [idxYInv] ) , Q) ) ,
where
g (ChromaScaleCoeff [idxYInv] ) = Abs (resSamples [i] [j] ) *ChromaScaleCoeff [idxYInv] ,
Q=11.
c. In one example, Q may be a fixed number such as 11 or 14.
d. In one example, Q may depend on P as defined in bullet 2 and/or W (or M) defined in bullet 2, and/or D as defined in bullet 4 and/or the table T, and/or idx.
i. In one example, Q = P-W-1+Log2 (D) , or Q = P-M-1+Log2 (D) .
ii. For example, Q=P-W-1+Log2 (D) +off, or Q=P-M-1+Log2 (D) +off.
1) In one example, off is equal to 0 if the value of T [idx] is equal to 0 and the value of idx is smaller than a threshold TT, e.g. TT=1 or 2 or 2^(M-1) ; otherwise, off= 1.
iii. For example, Q=P-W+Log2 (D) , or Q=P-M+Log2 (D) .
iv. For example, Q =P-W+1+Log2 (D) , or Q =P-M+1+Log2 (D) .
e. In one example, Q may depend on n defined in bullet 5. e.
i. For example, Q=offset-n, where offset is an integer such as 11 or 14.
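The two LMCS uses of the precision value Q in bullets 7. a and 7. b can be sketched as follows. The sketch assumes Shift uses the rounding offset (1<<n) >>1, that g multiplies Abs (resSamples [i] [j] ) by the chroma coefficient before the shift, and that the segment index idxYInv has already been found by the caller. Function names, argument names and the sample values in the usage lines are hypothetical.

```python
def shift(x, n):
    """Shift(x, n) with the rounding offset (1 << n) >> 1."""
    return (x + ((1 << n) >> 1)) >> n


def inverse_map_luma(luma_sample, idx_y_inv, input_pivot, lmcs_pivot,
                     inv_scale_coeff, Q=11):
    """invSample = InputPivot[idx] + Shift(InvScaleCoeff[idx] * (luma - LmcsPivot[idx]), Q)."""
    g = inv_scale_coeff[idx_y_inv] * (luma_sample - lmcs_pivot[idx_y_inv])
    return input_pivot[idx_y_inv] + shift(g, Q)


def scale_chroma_residual(res_sample, chroma_scale_coeff, Q=11):
    """Sign(res) * Shift(Abs(res) * ChromaScaleCoeff, Q)."""
    sign = (res_sample > 0) - (res_sample < 0)
    return sign * shift(abs(res_sample) * chroma_scale_coeff, Q)


if __name__ == "__main__":
    # Hypothetical two-segment model: segment 0 has inverse slope 2, segment 1 slope 1.
    input_pivot, lmcs_pivot, inv_scale = [0, 64], [0, 32], [4096, 2048]
    print(inverse_map_luma(16, 0, input_pivot, lmcs_pivot, inv_scale))
    print(scale_chroma_residual(-10, 2048))
```

A coefficient of 1 << 11 corresponds to a scale of 1.0 when Q = 11, which is why ChromaScaleCoeff [i] is set to (1 << 11) when no scaling applies.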
8. It is proposed that, when D or the absolute value of D is always larger than a value G, such as 8, the table size may depend on G.
a. In one example, G may depend on the sample bit-depth.
b. In one example, the table size may depend on the sample bit-depth.
Unification of Division Replacement
9. In one example, the division operation used in MV scaling may be used in the procedure of multiple operations to replace or approximate the division operation in bullet 1.
a. In one example, D may be converted into an intermediate variable D’ in the range [minD, maxD] (such as [-128, 127] ) inclusively, where D is the denominator (e.g., lmcsCW [i] in bullet 1) .
i. In one example, the conversion from D to D’ may be a linear or non-linear quantization process.
ii. In one example, D’ = Clip3 (minD, maxD, D) ;
iii. In one example, D’ = Clip3 (0, maxD, D) ;
iv. In one example, D’ = SatShift (D, n) or D’ = Shift (D, n) , where n is an integer such as 1 or 2.
1) Alternatively, D’ = Clip3 (minD, maxD, Shift (D, n) ) or D’ = Clip3 (-128, 127, SatShift (D, n) ) , where n is an integer such as 1 or 2.
2) Alternatively, D’ = Clip3 (0, maxD, Shift (D, n) ) or D’ = Clip3 (0, maxD, SatShift (D, n) ) , where n is an integer such as 1 or 2.
3) In one example, the value n may depend on the value of D.
a) For example, n = 0 if D <= maxD.
b) For example, n = Ceil (Log2 (D) ) -Log2 (maxD) if D > maxD.
c) For example, n = Floor (Log2 (D) ) - (Log2 (maxD) -1) if D >maxD.
b. In one example, an intermediate variable R derived as R = (16384 + (Abs (D’) >> 1) ) /D’ , may be used in the procedure of multiple operations to replace or approximate the division operation in bullet 1.
i. Alternatively, the table MV_SCALE_T may be used to derive R.
c. In one example, R in bullet 9. b may be associated with a precision value, denoted as Q. When R is used to calculate a variable B, B may be calculated as B=Shift (g (R) , Q) or B= SatShift (g (R) , Q) , where g (R) is any function on R.
i. In one example, Q may depend on n in bullet 9. a.
1) Alternatively, Q may be equal to Offset-n, where Offset is an integer such as 14.
d. The disclosed MV scaling methods may be used in MV scaling for temporal motion vector prediction (TMVP) of merge inter-mode or Advanced Motion Vector Prediction (AMVP) mode.
e. The disclosed MV scaling methods may be used in MV scaling for the affine prediction mode.
f. The disclosed MV scaling methods may be used in MV scaling for the sub-block based temporal merge mode.
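The first two steps of bullet 9, quantizing the denominator into the MV-scaling range and forming the reciprocal R, can be sketched as below. The sketch makes several assumptions: D is a positive integer, maxD = 127, Log2 (maxD) is taken as Floor (Log2 (maxD) ) , n follows the Ceil variant of bullet 9. a. iv, Shift uses the rounding offset, and the lower clipping bound is 0. The function names are hypothetical, and the integer division stands in for a lookup in a table such as MV_SCALE_T.

```python
MAX_D = 127  # assumed upper bound of the MV-scaling denominator range


def quantize_denominator(D, max_d=MAX_D):
    """Return (D', n) with D' = Clip3(0, maxD, Shift(D, n)) for a positive integer D."""
    if D <= max_d:
        n = 0
    else:
        ceil_log2_d = (D - 1).bit_length()              # Ceil(Log2(D))
        n = ceil_log2_d - (max_d.bit_length() - 1)      # minus Floor(Log2(maxD))
    d_prime = (D + ((1 << n) >> 1)) >> n                # Shift(D, n) with rounding
    return max(0, min(max_d, d_prime)), n


def mv_scale_reciprocal(d_prime):
    """R = (16384 + (Abs(D') >> 1)) / D' for a positive D'."""
    return (16384 + (abs(d_prime) >> 1)) // d_prime


if __name__ == "__main__":
    d_prime, n = quantize_denominator(200)
    print(d_prime, n, mv_scale_reciprocal(d_prime))
```

Because D’ stays within the range already handled by MV scaling, the same reciprocal table can serve both processes, which is the unification the bullet describes.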
10. In one example, the same table may be used in the procedure of multiple operations to replace or approximate the division operation in CCLM and LMCS.
a. For example, the table may be any one demonstrated in bullet 3, such as bullet 3. e.
11. In one example, the same table may be used in the procedure to replace or approximate the division operation in MV scaling and LMCS.
a. For example, the table may be any one demonstrated in bullet 3, such as bullet 3. e.
12. In one example, the method to replace or approximate the division operation in CCLM may be used in the procedure to replace or approximate the division operation in Localized Illumination Compensation (LIC) .
a. Alternatively, the method to replace or approximate the division operation in LMCS may be used in the procedure to replace or approximate the division operation in LIC.
b. Alternatively, the method to replace or approximate the division operation in MV scaling may be used in the procedure to replace or approximate the division operation in LIC.
CCLM, LIC and LMCS
13. Chroma samples are predicted from reconstructed luma samples in the original domain even when the luma block is coded with LMCS.
a. In one example, for the CCLM mode, neighbouring (adjacent or non-adjacent) reconstructed luma samples which are in the reshaping domain may be converted to the original domain first, then the converted luma samples are used to derive the linear model parameters, such as a and b.
b. In one example, for the CCLM mode, the collocated luma samples which are in the reshaping domain may be converted to the original domain first, then the converted luma samples are used with the linear model to generate the prediction chroma samples.
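The ordering in bullet 13 (inverse-map first, then derive and apply the linear model) can be illustrated with the simplified sketch below. It is not the CCLM derivation of any standard: the two-point min/max fit, the floating-point arithmetic, the absence of luma downsampling, and the caller-supplied inverse_map function are all assumptions made for illustration, and the names are hypothetical.

```python
def derive_linear_model(luma, chroma):
    """Two-point (min/max luma) fit of chroma ~= a * luma + b, in floating point."""
    i_min = min(range(len(luma)), key=luma.__getitem__)
    i_max = max(range(len(luma)), key=luma.__getitem__)
    if luma[i_max] == luma[i_min]:
        return 0.0, float(chroma[i_min])            # flat neighbourhood: constant model
    a = (chroma[i_max] - chroma[i_min]) / (luma[i_max] - luma[i_min])
    b = chroma[i_min] - a * luma[i_min]
    return a, b


def cclm_predict_original_domain(neigh_luma_reshaped, neigh_chroma,
                                 colloc_luma_reshaped, inverse_map):
    """Predict chroma from luma after converting all luma samples to the original domain."""
    # Step 1 (bullet 13.a): neighbouring luma is inverse-mapped before fitting a and b.
    neigh_luma = [inverse_map(y) for y in neigh_luma_reshaped]
    a, b = derive_linear_model(neigh_luma, neigh_chroma)
    # Step 2 (bullet 13.b): collocated luma is inverse-mapped before applying the model.
    return [round(a * inverse_map(y) + b) for y in colloc_luma_reshaped]


if __name__ == "__main__":
    inverse_map = lambda y: 2 * y                   # hypothetical inverse reshaping
    print(cclm_predict_original_domain([10, 20, 30], [13, 23, 33], [15, 25], inverse_map))
```

In a codec the inverse_map callable would be the piecewise-linear inverse LMCS mapping; the doubling function here only keeps the example self-contained.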
Signaling of LMCS models
14. Indications of the scaling coefficient (e.g., ScaleCoeff [i] in VVC) and/or inverse scaling coefficient (e.g., InvScaleCoeff [i] in VVC) used in LMCS may be signaled.
a. In one example, the scaling coefficient may be directly coded. Alternatively, the quantized value of the scaling coefficient may be signaled.
b. It may be signaled in a predictive way.
c. It may be signaled with fixed-length coding, or unary coding, or truncated unary coding, or exponential Golomb code.
d. It may be signaled in a video unit such as VPS/SPS/PPS/APS/picture header/slice header/tile group header, etc.
15. Indications of the chroma scaling factor in LMCS such as ChromaScaleCoeff [i] may be signaled.
a. In one example, the chroma scaling factor may be directly coded. Alternatively, the quantized value of the scaling factor may be signaled.
b. It may be signaled in a predictive way.
c. It may be signaled with fixed-length coding, or unary coding, or truncated unary coding, or exponential Golomb code.
d. It may be signaled in a video unit such as VPS/SPS/PPS/APS/picture header/slice header/tile group header, etc.
i. Alternatively, furthermore, all video blocks in the video unit may use the same signaled scaling factor, if chroma scaling is applied to the video block.
16. It is proposed that information may be signaled to indicate whether chroma scaling in LMCS is applied or not.
a. It may be signaled in a video unit such as VPS/SPS/PPS/APS/picture header/slice header/tile group header, etc.
i. In one example, it may be signaled in the slice or tile group header when luma and chroma samples are coded with dual-tree coding structure and the current slice or tile group is an intra-coded slice or tile group.
The examples described above may be incorporated in the context of the method described below, e.g., method 400, which may be implemented at a video decoder or a video encoder.
FIG. 4 shows a flowchart of an exemplary method for video processing. The method 400 includes, at step 402, performing a conversion between a current video block of a video and a coded representation of the video, the current video block comprising a luma component and at least one chroma component, wherein the luma component is converted from an original domain to a reshaped domain with a luma mapping with chroma scaling (LMCS) scheme, and chroma samples of the at least one chroma component are predicted based on reconstructed luma samples of the luma component in the original domain.
In one aspect, there is disclosed a method for video processing. The method includes, performing a conversion between a current video block of a video and a coded representation of the video, the current video block comprising a luma component and at least one chroma component, wherein the luma component is converted from an original domain to a reshaped domain with a luma mapping with chroma scaling (LMCS) scheme, and chroma samples of the at least one chroma component are predicted based on reconstructed luma samples of the luma component in the original domain.
In an example, the chroma samples of the at least one chroma component are predicted in a cross-component linear model (CCLM) prediction mode.
In an example, neighboring reconstructed luma samples are converted from the reshaped domain to the original domain, and then the converted neighboring luma samples are used to derive linear model parameters in a linear model to be used in the CCLM prediction mode.
In an example, the neighboring reconstructed luma samples comprise adjacent or non-adjacent reconstructed luma samples in the reshaped domain.
In an example, collocated reconstructed luma samples are converted from the reshaped domain to the original domain, and then the converted collocated luma samples are used in the linear model to predict the chroma samples.
In an example, a first indication is signaled for indicating at least one of a scaling coefficient and an inverse scaling coefficient which is to be used in the LMCS scheme.
In an example, at least one of the scaling coefficient and the inverse scaling coefficient is coded.
In an example, a value of the at least one of the scaling coefficient and the inverse scaling coefficient is quantized, and the quantized value of at least one of the scaling coefficient and the inverse scaling coefficient is signaled.
In an example, the first indication is signaled in a predictive way.
In an example, the first indication is coded using a fixed-length code, a unary code, a truncated unary code or an exponential-Golomb code.
In an example, the first indication is signaled in a video parameter set (VPS) , a sequence parameter set (SPS) , a picture parameter set (PPS) , an adaptation parameter set (APS) , a picture header, a slice header or a tile group header.
In an example, the scaling coefficient comprises ScaleCoeff [i] to be used in the LMCS scheme.
In an example, the inverse scaling coefficient comprises InvScaleCoeff [i] to be used in the LMCS scheme.
In an example, the scaling coefficient comprises a chroma scaling factor to be used in the LMCS scheme.
In an example, the chroma scaling factor comprises ChromaScaleCoeff [i] , and wherein i is an integer.
In an example, a second indication is signaled for indicating whether a chroma scaling is applied to the current video block in the LMCS scheme.
In an example, if the chroma scaling is applied to the current video block, all the video blocks in a video unit which covers the current video block use the same chroma scaling factor signaled for the current video block.
In an example, the second indication is signaled in one of a video parameter set (VPS) , a sequence parameter set (SPS) , a picture parameter set (PPS) , an adaptation parameter set (APS) , a picture header, a slice header or a tile group header.
In an example, if the luma component and the at least one chroma component of the current video block are coded with a dual-tree coding structure and a current slice or tile group that covers the current video block is intra-coded, the second indication is signaled in the slice header or the tile group header.
In an example, the performing the conversion includes generating the coded representation from the current video block.
In an example, the performing the conversion includes generating the current video block from the coded representation.
In another aspect, there is disclosed an apparatus in a video system comprising a processor and a non-transitory memory with instructions thereon, wherein the instructions upon execution by the processor, cause the processor to implement the method as described above.
In still another aspect, there is disclosed a computer program product stored on a non-transitory computer readable media, the computer program product including program code for carrying out the method as described above.
5 Example implementations of the disclosed technology
5.1 Embodiment #1
The variable LmcsPivot [i] with i = 0.. 16, the variables ScaleCoeff [i] and InvScaleCoeff [i] with i = 0.. 15, are derived as follows:
The variable ChromaScaleCoeff [i] , with i = 0…15, is derived as follows:
if (lmcsCW [i] = = 0)
ChromaScaleCoeff [i] = (1 << 11)
else {
ChromaScaleCoeff [i] = InvScaleCoeff [i]
}
divTable [] is specified as follows:
divTable [] = {0, 62, 60, 58, 56, 55, 53, 51, 50, 48, 47, 45, 44, 42, 41, 40, 38, 37, 36, 35, 34, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 20, 19, 18, 17, 16, 16, 15, 14, 13, 13, 12, 11, 10, 10, 9, 8, 8, 7, 7, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1 }
5.2 Embodiment #2
The variable LmcsPivot [i] with i = 0.. 16, the variables ScaleCoeff [i] and InvScaleCoeff [i] with i = 0.. 15, are derived as follows:
The variable ChromaScaleCoeff [i] , with i = 0…15, is derived as follows:
if (lmcsCW [i] = = 0)
ChromaScaleCoeff [i] = (1 << 11)
else {
ChromaScaleCoeff [i] = InvScaleCoeff [i]
}
divTable [] is specified as follows:
divTable [] = {0, 62, 60, 58, 56, 55, 53, 51, 50, 48, 47, 45, 44, 42, 41, 40, 38, 37, 36, 35, 34, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 20, 19, 18, 17, 16, 16, 15, 14, 13, 13, 12, 11, 10, 10, 9, 8, 8, 7, 7, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1 }
5.3 Embodiment #3
The variable LmcsPivot [i] with i = 0.. 16, the variables ScaleCoeff [i] and InvScaleCoeff [i] with i = 0.. 15, are derived as follows:
The variable ChromaScaleCoeff [i] , with i = 0…15, is derived as follows:
if (lmcsCW [i] = = 0)
ChromaScaleCoeff [i] = (1 << 11)
else {
ChromaScaleCoeff [i] = InvScaleCoeff [i]
}
divTable [] is specified as follows:
divTable [] = {0, 124, 120, 117, 113, 109, 106, 103, 100, 96, 93, 90, 88, 85, 82, 79, 77, 74, 72, 69, 67, 65, 63, 60, 58, 56, 54, 52, 50, 48, 46, 44, 43, 41, 39, 37, 36, 34, 33, 31, 30, 28, 27, 25, 24, 22, 21, 20, 18, 17, 16, 14, 13, 12, 11, 10, 9, 7, 6, 5, 4, 3, 2, 1}
5.4 Embodiment #4
The variable LmcsPivot [i] with i = 0.. 16, the variables ScaleCoeff [i] and InvScaleCoeff [i] with i = 0.. 15, are derived as follows:
The variable ChromaScaleCoeff [i] , with i = 0…15, is derived as follows:
if (lmcsCW [i] = = 0)
ChromaScaleCoeff [i] = (1 << 11)
else {
ChromaScaleCoeff [i] = InvScaleCoeff [i]
}
divTable [] is specified as follows:
divTable [] = {0, 124, 120, 117, 113, 109, 106, 103, 100, 96, 93, 90, 88, 85, 82, 79, 77, 74, 72, 69, 67, 65, 63, 60, 58, 56, 54, 52, 50, 48, 46, 44, 43, 41, 39, 37, 36, 34, 33, 31, 30, 28, 27, 25, 24, 22, 21, 20, 18, 17, 16, 14, 13, 12, 11, 10, 9, 7, 6, 5, 4, 3, 2, 1}
5.5 Embodiment #5
Based on JVET-1001-v8, the text may be changed as:
7.4.6.4 Luma mapping with chroma scaling data semantics
…
The variable LmcsPivot [i] with i = 0.. 16, the variables ScaleCoeff [i] and InvScaleCoeff [i] with i = 0.. 15, are derived as follows:
…
divTable [] is specified as follows:
divTable [] = {0, 248, 240, 233, 226, 219, 212, 206, 199, 193, 187, 181, 175, 170, 164, 159, 154, 149, 144, 139, 134, 130, 125, 121, 116, 112, 108, 104, 100, 96, 93, 89, 85, 82, 78, 75, 72, 68, 65, 62, 59, 56, 53, 50, 47, 45, 42, 39, 37, 34, 31, 29, 26, 24, 22, 19, 17, 15, 13, 10, 8, 6, 4, 2} .
FIG. 5 is a block diagram of a video processing apparatus 500. The apparatus 500 may be used to implement one or more of the methods described herein. The apparatus 500 may be embodied in a smartphone, tablet, computer, Internet of Things (IoT) receiver, and so on. The apparatus 500 may include one or more processors 502, one or more memories 504 and video processing hardware 506. The processor (s) 502 may be configured to implement one or more methods (including, but not limited to, method 400) described in the present document. The memory (memories) 504 may be used for storing data and code used for implementing the methods and techniques described herein. The video processing hardware 506 may be used to implement, in hardware circuitry, some techniques described in the present document.
In some embodiments, the video coding methods may be implemented using an apparatus that is implemented on a hardware platform as described with respect to FIG. 5.
From the foregoing, it will be appreciated that specific embodiments of the presently disclosed technology have been described herein for purposes of illustration, but that various modifications may be made without deviating from the scope of the invention. Accordingly, the presently disclosed technology is not limited except as by the appended claims.
Implementations of the subject matter and the functional operations described in this patent document can be implemented in various systems, digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Implementations of the subject matter described in this specification can be implemented as one or more computer program products, i.e., one or more modules of computer program instructions encoded on a tangible and non-transitory computer readable medium for execution by, or to control the operation of, data processing apparatus. The computer readable medium can be a machine-readable storage device, a machine-readable storage substrate, a memory device, a composition of matter effecting a machine-readable propagated signal, or a combination of one or more of them. The term “data processing unit” or “data processing apparatus” encompasses all apparatus, devices, and machines for processing data, including by way of example a programmable processor, a computer, or multiple processors or computers. The apparatus can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, or a combination of one or more of them.
A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment. A computer program does not necessarily correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document) , in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub programs, or portions of code) . A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform functions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application specific integrated circuit).
Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random-access memory or both. The essential elements of a computer are a processor for performing instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include, or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic disks, magneto-optical disks, or optical disks. However, a computer need not have such devices. Computer readable media suitable for storing computer program instructions and data include all forms of nonvolatile memory, media and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
It is intended that the specification, together with the drawings, be considered exemplary only, where exemplary means an example. As used herein, the use of “or” is intended to include “and/or”, unless the context clearly indicates otherwise.
While this patent document contains many specifics, these should not be construed as limitations on the scope of any invention or of what may be claimed, but rather as descriptions of features that may be specific to particular embodiments of particular inventions. Certain features that are described in this patent document in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.
Similarly, while operations are depicted in the drawings in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. Moreover, the separation of various system components in the embodiments described in this patent document should not be understood as requiring such separation in all embodiments.
Only a few implementations and examples are described, and other implementations, enhancements and variations can be made based on what is described and illustrated in this patent document.