This application is a divisional application of the invention patent application filed on July 14, 2003, with application number 03817031.0, entitled "Adaptive Weighting of Reference Pictures in Video Decoding".
This application claims priority to U.S. Provisional Patent Application Serial No. 60/395,843 (Attorney Docket No. PU020340), filed on July 15, 2002 and entitled "Adaptive Weighting of Reference Pictures in Video Codec", which is incorporated herein by reference in its entirety. In addition, this application claims priority to U.S. Provisional Patent Application Serial No. 60/395,874 (Attorney Docket No. PU020339), also filed on July 15, 2002 and entitled "Motion Estimation with Weighting Prediction", which is likewise incorporated herein by reference in its entirety.
Embodiment
The present invention provides an apparatus and method for motion vector estimation and adaptive reference picture weighting factor assignment. In some video sequences, particularly those with fading, the current picture or image block to be coded is more strongly correlated with a reference picture scaled by a weighting factor than with the reference picture itself. Video codecs that do not apply weighting factors to reference pictures encode fading sequences very inefficiently. When weighting factors are used in encoding, the video encoder must determine both the weighting factors and the motion vectors, but the best choice for each depends on the other, and motion estimation is typically the most computationally intensive part of a digital video compression encoder.
In the proposed Joint Video Team ("JVT") video compression standard, each P picture can use multiple reference pictures to form the picture's prediction, but each individual motion block or 8x8 region of a macroblock uses only a single reference picture for prediction. In addition to coding and transmitting the motion vectors, a reference picture index is transmitted for each motion block or 8x8 region, indicating which reference picture is used. A limited set of possible reference pictures is stored at both the encoder and the decoder, and the number of allowable reference pictures is transmitted.
In the JVT standard, for bi-predictive pictures (also called "B" pictures), two predictors are formed for each motion block or 8x8 region, each of which can be formed from a separate reference picture, and the two predictors are averaged together to form a single averaged predictor. For bi-predictively coded motion blocks, the reference pictures can both be from the forward direction, both be from the backward direction, or one each from the forward and backward directions. Two lists are maintained of the available reference pictures that may be used for prediction. The two reference pictures are referred to as the list 0 and list 1 predictors. An index for each reference picture is coded and transmitted, ref_idx_l0 and ref_idx_l1, for the list 0 and list 1 reference pictures, respectively. JVT bi-predictive or "B" pictures allow adaptive weighting between the two predictions, i.e.,
Pred = [(P0)(Pred0)] + [(P1)(Pred1)] + D,
where P0 and P1 are weighting factors, Pred0 and Pred1 are the reference picture predictions for list 0 and list 1 respectively, and D is an offset.
Two methods have been proposed for indicating the weighting factors. In the first, the weighting factors are determined by the directions used for the reference pictures. In this method, if the ref_idx_l0 index is less than or equal to ref_idx_l1, weighting factors of (1/2, 1/2) are used; otherwise factors of (2, -1) are used.
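The first proposed method can be illustrated with a minimal sketch. It assumes predictions are flat lists of sample values; the function names `implicit_weights` and `bipred` and the sample data are hypothetical and chosen only for illustration.

```python
def implicit_weights(idx_l0, idx_l1):
    """Pick (P0, P1) from the order of the list 0 and list 1 indices."""
    if idx_l0 <= idx_l1:
        return (0.5, 0.5)   # plain average of the two predictions
    return (2.0, -1.0)      # extrapolation from the two predictions


def bipred(pred0, pred1, p0, p1, d=0.0):
    """Pred = P0*Pred0 + P1*Pred1 + D, applied sample by sample."""
    return [p0 * a + p1 * b + d for a, b in zip(pred0, pred1)]


# Hypothetical sample values for the two predictions.
pred0 = [100, 110, 120]
pred1 = [80, 90, 100]

averaged = bipred(pred0, pred1, *implicit_weights(0, 1))
extrapolated = bipred(pred0, pred1, *implicit_weights(1, 0))
```

With these inputs the first call averages the two predictions and the second extrapolates past the list 0 prediction, which is the behaviour the (2, -1) pair is meant to capture.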
In the second proposed method, any number of weighting factors is transmitted for each slice. A weighting factor index is then transmitted for each motion block or 8x8 region of a macroblock that uses bi-directional prediction. The decoder uses the received weighting factor index to choose the appropriate weighting factor, from the transmitted set, to use when decoding the motion block or 8x8 region. For example, if three weighting factors were sent at the slice layer, they would correspond to weighting factor indices 0, 1 and 2, respectively.
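The second proposed method amounts to a per-block table lookup. The sketch below assumes the slice-level set is a list of (P0, P1) pairs; the values and the name `select_weights` are hypothetical.

```python
def select_weights(slice_weights, weight_index):
    """Return the (P0, P1) pair that a block's weighting factor index selects."""
    if not 0 <= weight_index < len(slice_weights):
        raise ValueError("weighting factor index outside the transmitted set")
    return slice_weights[weight_index]


# Three pairs sent at the slice layer correspond to indices 0, 1 and 2.
slice_weights = [(0.5, 0.5), (2.0, -1.0), (0.75, 0.25)]  # hypothetical values

# One explicitly transmitted index per bi-predicted motion block or 8x8 region.
block_weight_indices = [0, 2, 1]
chosen = [select_weights(slice_weights, i) for i in block_weight_indices]
```

The cost of this scheme is the extra index carried by every bi-predicted block, in addition to its reference picture indices.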
The following description merely illustrates the principles of the invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the invention and are included within its spirit and scope. Furthermore, all examples and conditional language recited herein are principally intended for teaching purposes only, to aid the reader in understanding the principles of the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the invention, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, such equivalents are intended to include both currently known equivalents and equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.
Thus, for example, it will be appreciated by those skilled in the art that the block diagrams herein represent conceptual views of illustrative circuitry embodying the principles of the invention. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes which may be substantially represented in computer readable media and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
The functions of the various elements shown in the figures may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term "processor" or "controller" should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor ("DSP") hardware, read-only memory ("ROM") for storing software, random access memory ("RAM"), and non-volatile storage. Similarly, any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as the context requires.
In the claims, any element expressed as a means for performing a specified function is intended to encompass any way of performing that function, including, for example, a) a combination of circuit elements that performs that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with appropriate circuitry for executing that software to perform the function. The invention as defined by such claims resides in the fact that the functionalities provided by the various recited means are combined and brought together in the manner which the claims call for. Applicant thus regards any means that can provide those functionalities as equivalent to those shown herein.
As shown in Fig. 1, a standard video decoder is indicated generally by reference numeral 100. The video decoder 100 includes a variable length decoder ("VLD") 110 connected in signal communication with an inverse quantizer 120. The inverse quantizer 120 is connected in signal communication with an inverse transformer 130. The inverse transformer 130 is connected in signal communication with a first input of an adder or summing junction 140, where the output of the summing junction 140 provides the output of the video decoder 100. The output of the summing junction 140 is connected in signal communication with a reference picture store 150. The reference picture store 150 is connected in signal communication with a motion compensator 160, which is connected in signal communication with a second input of the summing junction 140.
Turning to Fig. 2, a video decoder with adaptive bi-prediction is indicated generally by reference numeral 200. The video decoder 200 includes a VLD 210 connected in signal communication with an inverse quantizer 220. The inverse quantizer 220 is connected in signal communication with an inverse transformer 230. The inverse transformer 230 is connected in signal communication with a first input of a summing junction 240, where the output of the summing junction 240 provides the output of the video decoder 200. The output of the summing junction 240 is connected in signal communication with a reference picture store 250. The reference picture store 250 is connected in signal communication with a motion compensator 260, which is connected in signal communication with a first input of a multiplier 270.
The VLD 210 is further connected in signal communication with a reference picture weighting factor lookup table 280, to provide an adaptive bi-prediction ("ABP") coefficient index to the lookup table 280. A first output of the lookup table 280 provides a weighting factor, and is connected in signal communication with a second input of the multiplier 270. The output of the multiplier 270 is connected in signal communication with a first input of a summing junction 290. A second output of the lookup table 280 provides an offset, and is connected in signal communication with a second input of the summing junction 290. The output of the summing junction 290 is connected in signal communication with a second input of the summing junction 240.
Turning now to Fig. 3, a video decoder with reference picture weighting is indicated generally by reference numeral 300. The video decoder 300 includes a VLD 310 connected in signal communication with an inverse quantizer 320. The inverse quantizer 320 is connected in signal communication with an inverse transformer 330. The inverse transformer 330 is connected in signal communication with a first input of a summing junction 340, where the output of the summing junction 340 provides the output of the video decoder 300. The output of the summing junction 340 is connected in signal communication with a reference picture store 350. The reference picture store 350 is connected in signal communication with a motion compensator 360, which is connected in signal communication with a first input of a multiplier 370.
In addition, the VLD 310 is further connected in signal communication with a reference picture weighting factor lookup table 380, to provide a reference picture index to the lookup table 380. A first output of the lookup table 380 provides a weighting factor, and is connected in signal communication with a second input of the multiplier 370. The output of the multiplier 370 is connected in signal communication with a first input of a summing junction 390. A second output of the lookup table 380 provides an offset, and is connected in signal communication with a second input of the summing junction 390. The output of the summing junction 390 is connected in signal communication with a second input of the summing junction 340.
As shown in Fig. 4, a standard video encoder is indicated generally by reference numeral 400. An input to the encoder 400 is connected in signal communication with a non-inverting input of a summing junction 410. The output of the summing junction 410 is connected in signal communication with a block transformer 420. The transformer 420 is connected in signal communication with a quantizer 430. The output of the quantizer 430 is connected in signal communication with a variable length coder ("VLC") 440, where the output of the VLC 440 is an externally available output of the encoder 400.
The output of the quantizer 430 is further connected in signal communication with an inverse quantizer 450. The inverse quantizer 450 is connected in signal communication with an inverse block transformer 460, which, in turn, is connected in signal communication with a reference picture store 470. A first output of the reference picture store 470 is connected in signal communication with a first input of a motion estimator 480. The input to the encoder 400 is further connected in signal communication with a second input of the motion estimator 480. The output of the motion estimator 480 is connected in signal communication with a first input of a motion compensator 490. A second output of the reference picture store 470 is connected in signal communication with a second input of the motion compensator 490. The output of the motion compensator 490 is connected in signal communication with an inverting input of the summing junction 410.
Turning to Fig. 5, a video encoder with reference picture weighting is indicated generally by reference numeral 500. An input to the encoder 500 is connected in signal communication with a non-inverting input of a summing junction 510. The output of the summing junction 510 is connected in signal communication with a block transformer 520. The transformer 520 is connected in signal communication with a quantizer 530. The output of the quantizer 530 is connected in signal communication with a VLC 540, where the output of the VLC 540 is an externally available output of the encoder 500.
The output of the quantizer 530 is further connected in signal communication with an inverse quantizer 550. The inverse quantizer 550 is connected in signal communication with an inverse block transformer 560, which, in turn, is connected in signal communication with a reference picture store 570. A first output of the reference picture store 570 is connected in signal communication with a first input of a reference picture weighting factor assignor 572. The input to the encoder 500 is further connected in signal communication with a second input of the reference picture weighting factor assignor 572. The output of the reference picture weighting factor assignor 572, which is indicative of a weighting factor, is connected in signal communication with a first input of a motion estimator 580. A second output of the reference picture store 570 is connected in signal communication with a second input of the motion estimator 580.
The input to the encoder 500 is further connected in signal communication with a third input of the motion estimator 580. The output of the motion estimator 580, which is indicative of motion vectors, is connected in signal communication with a first input of a motion compensator 590. A third output of the reference picture store 570 is connected in signal communication with a second input of the motion compensator 590. The output of the motion compensator 590, which is indicative of a motion compensated reference picture, is connected in signal communication with a first input of a multiplier 592. The output of the reference picture weighting factor assignor 572, which is indicative of a weighting factor, is connected in signal communication with a second input of the multiplier 592. The output of the multiplier 592 is connected in signal communication with an inverting input of the summing junction 510.
Turning now to Fig. 6, an exemplary process for decoding video signal data for an image block is indicated generally by reference numeral 600. The process includes a start block 610 that passes control to an input block 612. The input block 612 receives the image block compressed data, and passes control to an input block 614. The input block 614 receives at least one reference picture index with the data for the image block, each reference picture index corresponding to a particular reference picture. The input block 614 passes control to a function block 616, which determines a weighting factor corresponding to each of the received reference picture indices, and passes control to an optional function block 617. The optional function block 617 determines an offset corresponding to each of the received reference picture indices, and passes control to a function block 618. The function block 618 retrieves a reference picture corresponding to each of the received reference picture indices, and passes control to a function block 620. The function block 620, in turn, motion compensates the retrieved reference picture, and passes control to a function block 622. The function block 622 multiplies the motion compensated reference picture by the corresponding weighting factor, and passes control to an optional function block 623. The optional function block 623 adds the corresponding offset to the motion compensated reference picture, and passes control to a function block 624. The function block 624, in turn, forms a weighted motion compensated reference picture, and passes control to an end block 626.
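The decoding steps of Fig. 6 can be sketched roughly as follows. The sketch makes several assumptions not stated in the text: pictures are lists of rows, motion vectors are whole-sample (dy, dx) displacements, and the weight and offset tables are plain dictionaries keyed by reference picture index. All names and sample values are hypothetical.

```python
def motion_compensate(ref, top, left, mv, size):
    """Fetch a size x size block from ref, displaced by mv = (dy, dx)."""
    dy, dx = mv
    rows = ref[top + dy: top + dy + size]
    return [row[left + dx: left + dx + size] for row in rows]


def weighted_prediction(ref_pictures, weights, offsets, ref_idx, top, left, mv, size):
    weight = weights[ref_idx]            # block 616: weight for this index
    offset = offsets.get(ref_idx, 0)     # block 617 (optional): offset
    ref = ref_pictures[ref_idx]          # block 618: retrieve reference picture
    mc = motion_compensate(ref, top, left, mv, size)   # block 620
    # Blocks 622-624: multiply by the weight, add the offset.
    return [[weight * s + offset for s in row] for row in mc]


# Hypothetical 4x4 reference picture stored under index 0.
ref_pictures = {0: [[10, 20, 30, 40],
                    [50, 60, 70, 80],
                    [90, 100, 110, 120],
                    [130, 140, 150, 160]]}
pred = weighted_prediction(ref_pictures, {0: 0.5}, {0: 3},
                           ref_idx=0, top=0, left=0, mv=(1, 1), size=2)
```

Note that the only per-block input that selects the weight is `ref_idx`, which the decoder receives anyway in order to locate the reference picture.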
Turning now to Fig. 7, an exemplary process for encoding video signal data for an image block is indicated generally by reference numeral 700. The process includes a start block 710 that passes control to an input block 712. The input block 712 receives substantially uncompressed image block data, and passes control to a function block 714. The function block 714 assigns a weighting factor for the image block corresponding to a particular reference picture having a corresponding index, and passes control to an optional function block 715. The optional function block 715 assigns an offset for the image block corresponding to the particular reference picture having the corresponding index, and passes control to a function block 716. The function block 716 computes a motion vector corresponding to the difference between the image block and the particular reference picture, and passes control to a function block 718. The function block 718 motion compensates the particular reference picture in correspondence with the motion vector, and passes control to a function block 720. The function block 720, in turn, multiplies the motion compensated reference picture by the assigned weighting factor to form a weighted motion compensated reference picture, and passes control to an optional function block 721. The optional function block 721, in turn, adds the assigned offset to the motion compensated reference picture to form the weighted motion compensated reference picture, and passes control to a function block 722. The function block 722 subtracts the weighted motion compensated reference picture from the substantially uncompressed image block, and passes control to a function block 724. The function block 724, in turn, encodes a signal with the difference between the substantially uncompressed image block and the weighted motion compensated reference picture, along with the corresponding index of the particular reference picture, and passes control to an end block 726.
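A rough sketch of the encoding steps of Fig. 7 follows. The text does not say how the weighting factor is assigned or how the motion search is carried out, so two assumptions are made here purely for illustration: the weight is the ratio of the mean sample value of the block to that of the co-located reference block, and the motion vector comes from a small full search that minimises the sum of absolute differences against the weighted reference. All function names and data are hypothetical.

```python
def fetch(picture, top, left, size):
    return [row[left:left + size] for row in picture[top:top + size]]


def mean(block):
    samples = [s for row in block for s in row]
    return sum(samples) / len(samples)


def sad(a, b):
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def encode_block(block, ref, top, left, search=1):
    size = len(block)
    # Block 714: assign a weight (assumed rule: ratio of mean sample values).
    ref_mean = mean(fetch(ref, top, left, size))
    weight = mean(block) / ref_mean if ref_mean else 1.0
    offset = 0  # block 715 (optional): no offset in this sketch
    best = None
    for dy in range(-search, search + 1):          # block 716: motion search
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > len(ref) or l + size > len(ref[0]):
                continue
            cand = fetch(ref, t, l, size)          # block 718
            pred = [[weight * s + offset for s in row] for row in cand]  # 720, 721
            cost = sad(block, pred)
            if best is None or cost < best[0]:
                best = (cost, (dy, dx), pred)
    _, mv, pred = best
    # Block 722: residual = block minus weighted, motion compensated reference.
    residual = [[x - p for x, p in zip(rb, rp)] for rb, rp in zip(block, pred)]
    return weight, offset, mv, residual  # block 724 would entropy-code these


ref = [[10, 20, 30, 40],
       [50, 60, 70, 80],
       [90, 100, 110, 120],
       [130, 140, 150, 160]]
block = [[30, 35], [50, 55]]   # half-brightness copy of ref at (1, 1): a fade
w, d, mv, res = encode_block(block, ref, top=1, left=1)
```

For this faded block the weighted reference matches exactly, so the residual is zero; without the weight the residual would carry the full brightness difference.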
In the present exemplary embodiment, for each coded picture or slice, a weighting factor is associated with each allowable reference picture relative to which blocks of the current picture can be coded. When each individual block in the current picture is encoded or decoded, the weighting factor(s) and offset(s) corresponding to its reference picture indices are applied to the reference prediction to form a weighted predictor. All blocks in the slice that are coded with respect to the same reference picture apply the same weighting factor to the reference picture prediction.
Whether or not to use adaptive weighting when coding a picture can be indicated in the picture parameter set or sequence parameter set, or in the slice or picture header. For each slice or picture that uses adaptive weighting, a weighting factor may be transmitted for each of the allowable reference pictures that may be used for encoding that slice or picture. The number of allowable reference pictures is transmitted in the slice header. For example, if three reference pictures can be used to encode the current slice, up to three weighting factors are transmitted, and each is associated with the reference picture having the same index.
If no weighting factors are transmitted, default weights are used. In one embodiment of the invention, default weights of (1/2, 1/2) are used when no weighting factors are transmitted. The weighting factors may be transmitted using either fixed or variable length codes.
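The association between transmitted weights and reference picture indices, with a fallback to a default, might be sketched as below. The representation is an assumption: a missing weight is modelled as `None` or as a list shorter than the number of allowable reference pictures, and the default is supplied by the caller. The function name `build_weight_table` is hypothetical.

```python
def build_weight_table(num_ref_pictures, transmitted, default_weight):
    """Map each allowable reference picture index to its weighting factor."""
    table = {}
    for ref_idx in range(num_ref_pictures):
        sent = transmitted[ref_idx] if ref_idx < len(transmitted) else None
        # The weight at position ref_idx belongs to the picture with that index.
        table[ref_idx] = default_weight if sent is None else sent
    return table


# Three allowable reference pictures; no weight was sent for index 1.
# 0.5 stands in for the per-predictor value of the (1/2, 1/2) default pair.
table = build_weight_table(3, [0.75, None, 1.25], default_weight=0.5)
```

Because the position of a weight in the header identifies its reference picture, no separate weight index has to be stored in the table or sent per block.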
Unlike typical systems, each weighting factor that is transmitted with each slice, block or picture corresponds to a particular reference picture index. Previously, any set of weighting factors transmitted with each slice or picture was not associated with any particular reference picture. Instead, an adaptive bi-prediction weighting index was transmitted for each motion block or 8x8 region, to select which of the weighting factors from the transmitted set was to be applied to that particular motion block or 8x8 region.
In the present embodiment, the weighting factor index for each motion block or 8x8 region is not explicitly transmitted. Instead, the weighting factor that is associated with the transmitted reference picture index is used. This dramatically reduces the amount of overhead in the transmitted bitstream needed to allow adaptive weighting of reference pictures.
This system and technique may be applied to either predictive "P" pictures, which are encoded with a single predictor, or to bi-predictive "B" pictures, which are encoded with two predictors. The decoding processes, which are present in both encoder and decoder, are described below for the P and B picture cases. Alternatively, this technique may also be applied to coding systems that use concepts similar to I, B, and P pictures.
The same weighting factors can be used for single-directional prediction in B pictures and for bi-directional prediction in B pictures. When a single predictor is used for a macroblock, in P pictures or for single-directional prediction in B pictures, a single reference picture index is transmitted for the block. After the motion compensation step of the decoding process produces a predictor, the weighting factor is applied to the predictor. The weighted predictor is then added to the coded residual, and the sum is clipped to form the decoded picture. For blocks in P pictures, or for blocks in B pictures that use only list 0 prediction, the weighted predictor is formed as:
Pred=W0*Pred0+D0 (1)
where W0 is the weighting factor associated with the list 0 reference picture, D0 is the offset associated with the list 0 reference picture, and Pred0 is the motion compensated prediction block from the list 0 reference picture.
For blocks in B pictures that use only list 1 prediction, the weighted predictor is formed as:
Pred=W1*Pred1+D1 (2)
where W1 is the weighting factor associated with the list 1 reference picture, D1 is the offset associated with the list 1 reference picture, and Pred1 is the motion compensated prediction block from the list 1 reference picture.
The weighted predictors may be clipped to guarantee that the resulting values fall within the allowable range of pixel values, typically 0 to 255. The precision of the multiplication in the weighting formulas may be limited to any predetermined number of bits of resolution.
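As an illustration of equations (1) and (2) with clipping and limited multiplication precision, the sketch below quantises the weight to a fixed number of fractional bits and uses integer arithmetic only. The choice of 6 fractional bits, the rounding rule, and the function names are assumptions made for this example, not values taken from the text.

```python
def clip(value, lo=0, hi=255):
    """Keep a sample inside the allowable pixel range."""
    return max(lo, min(hi, value))


def weighted_pred_fixed(pred, weight, offset, shift=6):
    """Pred = W*Pred + D with W held to `shift` fractional bits, then clipped."""
    w_int = round(weight * (1 << shift))   # e.g. 1.25 -> 80 when shift is 6
    half = 1 << (shift - 1)                # rounding term for the right shift
    return [clip(((w_int * s + half) >> shift) + offset) for s in pred]


# Hypothetical motion compensated samples from a single reference picture.
out = weighted_pred_fixed([100, 200, 250], weight=1.25, offset=10)
```

The same function serves list 0 and list 1 single-predictor blocks; only the weight and offset passed in differ.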
In the bi-predictive case, a reference picture index is transmitted for each of the two predictors. Motion compensation is performed to form the two predictors. Each predictor uses the weighting factor associated with its reference picture index to form two weighted predictors. The two weighted predictors are then averaged together to form an averaged predictor, which is then added to the coded residual.
For blocks in B pictures that use both list 0 and list 1 predictions, the weighted predictor is formed as:
Pred=(P0*Pred0+D0+P1*Pred1+D1)/2 (3)
Clipping may be applied to the weighted predictor or to any of the intermediate values in its calculation, to guarantee that the resulting values fall within the allowable range of pixel values, typically 0 to 255.
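Equation (3) with a final clip can be sketched as follows. The parameters `w0` and `w1` stand for the factors written P0 and P1 in equation (3); rounding to the nearest integer before clipping is an assumption of this sketch, as are the function names and sample values.

```python
import math


def clip(value, lo=0, hi=255):
    return max(lo, min(hi, value))


def bipred_weighted(pred0, pred1, w0, d0, w1, d1):
    """Pred = (P0*Pred0 + D0 + P1*Pred1 + D1) / 2, rounded and clipped."""
    out = []
    for a, b in zip(pred0, pred1):
        value = (w0 * a + d0 + w1 * b + d1) / 2
        out.append(clip(math.floor(value + 0.5)))  # round half up, then clip
    return out


pred0 = [100, 240]   # hypothetical list 0 prediction samples
pred1 = [120, 250]   # hypothetical list 1 prediction samples
plain = bipred_weighted(pred0, pred1, 1.0, 0, 1.0, 0)
weighted = bipred_weighted(pred0, pred1, 1.5, 4, 1.0, 0)
```

With unit weights and zero offsets the formula reduces to the ordinary average of the two predictors; the second call shows a value being clipped at the top of the pixel range.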
Thus, a weighting factor is applied to the reference picture prediction of a video compression encoder and decoder that use multiple reference pictures. The weighting factor adapts for individual motion blocks within a picture, based on the reference picture index used for that motion block. Because the reference picture index is already transmitted in the compressed video bitstream, the additional overhead of adapting the weighting factor on a motion block basis is dramatically reduced. All motion blocks that are coded with respect to the same reference picture apply the same weighting factor to the reference picture prediction.
These and other features and advantages of the present invention may be readily ascertained by one of ordinary skill in the pertinent art based on the teachings herein. It is to be understood that the teachings of the present invention may be implemented in various forms of hardware, software, firmware, special purpose processors, or combinations thereof.
Most preferably, the invention is implemented as a combination of hardware and software. Moreover, the software is preferably implemented as an application program tangibly embodied on a program storage unit. The application program may be uploaded to, and executed by, a machine comprising any suitable architecture. Preferably, the machine is implemented on a computer platform having hardware such as one or more central processing units ("CPU"), a random access memory ("RAM"), and input/output ("I/O") interfaces. The computer platform may also include an operating system and microinstruction code. The various processes and functions described herein may be either part of the microinstruction code or part of the application program, or any combination thereof, which may be executed by a CPU. In addition, various other peripheral units may be connected to the computer platform, such as an additional data storage unit and a printing unit.
It is to be further understood that, because some of the constituent system components and methods depicted in the accompanying drawings are preferably implemented in software, the actual connections between the system components or the process function blocks may differ depending upon the manner in which the invention is programmed. Given the teachings herein, one of ordinary skill in the pertinent art will be able to contemplate these and similar implementations or configurations of the invention.
Although the illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments, and that various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the scope or spirit of the invention. All such changes and modifications are intended to be included within the scope of the invention as set forth in the appended claims.