CN106060539A

CN106060539A - Video encoding method with low transmission bandwidth

Info

Publication number: CN106060539A
Application number: CN201610428792.1A
Authority: CN
Inventors: 李永旭; 马自强; 肖子玉; 唐大钫; 田言金; 毕鹏飞; 徐圣凯; 王建鹏
Original assignee: Shenzhen Fengjing Network Technology Co Ltd
Current assignee: Shenzhen Fengjing Network Technology Co Ltd
Priority date: 2016-06-16
Filing date: 2016-06-16
Publication date: 2016-10-26
Anticipated expiration: 2036-06-16
Also published as: CN106060539B

Abstract

The invention provides a video encoding method with low transmission bandwidth. The method comprises the steps of S1, obtaining images of an original video; S2, preprocessing the original video, thereby obtaining key frames; S3, carrying out preprocessing of inter-frame prediction encoding on multiple frames of images between two adjacent key frames; S4, encoding the original video by employing a self-adaptive interval frame encoding mode; and S5, outputting generated code streams. According to the method, through adoption of the self-adaptive active interval frame encoding mode, the number of the video images needing to be encoded can be greatly reduced at a video encoding end; through adoption of a time domain interpolation frame compensation mode, the images before encoding can be restored by employing the decoded images as the reference frames at a decoding end; the compression ratio of the video is effectively improved; the code streams of the video are reduced to a great extent; the saved code streams can be used for improving the encoding effect of the key frames; the quality of the reconstructed images can be improved; and the accuracy of the restored images can be improved.

Description

A video encoding method with low transmission bandwidth

Technical field

The present invention relates to a video encoding method, and in particular to a video encoding method with low transmission bandwidth.

Background art

The volume of data in video transmission is huge. If video is transmitted without processing, the demands on transmission bandwidth and storage space are very high. Because actual transmission bandwidth is limited, the original video must be compression-coded so that it occupies as little bandwidth as possible while a certain reconstructed image quality is guaranteed, allowing the video to be transmitted effectively. Video coding should therefore develop toward higher compression ratios while keeping distortion under control.

Summary of the invention

The technical problem to be solved by the present invention is to provide a video encoding method with low transmission bandwidth that effectively improves the compression ratio of video and greatly reduces the video code stream.

To this end, the present invention provides a video encoding method with low transmission bandwidth, comprising the following steps:

Step S1, obtaining the images of the original video;

Step S2, preprocessing the original video to obtain key frames;

Step S3, performing inter-frame prediction encoding preprocessing on the multiple frames of images between two adjacent key frames;

Step S4, encoding the original video using an adaptive interval-frame encoding mode;

Step S5, outputting the generated code stream.

In a further improvement of the present invention, the key frame in step S2 is the first frame of each scene.

In a further improvement of the present invention, the first key frame is taken as the reference frame, and the similarity between this key frame and each subsequent frame to be encoded is calculated in turn and compared with a preset threshold; the first frame whose similarity falls below the preset threshold is determined to be the next key frame.

In a further improvement of the present invention, the similarity is calculated as $S(z_i, z_j) = w_v S_v(z_i, z_j) + w_m S_m(z_i, z_j)$, where $S(z_i, z_j)$ denotes the total similarity between frame i and frame j, $S_v(z_i, z_j)$ and $S_m(z_i, z_j)$ denote the visual-feature similarity and the motion-feature similarity respectively, and $w_v$ and $w_m$ denote the weights of the visual and motion components respectively.

In a further improvement of the present invention, the inter-frame prediction encoding preprocessing in step S3 arranges the multiple frames of images between two adjacent key frames as interleaved bidirectional difference frames and difference frames.

In a further improvement of the present invention, the adaptive interval-frame encoding mode in step S4 performs intra-frame compression coding on key frames, skips bidirectional difference frames, and codes each difference frame as its difference from the previous key frame.

In a further improvement of the present invention, in the adaptive interval-frame encoding mode, when key frames are encoded, all key frames are traversed with the intra-frame prediction coding modes and the prediction coding mode producing the smallest amount of data is selected. When a difference frame is difference-coded, for the current macroblock, the prediction coding mode of the corresponding macroblock in the previous frame is read first, the prediction coding modes related to it are taken as the candidate range, motion estimation and rate-distortion cost calculation are then performed for each mode in the candidate range, and the mode with the minimum rate-distortion cost is selected as the prediction coding mode of the difference frame.

In a further improvement of the present invention, the prediction coding mode is selected as follows. First, the input original pixels are sub-sampled 2:1, edge direction vectors are calculated for the sampled pixels, the edge direction histogram of the macroblock is generated, and candidate prediction modes are obtained from the histogram. Then, it is judged whether the edge direction histogram is unimodal: if it is, the prediction coding mode with the largest amplitude in the histogram and its two adjacent prediction coding modes are selected as the candidate prediction coding modes; if it is not, the DC coding mode is used. Finally, the distortion cost is calculated for each candidate prediction coding mode, and the one with the minimum distortion cost is selected as the final prediction coding mode.

In a further improvement of the present invention, the number of bidirectional difference frames to skip is determined as follows: compare the total distortion of the m reconstructed frames after m frames of video images are compressed by interval-frame compression and by standard H.264 compression; n is initially 0 and is increased successively; when [formula missing] and [formula missing] are satisfied, skipping i frames is optimal, where $B_c(n)$ and $B_s(n)$ denote the distortion under interval-frame compression and under standard H.264 compression respectively.

In a further improvement of the present invention, in step S4, when the input frame of the video sequence is a bidirectional difference frame, the frame difference between the key frames before and after it is calculated and compared with a frame-difference threshold. If the frame difference is less than the threshold, coding of the bidirectional difference frame is omitted; if the frame difference is greater than the threshold, the region of the bidirectional difference frame is compensated and the compensation information is encoded. The frame difference is calculated as $C = \sum_{i,j} |A(i,j) - B(i,j)|^2 / n$, where C denotes the frame difference, $A(i,j)$ and $B(i,j)$ denote the pixels of the preceding and following key frames respectively, and n is the total number of pixels in the image.

Compared with the prior art, the beneficial effects of the present invention are as follows. With the adaptive active interval-frame encoding mode, the number of video images that need to be encoded at the encoding end is greatly reduced, and at the decoding end the unencoded images are restored by temporal interpolation frame compensation, using the decoded images as reference frames. This effectively improves the compression ratio of the video and greatly reduces the video code stream. The bits saved not only reduce the transmission bandwidth but can also be used to improve the encoding of key frames, the quality of the reconstructed images and the accuracy of the restored images.

Brief description of the drawings

Fig. 1 is a schematic workflow diagram of an embodiment of the present invention.

Detailed description of the invention

A preferred embodiment of the present invention is described in further detail below with reference to the accompanying drawing.

As shown in Fig. 1, this example provides a video encoding method with low transmission bandwidth, comprising the following steps:

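Step S1, obtaining the images of the original video;

Step S2, preprocessing the original video to obtain key frames;

Step S3, performing inter-frame prediction encoding preprocessing on the multiple frames of images between two adjacent key frames;

Step S4, encoding the original video using an adaptive interval-frame encoding mode;

Step S5, outputting the generated code stream.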

This example still belongs to the hybrid coding framework of the H.264 compression standard: the original video undergoes intra-frame or inter-frame prediction followed by transform coding, after which the code stream is generated. In this example, the key frame in step S2 is the first frame of each scene.

After the images of the original video are obtained, the original video is first preprocessed to find all key frames, that is, the first frame of each scene. The first key frame is taken as the reference frame, and the similarity between it and each subsequent frame to be encoded is calculated in turn and compared with a preset threshold. If the similarity is higher than the threshold, no scene switch has occurred. Once the similarity falls below the preset threshold, the scene has changed and that frame is determined to be the next key frame. The preset threshold can generally be determined from the proportion of change in the number of video bits, for example about 0.7 to 0.9, and can also be defined and modified according to actual requirements. The first frame of every scene is a key frame; a key frame is also called an I frame and is intra-prediction coded in this example. The preset threshold can be customized according to the video restoration requirements: the smaller the preset threshold, the smaller the distortion of the restored video and the correspondingly larger the generated code stream, and vice versa.
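The key-frame search described above can be sketched as a single pass over the frames. This is a minimal illustration, not the patented implementation: the function name `find_key_frames`, the generic `similarity` callback and the default threshold of 0.8 (a value inside the 0.7 to 0.9 range mentioned in the text) are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def find_key_frames(
    frames: Sequence[Frame],
    similarity: Callable[[Frame, Frame], float],
    threshold: float = 0.8,  # assumed value within the 0.7-0.9 range from the text
) -> List[int]:
    """Return the indices of the key frames (first frame of each scene)."""
    if not frames:
        return []
    key_indices = [0]          # the first frame is the first key frame
    reference = frames[0]      # current reference key frame
    for index in range(1, len(frames)):
        if similarity(reference, frames[index]) < threshold:
            # Similarity fell below the threshold: treat this as a scene change
            # and make this frame the next key frame and the new reference.
            key_indices.append(index)
            reference = frames[index]
    return key_indices


# Toy usage: each "frame" is a single number and similarity is 1 - |a - b|.
toy_frames = [0.0, 0.05, 0.1, 0.9, 0.95, 0.2]
toy_keys = find_key_frames(toy_frames, lambda a, b: 1.0 - abs(a - b), 0.8)
```

In a real encoder the callback would be the weighted visual and motion similarity $S(z_i, z_j)$; the toy scalar frames are only there to make the control flow visible.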

The similarity threshold method segments scenes using the global color features and motion features of the images: within one scene, not only are the visual features similar, but the motion features are also consistent. The similarity is calculated as $S(z_i, z_j) = w_v S_v(z_i, z_j) + w_m S_m(z_i, z_j)$, where $S(z_i, z_j)$ denotes the total similarity between frame i and frame j, and $S_v(z_i, z_j)$ and $S_m(z_i, z_j)$ denote the visual-feature similarity and the motion-feature similarity respectively. $S_v(z_i, z_j)$ is calculated from HSV color histograms, because the HSV color space is close to human color perception, while $S_m(z_i, z_j)$ depends on the number of shots and the search range. $w_v$ and $w_m$ denote the weights of the visual-feature similarity and the motion-feature similarity respectively. Both similarities take values from 0 to 1, but the weight of the visual-feature similarity is smaller than that of the motion-feature similarity: the visual weight is generally 0.2 to 0.4 and the motion weight 0.6 to 0.8.

The weights are calculated as follows: $w_m$ is the ratio of the variance of the motion-feature similarity to the sum of the variances of the motion-feature similarity and the visual-feature similarity; $w_v$ is the ratio of the variance of the visual-feature similarity to the same sum.
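As a rough illustration of the variance-based weighting, the sketch below derives the two weights from sample lists of visual and motion similarities and then combines one pair of similarities into a total score. The function names, the use of the population variance, and the equal-weight fallback when both variances are zero are assumptions of this example; the text does not specify them.

```python
from statistics import pvariance
from typing import Sequence, Tuple


def variance_weights(
    visual_sims: Sequence[float], motion_sims: Sequence[float]
) -> Tuple[float, float]:
    """Return (w_v, w_m), each the share of its own variance in the variance sum."""
    var_v = pvariance(visual_sims)   # variance of visual-feature similarity
    var_m = pvariance(motion_sims)   # variance of motion-feature similarity
    total = var_v + var_m
    if total == 0:
        return 0.5, 0.5              # assumption: equal weights if nothing varies
    return var_v / total, var_m / total


def total_similarity(s_v: float, s_m: float, w_v: float, w_m: float) -> float:
    """S = w_v * S_v + w_m * S_m for one pair of frames."""
    return w_v * s_v + w_m * s_m


# Toy samples whose variances are 0.0025 (visual) and 0.01 (motion).
w_v, w_m = variance_weights([0.9, 0.8, 0.9, 0.8], [0.9, 0.7, 0.9, 0.7])
score = total_similarity(0.5, 1.0, w_v, w_m)
```

With these toy samples the weights come out near 0.2 and 0.8, which happens to sit inside the 0.2 to 0.4 and 0.6 to 0.8 ranges quoted in the description.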

Once all key frames are determined, the multiple frames of images between two key frames belong to the same scene. Within a scene, the first frame is intra-prediction coded and the remaining images of that scene are inter-prediction coded. Existing inter-prediction coding offers as many as several dozen modes; because the prediction modes of adjacent frames are highly correlated, the range of selectable prediction modes can be narrowed to reduce the complexity of the algorithm. Both intra-prediction and inter-prediction coding produce a residual, that is, the difference between the predicted image and the original image. After transformation, quantization and entropy coding, the residual is transmitted to the decoding end together with the prediction information, forming the code stream.

In this example, the prediction coding mode is selected as follows. First, the input original pixels are sub-sampled 2:1, edge direction vectors are calculated for the sampled pixels, the edge direction histogram of the macroblock is generated, and candidate prediction modes are obtained from the histogram. Then, it is judged whether the edge direction histogram is unimodal: if it is, the prediction coding mode with the largest amplitude in the histogram and its two adjacent prediction coding modes are selected as the candidate prediction coding modes; if it is not, the DC coding mode is used. Finally, the distortion cost is calculated for each candidate prediction coding mode, and the one with the minimum distortion cost is selected as the final prediction coding mode.
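The histogram-based mode selection can be sketched as below. Several details are assumptions made only for illustration, since the text does not fix them: simple forward differences stand in for the edge direction vector, directions are quantized into 8 bins numbered 0 to 7, "unimodal" is taken to mean exactly one rise-then-fall around the circular histogram, and the names `edge_direction_histogram`, `is_unimodal`, `candidate_modes` and `choose_mode` are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Union

DC_MODE = "DC"              # label for the non-directional fallback mode
Mode = Union[int, str]


def edge_direction_histogram(block: Sequence[Sequence[float]], bins: int = 8) -> List[float]:
    """Accumulate gradient amplitude per direction bin for one macroblock."""
    sub = [row[::2] for row in block[::2]]       # 2:1 sub-sampling on both axes
    hist = [0.0] * bins
    for y in range(len(sub) - 1):
        for x in range(len(sub[y]) - 1):
            dx = sub[y][x + 1] - sub[y][x]       # horizontal forward difference
            dy = sub[y + 1][x] - sub[y][x]       # vertical forward difference
            amplitude = abs(dx) + abs(dy)
            if amplitude == 0:
                continue
            angle = math.atan2(dy, dx) % math.pi  # fold direction into [0, pi)
            hist[int(angle / math.pi * bins) % bins] += amplitude
    return hist


def is_unimodal(hist: Sequence[float]) -> bool:
    """True if the circular histogram has exactly one rise-then-fall (assumed test)."""
    n = len(hist)
    signs = []
    for k in range(n):
        diff = hist[(k + 1) % n] - hist[k]
        if diff != 0:                            # flat steps are ignored
            signs.append(1 if diff > 0 else -1)
    if not signs:
        return False                             # flat histogram: no dominant direction
    peaks = sum(
        1 for k in range(len(signs))
        if signs[k] == 1 and signs[(k + 1) % len(signs)] == -1
    )
    return peaks == 1


def candidate_modes(hist: Sequence[float]) -> List[Mode]:
    """Peak bin plus its two circular neighbours if unimodal, else the DC mode."""
    if not is_unimodal(hist):
        return [DC_MODE]
    n = len(hist)
    peak = max(range(n), key=lambda k: hist[k])
    return [(peak - 1) % n, peak, (peak + 1) % n]


def choose_mode(candidates: Sequence[Mode], distortion_cost: Callable[[Mode], float]) -> Mode:
    """Pick the candidate with the smallest distortion cost."""
    return min(candidates, key=distortion_cost)


# Toy 8x8 block with a single vertical edge in the middle.
toy_block = [[0, 0, 0, 0, 10, 10, 10, 10] for _ in range(8)]
toy_hist = edge_direction_histogram(toy_block)
toy_candidates = candidate_modes(toy_hist)
```

The bin indices here are not H.264 intra mode numbers; mapping direction bins to actual prediction modes is left out of the sketch.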

In step S3 of this example, the inter-frame prediction encoding preprocessing arranges the multiple frames of images between two adjacent key frames as interleaved bidirectional difference frames and difference frames.

That is, the original video uses the frame structure IB...BPB...BP. The first frame of each new scene is an I frame, i.e. a key frame, followed by interleaved B frames and P frames. A B frame is a bidirectional difference frame: it records the differences between this frame and the frames before and after it. A P frame is a difference frame: it represents the difference between this frame and the preceding key frame or the preceding P frame. The adaptive interval-frame encoding mode in step S4 actively skips the B frames without encoding them and encodes only the I frames and P frames; the decoding end decodes the I frames and P frames accordingly and restores the skipped B frames. The number of B frames between each pair of I and P frames, or P and P frames, is determined adaptively so that the distortion of every B frame stays within the range allowed by the image-quality requirement. The more B frames are skipped, the more pronounced the compression.
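To make the IB...BPB...BP structure concrete, the following sketch labels each frame as I, B or P and lists the frames that would actually be encoded. It assumes a fixed run of `b_run` skipped B frames between anchors, whereas the method described above chooses that number adaptively; it also assumes that a frame with no following anchor in its scene is coded as P rather than skipped. The function name `frame_types` is hypothetical.

```python
from typing import Iterable, List


def frame_types(num_frames: int, key_frames: Iterable[int], b_run: int) -> str:
    """Label frames: 'I' at key frames, runs of up to b_run 'B' frames, then 'P'."""
    keys = set(key_frames)
    labels: List[str] = []
    since_anchor = 0                      # B frames emitted since the last I or P
    for index in range(num_frames):
        next_is_anchor_needed = index + 1 >= num_frames or (index + 1) in keys
        if index in keys:
            labels.append("I")
            since_anchor = 0
        elif since_anchor < b_run and not next_is_anchor_needed:
            labels.append("B")            # skipped at the encoder, restored at the decoder
            since_anchor += 1
        else:
            labels.append("P")            # closes the run so every B has anchors on both sides
            since_anchor = 0
    return "".join(labels)


# Ten frames, scenes starting at frames 0 and 5, two B frames per run.
pattern = frame_types(10, [0, 5], 2)
encoded_indices = [i for i, label in enumerate(pattern) if label != "B"]
```

Here only the frames in `encoded_indices` reach the encoder; the B frames are left for the decoder to restore from their neighbouring I or P frames.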

In this example, the optimal number of bidirectional difference frames to skip is determined as follows: compare the total distortion of the m reconstructed frames after m frames of video images are compressed by interval-frame compression and by standard H.264 compression; n is initially 0 and is increased successively; when [formula missing] and [formula missing] are satisfied, skipping i frames is optimal, where $B_c(n)$ and $B_s(n)$ denote the distortion under interval-frame compression and under standard H.264 compression respectively.

In step S4 of this example, the adaptive interval-frame encoding mode performs intra-frame compression coding on key frames, skips bidirectional difference frames, and codes each difference frame as its difference from the previous key frame.

In the adaptive interval-frame encoding mode, when key frames are encoded, all key frames are traversed with the intra-frame prediction coding modes and the prediction coding mode producing the smallest amount of data is selected as the optimal coding mode. When a difference frame is difference-coded, for the current macroblock, the prediction coding mode of the corresponding macroblock in the previous frame is read first, the prediction coding modes related to it are taken as the candidate range, motion estimation and rate-distortion cost calculation are then performed for each mode in the candidate range, and the mode with the minimum rate-distortion cost is selected as the prediction coding mode of the difference frame.
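The candidate-range idea for difference frames can be sketched as a table lookup followed by a minimum search. The relation table `RELATED_MODES` below is purely hypothetical (the text does not say which modes count as related); it uses H.264 macroblock partition sizes as mode names, and the rate-distortion cost is passed in as a callback rather than computed by motion estimation.

```python
from typing import Callable, Dict, List

# Hypothetical relation table: for each mode of the co-located macroblock in the
# previous frame, the modes considered "related" and therefore worth testing.
RELATED_MODES: Dict[str, List[str]] = {
    "16x16": ["16x16", "16x8", "8x16"],
    "16x8": ["16x8", "16x16", "8x8"],
    "8x16": ["8x16", "16x16", "8x8"],
    "8x8": ["8x8", "16x8", "8x16"],
}


def select_p_mode(previous_mode: str, rd_cost: Callable[[str], float]) -> str:
    """Evaluate only the modes related to previous_mode and return the cheapest."""
    # Assumption: an unknown previous mode falls back to testing every mode.
    candidates = RELATED_MODES.get(previous_mode, list(RELATED_MODES))
    return min(candidates, key=rd_cost)


# Toy rate-distortion costs for one macroblock.
toy_costs = {"16x16": 10.0, "16x8": 7.0, "8x16": 9.0, "8x8": 3.0}
chosen = select_p_mode("16x16", toy_costs.__getitem__)
```

In the toy data the globally cheapest mode (8x8) is never tested because it is outside the candidate range, which is the trade-off the narrowing makes: fewer cost evaluations in exchange for a possibly sub-optimal mode.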

The input video image is partitioned into regions in units of macroblocks, which are called coding units, prediction units and transform units according to their different purposes. This example places higher requirements on the coding of key frames, so when intra-frame prediction is performed on a key frame the prediction units can be divided more finely to improve prediction precision.

As shown in Fig. 1, in step S4, when the input frame of the video sequence is a bidirectional difference frame, the frame difference between the key frames before and after it is calculated and compared with a frame-difference threshold. The frame-difference threshold can generally be determined from the proportion of change in the number of video bits, for example about 0.4, and can also be defined and modified according to actual requirements. If the frame difference is less than the threshold, coding of the bidirectional difference frame is omitted; if the frame difference is greater than the threshold, the region of the bidirectional difference frame is compensated and the compensation information is encoded. The frame difference is calculated as $C = \sum_{i,j} |A(i,j) - B(i,j)|^2 / n$, where C denotes the frame difference, $A(i,j)$ and $B(i,j)$ denote the pixels of the preceding and following key frames respectively, and n is the total number of pixels in the image.
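The frame-difference test is a mean squared difference followed by a threshold comparison, which can be sketched as follows. The function names are hypothetical, frames are plain nested lists, and pixel values are assumed to be normalized to the range 0 to 1 so that a threshold of about 0.4 is meaningful; the text does not state the pixel scale. A frame difference exactly equal to the threshold is not covered by the text and is treated here as requiring compensation.

```python
from typing import Sequence

Image = Sequence[Sequence[float]]


def frame_difference(a: Image, b: Image) -> float:
    """C = sum over (i, j) of |A(i, j) - B(i, j)|^2, divided by the pixel count n."""
    n = sum(len(row) for row in a)
    total = sum(
        (pa - pb) ** 2
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )
    return total / n


def b_frame_action(prev_key: Image, next_key: Image, threshold: float = 0.4) -> str:
    """Return 'skip' if the B frame needs no coding, else 'compensate'."""
    if frame_difference(prev_key, next_key) < threshold:
        return "skip"            # decoder can restore the frame by interpolation alone
    return "compensate"          # encode compensation information for the frame


# Toy 2x2 frames with normalized pixel values.
frame_a = [[0, 0], [0, 0]]
frame_b = [[1, 1], [0, 0]]       # two of four pixels differ by 1 -> C = 0.5
frame_c = [[1, 0], [0, 0]]       # one of four pixels differs by 1 -> C = 0.25
```

With the toy frames, `b_frame_action(frame_a, frame_b)` asks for compensation while `b_frame_action(frame_a, frame_c)` skips the frame.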

The mismatch between the nonlinearity of object motion and the linearity of the restoration estimate causes serious distortion in local regions, but these regions are small; improving them alone greatly improves the video image quality. For such images, therefore, only these local regions need to be encoded, and coding of the rest of the image can still be omitted. Local coding requires identifying the image regions with high distortion. The reference frames in the coded picture buffer can be used to restore the intermediate frame, which is then compared with the reconstructed frame image. If the distortion cost of the frame is small, no local compensation is needed and the frame can be restored at the decoding end. If the distortion cost is relatively large, the blocks of the image whose distortion cost exceeds the threshold are encoded. Every B frame is subjected to this judgment.

Some sequences in a video change considerably, while others move relatively slowly. When the inter-frame change of the video sequence is large, motion estimation is not accurate enough and the quality of the restored image drops markedly; when the inter-frame change is slow, the drop in image quality is not noticeable and visual requirements can still be met. Because this limitation affects the quality of image reconstruction, images that are seriously distorted after restoration need local processing. The encoding end marks the information for such images and transmits it to the decoding end.

An uncompensated image frame is restored by taking a weighted sum of the pixel values at the corresponding positions of the adjacent preceding and following I or P frames to obtain the pixels of the restored frame. An image frame with compensation is restored by finding the motion vectors of the B frame from the adjacent I or P frames, determining a compensation range from the local coding information, and compensating the local image within that range using vector adjustment and bidirectional motion estimation, finally yielding a restored frame that meets the requirements.
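For the uncompensated case, the weighted-sum restoration can be sketched as a per-pixel blend of the two neighbouring anchor frames. The weighting rule is an assumption of this example: the text only says "weighted sum", and here the weights are taken to be linear in temporal distance, so a skipped frame closer to the preceding anchor is dominated by that anchor. The function name `interpolate_frame` and its parameters are hypothetical.

```python
from typing import List, Sequence

Image = Sequence[Sequence[float]]


def interpolate_frame(prev_anchor: Image, next_anchor: Image, position: int, gap: int) -> List[List[float]]:
    """Restore a skipped frame lying `position` frames after prev_anchor,
    where the two anchors are `gap` frames apart (0 < position < gap)."""
    if not 0 < position < gap:
        raise ValueError("position must lie strictly between the two anchors")
    w_next = position / gap        # assumed linear temporal weighting
    w_prev = 1.0 - w_next
    return [
        [w_prev * p + w_next * q for p, q in zip(row_prev, row_next)]
        for row_prev, row_next in zip(prev_anchor, next_anchor)
    ]


# Toy 1x2 anchors four frames apart; restore the first skipped frame.
restored = interpolate_frame([[0, 100]], [[100, 100]], position=1, gap=4)
```

The compensated case (vector adjustment and bidirectional motion estimation within a signalled range) is not covered by this sketch.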

In this example, B frames are calculated from their adjacent I or P frames, so the I and P frames affect how well the B frames are restored. Because P frames are in turn obtained by inter-frame prediction, the accuracy of inter-frame prediction must be improved; smaller macroblocks of uniform size can be used to improve prediction precision and compression ratio. After adaptive interval-frame encoding, the video images, together with the header information required for coding, form the code stream, which is packetized, transmitted and stored according to the RTSP protocol via the network adaptation layer, meeting the low transmission bandwidth requirement.

This example uses the adaptive active interval-frame encoding mode, which greatly reduces the number of video images that need to be encoded at the encoding end; at the decoding end, the unencoded images are restored by temporal interpolation frame compensation, using the decoded images as reference frames. This effectively improves the compression ratio of the video and greatly reduces the video code stream. The bits saved not only reduce the transmission bandwidth but can also be used to improve the encoding of key frames, the quality of the reconstructed images and the accuracy of the restored images.

The above is a further detailed description of the present invention in conjunction with specific preferred embodiments, and the specific implementation of the present invention shall not be regarded as limited to these descriptions. For a person of ordinary skill in the technical field of the invention, several simple deductions or substitutions may be made without departing from the concept of the invention, and all of them shall be regarded as falling within the protection scope of the present invention.

Claims

1. A video encoding method with low transmission bandwidth, characterized by comprising the following steps:

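Step S1, obtaining the images of the original video;

Step S2, preprocessing the original video to obtain key frames;

Step S3, performing inter-frame prediction encoding preprocessing on the multiple frames of images between two adjacent key frames;

Step S4, encoding the original video using an adaptive interval-frame encoding mode;

Step S5, outputting the generated code stream.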

2. The video encoding method with low transmission bandwidth according to claim 1, characterized in that the key frame in step S2 is the first frame of each scene.

3. The video encoding method with low transmission bandwidth according to claim 2, characterized in that the first key frame is taken as the reference frame, and the similarity between this key frame and each subsequent frame to be encoded is calculated in turn and compared with a preset threshold; the first frame whose similarity falls below the preset threshold is determined to be the next key frame.

4. The video encoding method with low transmission bandwidth according to claim 3, characterized in that the similarity is calculated as $S(z_i, z_j) = w_v S_v(z_i, z_j) + w_m S_m(z_i, z_j)$, where $S(z_i, z_j)$ denotes the total similarity between frame i and frame j, $S_v(z_i, z_j)$ and $S_m(z_i, z_j)$ denote the visual-feature similarity and the motion-feature similarity respectively, and $w_v$ and $w_m$ denote the weights of the visual and motion components respectively.

5. The video encoding method with low transmission bandwidth according to any one of claims 1 to 4, characterized in that the inter-frame prediction encoding preprocessing in step S3 arranges the multiple frames of images between two adjacent key frames as interleaved bidirectional difference frames and difference frames.

6. The video encoding method with low transmission bandwidth according to claim 5, characterized in that the adaptive interval-frame encoding mode in step S4 performs intra-frame compression coding on key frames, skips bidirectional difference frames, and codes each difference frame as its difference from the previous key frame.

7. The video encoding method with low transmission bandwidth according to claim 6, characterized in that, in the adaptive interval-frame encoding mode, when key frames are encoded, all key frames are traversed with the intra-frame prediction coding modes and the prediction coding mode producing the smallest amount of data is selected; when a difference frame is difference-coded, for the current macroblock, the prediction coding mode of the corresponding macroblock in the previous frame is read first, the prediction coding modes related to it are taken as the candidate range, motion estimation and rate-distortion cost calculation are then performed for each mode in the candidate range, and the mode with the minimum rate-distortion cost is selected as the prediction coding mode of the difference frame.

8. The video encoding method with low transmission bandwidth according to claim 7, characterized in that the prediction coding mode is selected as follows: first, the input original pixels are sub-sampled 2:1, edge direction vectors are calculated for the sampled pixels, the edge direction histogram of the macroblock is generated, and candidate prediction modes are obtained from the histogram; then, it is judged whether the edge direction histogram is unimodal, and if it is, the prediction coding mode with the largest amplitude in the histogram and its two adjacent prediction coding modes are selected as the candidate prediction coding modes, while if it is not, the DC coding mode is used; finally, the distortion cost is calculated for each candidate prediction coding mode, and the one with the minimum distortion cost is selected as the final prediction coding mode.

9. The video encoding method with low transmission bandwidth according to claim 6, characterized in that the number of bidirectional difference frames to skip is determined as follows: compare the total distortion of the m reconstructed frames after m frames of video images are compressed by interval-frame compression and by standard H.264 compression; n is initially 0 and is increased successively; when [formula missing] and [formula missing] are satisfied, skipping i frames is optimal, where $B_c(n)$ and $B_s(n)$ denote the distortion under interval-frame compression and under standard H.264 compression respectively.

10. The video encoding method with low transmission bandwidth according to any one of claims 1 to 4, characterized in that, in step S4, when the input frame of the video sequence is a bidirectional difference frame, the frame difference between the key frames before and after it is calculated and compared with a frame-difference threshold; if the frame difference is less than the threshold, coding of the bidirectional difference frame is omitted; if the frame difference is greater than the threshold, the region of the bidirectional difference frame is compensated and the compensation information is encoded; the frame difference is calculated as $C = \sum_{i,j} |A(i,j) - B(i,j)|^2 / n$, where C denotes the frame difference, $A(i,j)$ and $B(i,j)$ denote the pixels of the preceding and following key frames respectively, and n is the total number of pixels in the image.