CN100415002C - Coding and compression method of multi-mode and multi-viewpoint video signal - Google Patents

Info

Publication number
CN100415002C
Authority
CN
China
Prior art keywords
correlation
mode
predictive coding
coding
viewpoint
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2006100528959A
Other languages
Chinese (zh)
Other versions
CN1913640A (en
Inventor
蒋刚毅
郁梅
张云
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Spparks Technology Co ltd
Original Assignee
Ningbo University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ningbo University
Priority to CNB2006100528959A
Publication of CN1913640A
Application granted
Publication of CN100415002C
Expired - Fee Related
Anticipated expiration

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a multi-mode multi-view video coding method. It analyzes the temporal correlation and inter-view correlation of the multi-view video signal and takes into account the system's requirements on overall coding performance, such as compression efficiency, coding complexity, random access performance, and coding delay. On that basis it adaptively selects, from a set of candidate predictive coding modes, the mode that suits both the characteristics of the signal currently being coded and the overall performance requirements, and encodes the multi-view video signal with it. This replaces the single-mode, computationally complex multi-reference-frame multi-view predictive coding method based on joint temporal and spatial prediction. It reduces the computational complexity of multi-view video coding while maintaining compression efficiency and improving random access performance.

Description

Coding and compression method of multi-mode and multi-viewpoint video signal

Technical field

The invention relates to methods for coding and compressing multi-view video signals, and in particular to a multi-mode coding and compression method based on analysis of the temporal correlation and inter-view correlation of the multi-view video signal.

Background

3DAV (three-dimensional audio and video) is the direction in which the next generation of audio and video technology is developing. As a core technology of 3DAV applications such as FTV (free-viewpoint television) and 3DTV (three-dimensional television), multi-view video coding aims to solve the compression, interaction, storage, and transmission problems of interactive 3D video. A multi-view video signal is a set of video signals obtained by a camera array shooting a real scene. It provides video images of the scene from different angles, and the information from one or more viewpoints can be used to synthesize an arbitrary viewpoint, so that the viewer can switch viewpoints freely. Multi-view video is a new type of video offering depth perception and interactivity, with broad application prospects in interactive multimedia over broadband and high-density storage media, such as digital entertainment, remote monitoring, and distance education. Fig. 1 is a schematic diagram of a typical multi-view video system. Such a system performs imaging, coding and compression, transmission, reception, decoding, and display of the multi-view video signal, and coding and compression is its core part.

Multi-view video signals raise several problems: the data volume is huge, which hinders network transmission and storage; system resource consumption is high (computational complexity, storage capacity, power consumption, and so on); and the client needs random access, including fast forward, fast rewind, viewpoint switching, freezing the viewing moment, and viewpoint sweeping. How to improve compression efficiency, reduce system resource consumption, and give the system flexible random access and partial decoding and rendering has therefore become the goal of current international research on multi-view video coding methods and standards, and a research hotspot.

The basic approach to compressing multi-view video is to exploit its temporal correlation and inter-view correlation through motion-compensated prediction and disparity-compensated prediction. Both correlations vary with the camera density of the imaging system, illumination changes, and camera and object motion. When the cameras are densely spaced and the imaging intensity is consistent across viewpoints, inter-view correlation is strong; when the cameras are sparse and the imaging intensity differs across viewpoints, temporal correlation is relatively strong and inter-view correlation is weak. Camera and object motion also affect the correlations. Consequently, a multi-view coding framework with a single prediction structure faces a dilemma when coding signals with different correlation characteristics. Either it adopts a very complex multi-reference-frame prediction mode to guarantee high compression efficiency, at the cost of a multiplied increase in the encoder's computational and space complexity, degraded random access, and longer coding delay; or it adopts a relatively simple prediction structure, in which case the encoder cannot fully exploit the temporal and inter-view correlations, which limits compression efficiency.

Because of differences in camera density, illumination, and camera and object motion, multi-view video signals exhibit different statistical characteristics of content correlation over time and across viewpoints. These complex correlation characteristics mean that existing single-structure multi-view coding schemes adapt poorly to signals whose content correlation is complex and changeable, and struggle to deliver good overall performance (compression efficiency, random access, system resource consumption, partial decoding and rendering, coding delay, and so on). This is an important problem common to existing multi-view video coding methods.

Summary of the invention

The technical problem to be solved by the invention is to provide a coding and compression method for multi-view video signals that improves the overall performance of multi-view video coding while reducing coding complexity.

The technical solution adopted is a multi-mode coding and compression method for multi-view video signals. The encoder is organized into four functional modules: a multi-view video predictive coding module, a correlation statistical analysis module, a prediction mode selection module, and a mode update trigger module. At the start of coding, an initial predictive coding mode may be determined for the input signal from known information, such as camera array parameters, coding complexity requirements, and random access requirements, and the multi-view video predictive coding module encodes with it. Coding then proceeds in the following steps: ① The prediction mode selection module dynamically selects, from the candidate predictive coding modes, the mode suited to the signal currently being coded, based on the correlation characteristics produced by the correlation statistical analysis module and on the requirements for compression efficiency, coding complexity, random access performance, and coding delay. ② The multi-view video predictive coding module encodes the input signal with the selected mode and outputs the compressed bitstream. ③ While the trigger condition of the mode update trigger module is not met, the current predictive coding mode is kept; when it is met, the correlation statistical analysis module is restarted so that the predictive coding mode can be reselected.
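The three steps above can be sketched as a per-GOP control loop. This is only an illustrative skeleton, not the patent's implementation: the function name `encode_sequence` and the four callbacks (`encode_gop`, `analyse`, `select_mode`, `should_update`) are hypothetical stand-ins for the four modules, and processing one GOP per iteration is an assumption.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

Gop = TypeVar("Gop")
Mode = TypeVar("Mode")
Stats = TypeVar("Stats")


def encode_sequence(
    gops: Iterable[Gop],
    initial_mode: Mode,
    encode_gop: Callable[[Gop, Mode], str],   # stands in for the predictive coding module
    analyse: Callable[[Gop], Stats],          # stands in for the correlation analysis module
    select_mode: Callable[[Stats], Mode],     # stands in for the mode selection module
    should_update: Callable[[Gop], bool],     # stands in for the mode update trigger module
) -> List[Tuple[Mode, str]]:
    """Encode GOP by GOP, re-selecting the mode only when the trigger fires."""
    mode = initial_mode  # chosen from known information before coding starts
    output = []
    for gop in gops:
        if should_update(gop):            # step 3: trigger condition met
            mode = select_mode(analyse(gop))  # step 1: analyse, then select
        output.append((mode, encode_gop(gop, mode)))  # step 2: encode and emit
    return output
```

Keeping the modules as injected callbacks mirrors the modular split described in the text and lets each piece be replaced independently.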

The candidate predictive coding modes fall into three classes. Class 1 modes suit signals dominated by temporal correlation and rely mainly on motion-compensated prediction. Class 2 modes suit signals dominated by inter-view correlation and rely mainly on disparity-compensated prediction. Class 3 modes suit signals whose temporal and inter-view correlations are balanced and use joint prediction across the temporal and spatial domains. Each class may contain several predictive coding modes, suited to signals with different correlation characteristics and to different overall performance requirements (coding complexity, compression efficiency, random access performance, coding delay, and so on).

The correlation statistical analysis module statistically analyzes the temporal correlation and inter-view correlation of a group of pictures (GOP) that has been coded or is being coded, and defines a correlation coefficient α that characterizes the relative strength of the temporal correlation and the inter-view correlation of the video signal.

In the correlation statistical analysis module, one option is to count, within the GOP that has been coded or is being coded, the number of intra-coded blocks n_i^D in frames coded using only disparity-compensated prediction and the number of intra-coded blocks n_i^P in frames coded using only motion-compensated prediction. The ratio between n_i^D and n_i^P then describes the relative strength of the temporal and inter-view correlations of the current signal.

Alternatively, for the GOP that has been coded or is being coded, the module may use the ratio between the prediction error of frames coded using only disparity-compensated prediction and the prediction error of frames coded using only motion-compensated prediction to analyze the relative strength of the temporal and inter-view correlations.

The prediction mode selection module selects the prediction mode as follows, based on the correlation characteristics produced by the correlation statistical analysis module and on the requirements for compression efficiency, coding complexity, random access performance, coding delay, and other aspects of overall performance:

(1) When the temporal correlation is clearly stronger than the inter-view correlation, further determine whether the correlation is distributed fairly evenly within the temporal domain or whether the correlation with the nearest time instant is clearly stronger than that with the next-nearest instant, and select a predictive coding mode based mainly on motion-compensated prediction;

(2) When the temporal correlation is clearly weaker than the inter-view correlation, select a predictive coding mode based mainly on disparity-compensated prediction;

(3) When the temporal correlation and the inter-view correlation are roughly equal, select a joint predictive coding mode that covers both the temporal and spatial domains.

The mode update trigger module may use a content-based update scheme: according to the change in the correlation coefficient α obtained by the correlation statistical analysis module, it decides whether to re-enable the prediction mode selection module to update the predictive coding mode.

The mode update trigger module may instead use timed triggering: it periodically starts the correlation statistical analysis module to analyze the temporal and inter-view correlations of the signal, and enables the prediction mode selection module to determine the predictive coding mode.

In a multi-mode multi-view video encoder, the prediction structures of all candidate modes can be given a common part. In every candidate mode, the frames at the same time instant as the intra-coded frame and the frames at the same viewpoint as the intra-coded frame are coded before the other frames of the GOP, and these frames are predicted in the same way in all candidate modes. The correlation statistics of the signal currently being coded can therefore be gathered while these first frames are coded. Once they are coded, the encoder can decide promptly which prediction structure to use for the remaining frames of the current GOP, that is, select from all candidates the one mode that suits the current signal and the overall performance requirements.

The invention addresses the fact that the temporal and inter-view content correlations of a multi-view video signal vary with camera density, illumination, and camera and object motion. It proposes a multi-mode multi-view video coding framework based on analysis of these correlations and on the overall performance requirements. Different candidate predictive coding modes are designed for different camera densities, illumination conditions, and camera and object motion. Through a simple statistical analysis of the temporal and inter-view correlations, together with the performance requirements (coding complexity, compression efficiency, random access performance, coding delay, and so on), the encoder dynamically selects from the candidates the mode suited to the current signal, thereby improving the overall performance of multi-view video coding.

Compared with the prior art, the invention has the advantage that, by analyzing the temporal and inter-view correlations of the multi-view video signal, it dynamically selects a predictive coding mode suited to the signal currently being coded and to the overall performance requirements. This replaces the existing single-mode, computationally complex multi-reference-frame predictive coding method based on joint temporal and spatial prediction, and so effectively reduces the computational complexity of multi-view video coding and improves the random access performance of the multi-view video system while maintaining compression performance.

Brief description of the drawings

Fig. 1 is a schematic diagram of a multi-view video system;

Fig. 2 is a schematic diagram of the structure and coding process of the multi-mode multi-view video encoder of the invention;

Fig. 3a shows the class 1 candidate predictive coding mode of the embodiment;

Fig. 3b shows the class 2 candidate predictive coding mode of the embodiment;

Fig. 3c shows the class 3 candidate predictive coding mode of the embodiment;

Fig. 4 shows the sequential predictive coding mode PSVP, which uses P frames;

Fig. 5 shows the sequential predictive coding mode BSVP, which uses B frames;

Fig. 6 shows the Mpicture multi-view video predictive coding mode;

Fig. 7 shows the Joint multi-view video test sequence;

Fig. 8 shows the average rate-distortion curves for the Xmas portion of the Joint multi-view video test sequence;

Fig. 9 shows the average rate-distortion curves for the exit portion of the Joint multi-view video test sequence;

Fig. 10 shows the average rate-distortion curves for the ballroom portion of the Joint multi-view video test sequence;

Fig. 11 shows the average rate-distortion curves for the Joint multi-view video test sequence.

Detailed description

The invention is described in further detail below with reference to the drawings and an embodiment.

Taking a representative 5×7 GOP structure as an example (as shown in Fig. 3a, Fig. 3b, and Fig. 3c, each GOP has 5 viewpoints and 7 time instants, 35 frames in total), the four functional modules of the multi-mode multi-view video encoder and the way they cooperate are described in detail below.

1) Multi-view video predictive coding module

This module performs the coding and compression of the multi-view video signal: it encodes the current signal with the candidate predictive coding mode dynamically chosen by the prediction mode selection module.

According to the temporal and inter-view correlations of the multi-view video signal, the candidate predictive coding modes fall into three classes. Class 1 modes suit signals dominated by temporal correlation; class 2 modes suit signals dominated by inter-view correlation; class 3 modes suit signals whose temporal and inter-view correlations are balanced. Each class may contain several predictive coding modes, to accommodate signals with different correlation characteristics and different overall performance requirements (coding complexity, compression efficiency, random access performance, coding delay, and so on).

Fig. 3a, Fig. 3b, and Fig. 3c show the three classes of predictive coding mode used. In the figures, I denotes an intra-coded frame, D a disparity-compensated predictive frame, P a motion-compensated predictive frame, P′ a temporal-spatial bidirectional predictive frame that may reference D and P frames, and B′ a joint temporal-spatial predictive frame that may reference D, P, and P′ frames. The mode of Fig. 3a relies mainly on motion-compensated prediction, suits signals dominated by temporal correlation, and belongs to class 1. The mode of Fig. 3b relies mainly on disparity-compensated prediction, suits signals dominated by inter-view correlation, and belongs to class 2. The mode of Fig. 3c uses joint temporal-spatial prediction, suits signals whose temporal and inter-view correlations are balanced, and belongs to class 3. In this embodiment each class has only one candidate mode; in practice several different predictive coding modes can be designed as needed.

2) Correlation statistical analysis module

A correlation coefficient α is defined to characterize the relative strength of the temporal and inter-view correlations of the video signal. It is obtained by statistical analysis of the temporal and inter-view correlations of a GOP that has been coded or is being coded.

In multi-mode multi-view video coding, pictures at the same time instant as the I frame but at different viewpoints, such as the D frames to the left and right of the I frame in Fig. 3a, Fig. 3b, and Fig. 3c, are coded using disparity-compensated prediction only; the number of I blocks (intra-coded blocks) in D frame i is denoted n_i^D. Pictures at the same viewpoint as the I frame but at different time instants, such as the P frames above and below the I frame in Fig. 3a, Fig. 3b, and Fig. 3c (in time, the frames before and after the I frame), are coded using motion-compensated prediction only; the number of I blocks in P frame i is denoted n_i^P. The correlation coefficient α can be defined as

$$\alpha = \frac{\dfrac{1}{n}\sum_{i=0}^{n} n_i^{D}}{\dfrac{1}{m}\sum_{i=0}^{m} n_i^{P}}$$

Here n and m are the numbers of D frames and P frames used to compute the correlation coefficient. The coefficient α characterizes the relative strength of the temporal correlation and the inter-view correlation of the video signal. Because the I-block counts needed for α can be collected while coding, the extra computation is very low, so α offers an efficient way to carry out the correlation analysis for multi-mode multi-view video coding. This embodiment computes α from the ratio of the I-block counts in D frames and P frames. The prediction mode selection module then uses a threshold method to choose one of the three candidate modes shown in Fig. 3a, Fig. 3b, and Fig. 3c and passes it to the multi-view video predictive coding module for coding.
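As a minimal sketch, α can be computed from per-frame I-block counts as follows. The function name and the handling of a zero denominator are assumptions made for illustration; the source does not say what to do when the P frames contain no I blocks. The sketch reads the formula as the ratio of the two per-frame averages.

```python
from typing import Sequence


def correlation_coefficient(
    intra_blocks_d: Sequence[int],  # n_i^D: I-block count of each D frame
    intra_blocks_p: Sequence[int],  # n_i^P: I-block count of each P frame
) -> float:
    """Ratio of the mean I-block count in D frames to that in P frames."""
    if not intra_blocks_d or not intra_blocks_p:
        raise ValueError("need at least one D frame and one P frame")
    mean_d = sum(intra_blocks_d) / len(intra_blocks_d)
    mean_p = sum(intra_blocks_p) / len(intra_blocks_p)
    if mean_p == 0:
        # Assumption: treat "no I blocks anywhere" as balanced, otherwise unbounded.
        return 1.0 if mean_d == 0 else float("inf")
    return mean_d / mean_p


# Example with made-up counts: 4 D frames averaging 50 I blocks,
# 6 P frames averaging 10 I blocks.
alpha = correlation_coefficient([40, 60, 50, 50], [10, 10, 10, 10, 10, 10])
```

With these illustrative counts `alpha` is 5.0, meaning the D frames needed five times as many intra-coded blocks per frame as the P frames.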

Besides the scheme above, the relative strength of the temporal and inter-view correlations of the current signal can also be analyzed from prediction errors: the ratio between the prediction errors (for example, SAD values) of the frames in the coded or currently coding GOP that use only disparity-compensated prediction (D frames) and the prediction errors of the frames that use only motion-compensated prediction (P frames).
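The prediction-error variant can be sketched in the same way. SAD (sum of absolute differences) is the standard block-matching error measure named in the text; the helper names, the use of one accumulated SAD per frame, and the choice of D over P as the direction of the ratio (mirroring α) are assumptions.

```python
from typing import Sequence

Block = Sequence[Sequence[int]]


def sad(block: Block, prediction: Block) -> int:
    """Sum of absolute differences between a block and its prediction."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block, prediction)
        for a, b in zip(row_a, row_b)
    )


def prediction_error_ratio(
    d_frame_sads: Sequence[float],  # accumulated SAD of each D frame
    p_frame_sads: Sequence[float],  # accumulated SAD of each P frame
) -> float:
    """Mean D-frame prediction error divided by mean P-frame prediction error."""
    mean_d = sum(d_frame_sads) / len(d_frame_sads)
    mean_p = sum(p_frame_sads) / len(p_frame_sads)
    if mean_p == 0:
        raise ValueError("P-frame prediction error is zero; ratio undefined")
    return mean_d / mean_p
```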

3) Prediction mode selection module

Based on the correlation analysis results from the correlation statistical analysis module and on the requirements for compression efficiency, coding complexity, random access performance, coding delay, and other aspects of overall performance, this module selects from the candidate predictive coding modes one that suits the characteristics of the current signal and the performance requirements. The mode is selected as follows:

(1) When the temporal correlation is clearly stronger than the inter-view correlation, the module may further determine whether the correlation is distributed fairly evenly within the temporal domain or whether the correlation with the nearest time instant is clearly stronger than that with the next-nearest instant, in order to pick a suitable class 1 predictive coding mode.

(2) When the temporal correlation is clearly weaker than the inter-view correlation, a class 2 predictive coding mode based mainly on disparity-compensated prediction is selected, so that the multi-view video predictive coding module can encode with it.

(3) When the temporal correlation and the inter-view correlation are roughly equal, a class 3 joint predictive coding mode covering both the temporal and spatial domains is selected.
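The three rules can be sketched as a two-threshold decision on α. The threshold values 0.5 and 2.0 are hypothetical, since the source mentions a threshold method but gives no numbers. The sketch also assumes that a large α (many I blocks in D frames relative to P frames) means disparity prediction is failing, so temporal correlation dominates. The further within-time check of rule (1) is omitted.

```python
from enum import Enum


class PredictionMode(Enum):
    TEMPORAL = "class 1: mainly motion-compensated prediction"
    INTER_VIEW = "class 2: mainly disparity-compensated prediction"
    JOINT = "class 3: joint temporal-spatial prediction"


def select_mode(alpha: float, low: float = 0.5, high: float = 2.0) -> PredictionMode:
    """Pick a mode class from alpha; `low` and `high` are illustrative thresholds."""
    if alpha > high:   # rule (1): temporal correlation clearly stronger
        return PredictionMode.TEMPORAL
    if alpha < low:    # rule (2): inter-view correlation clearly stronger
        return PredictionMode.INTER_VIEW
    return PredictionMode.JOINT  # rule (3): roughly balanced
```

In a fuller encoder the thresholds would presumably also reflect the complexity, delay, and random access requirements, which this sketch ignores.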

In this embodiment, the image frames lying on the central cross in Figs. 3a, 3b and 3c are encoded before the other frames of the group of pictures in all three candidate predictive coding modes, and these central-cross frames are predicted in the same way in all three modes. The values $n_I^D$ and $n_I^P$ required by the correlation statistical analysis module can therefore be collected while these frames are being encoded, which yields the correlation analysis result for the multi-view video signal currently being encoded. Once the central-cross frames have been encoded, the prediction mode for the remaining frames of the group of pictures can be decided promptly: one predictive coding mode suited to the characteristics of the current signal is selected from the three candidates shown in Figs. 3a, 3b and 3c and passed to the multi-view video predictive coding module for encoding.
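Collecting the two intra-block counts while the central-cross frames are encoded might look like the sketch below. The class name `IntraBlockCounter`, the frame-type labels `"D"` and `"P"`, and the add-one smoothing in `ratio` are assumptions made for illustration. The exact way the counts are combined into a correlation measure is not given in this passage, so the ratio shown is only one plausible choice.

```python
from dataclasses import dataclass


@dataclass
class IntraBlockCounter:
    """Accumulates intra-coded block counts for D frames and P frames."""
    n_intra_d: int = 0  # intra blocks in frames using only disparity compensation
    n_intra_p: int = 0  # intra blocks in frames using only motion compensation

    def add_frame(self, frame_type, intra_blocks):
        """Record the number of intra-coded blocks found in one encoded frame."""
        if frame_type == "D":
            self.n_intra_d += intra_blocks
        elif frame_type == "P":
            self.n_intra_p += intra_blocks
        else:
            raise ValueError(f"unsupported frame type: {frame_type!r}")

    def ratio(self):
        """Hypothetical ratio of D-frame to P-frame intra counts (add-one smoothed)."""
        return (self.n_intra_d + 1) / (self.n_intra_p + 1)
```

The intuition assumed here is that a frame falls back to intra coding where its reference predicts poorly, so many intra blocks in D frames hint at weak inter-view correlation, and many in P frames hint at weak temporal correlation.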

4) Mode update trigger module

A content-based mode update scheme can be used: the change in the correlation coefficient α obtained by the correlation statistical analysis module is compared against a threshold to decide whether the prediction mode selection module should be re-invoked to update the predictive coding mode. Alternatively, a timed update trigger can be used: the correlation statistical analysis module is invoked periodically to analyze the temporal and inter-view correlations of the multi-view video signal, and the prediction mode selection module is then invoked to determine the predictive coding mode to use. This embodiment uses the content-based mode update scheme.
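A content-based trigger of the kind described above can be sketched as follows. The class name `ModeUpdateTrigger`, the default threshold, and the choice to compare against the value of α recorded at the last mode selection are assumptions. The text only states that a threshold is applied to the change in α.

```python
class ModeUpdateTrigger:
    """Fires when alpha drifts from its value at the last mode selection."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.reference_alpha = None  # alpha recorded at the last mode selection

    def should_update(self, alpha):
        """Return True if the prediction mode selection should be re-run."""
        if self.reference_alpha is None:
            # First measurement: always select an initial mode.
            self.reference_alpha = alpha
            return True
        if abs(alpha - self.reference_alpha) > self.threshold:
            self.reference_alpha = alpha
            return True
        return False
```

With a threshold of 0.5, the sequence of measurements 1.0, 1.2, 1.6 would trigger a selection on the first and third values only. A timed trigger would replace the threshold test with a counter of encoded groups of pictures.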

The multi-view video coding performance of this embodiment is described below:

1) Random access performance of the multi-mode multi-view video coding scheme

For multi-view video, random access includes fast forward, fast rewind, view switching, freezing the viewing instant, view sweeping, and similar operations. Assume that v views with t frames each are encoded, so that the total number of multi-view video frames, s = v × t, is finite. Let $x_i$ be the number of frames that must be decoded before frame i can be decoded, and let $p_i$ be the probability that the user randomly accesses frame i. The expected random access cost $E_n = \sum_{i=1}^{v \times t} x_i p_i$ is then an important indicator of how well a predictive coding mode n supports random access. The higher this cost, the weaker the decoder's support for random access and the more resources are consumed to provide it. Let $k_n$ be the probability that the multi-view video signal is encoded with the n-th predictive coding mode, and let N be the number of candidate predictive coding modes. The random access cost of multi-mode multi-view video coding can then be expressed as $E(X) = \sum_{n=1}^{N} k_n E_n$.
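The two expectations defined above can be computed directly. The sketch below assumes a uniform access probability when none is supplied. The function names and the sample dependency counts in the usage note are illustrative and are not taken from the patent's figures.

```python
def mode_access_cost(frames_needed, access_prob=None):
    """E_n: expected number of frames to decode before the requested frame.

    frames_needed[i] is x_i and access_prob[i] is p_i. If access_prob is
    omitted, every frame is assumed equally likely to be requested.
    """
    n = len(frames_needed)
    if access_prob is None:
        access_prob = [1.0 / n] * n
    return sum(x * p for x, p in zip(frames_needed, access_prob))


def multimode_access_cost(mode_costs, mode_probs):
    """E(X): per-mode costs E_n weighted by the mode usage probabilities k_n."""
    return sum(k * e for k, e in zip(mode_probs, mode_costs))
```

For instance, a four-frame toy structure with `frames_needed = [0, 1, 1, 2]` has a cost of 1.0 under uniform access, and three modes with costs 2.0, 3.0 and 4.0 used with equal probability give an overall cost of about 3.0.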

In multi-mode multi-view video coding, the probability $k_n$ with which each mode is used depends directly on the characteristics of the actual multi-view video signal. In this embodiment N = 3, and the modes are assumed to be used with equal probability, that is, $k_n = 1/3$ (n = 1, 2, 3). The resulting random access costs of the different schemes are listed in Table 1. In the table, PSVP and BSVP denote the sequential prediction methods using P frames and B frames, respectively; their predictive coding modes are shown in Fig. 4 and Fig. 5. Mpicture is the multi-view video coding method of Fujii et al. (Japan); its predictive coding mode is shown in Fig. 6. PSVP, BSVP and Mpicture are all single-mode, multi-reference-frame predictive coding methods. MMVC is the multi-mode multi-view video coding method of the invention, using the three candidate predictive coding modes shown in Fig. 3 (this embodiment is taken as representative of the invention). Table 1 shows that, in terms of random access performance, PSVP is the worst, while BSVP and Mpicture are somewhat better. The MMVC method of the invention has the lowest random access cost: it is 49% to 72% lower than that of PSVP, BSVP and Mpicture, a clear improvement in random access performance.

2) Computational complexity of the multi-mode multi-view video coding scheme

High-precision disparity-compensated and motion-compensated prediction based on the H.264/AVC coding framework accounts for more than 75% of the computational complexity of the whole multi-view video encoder. The complexity of the whole encoder can therefore be characterized by the average number of disparity-compensated and motion-compensated predictions needed to encode one 5×7 group of pictures. Table 1 compares the computational complexity of the schemes. Because they use multiple reference frames, PSVP, BSVP and Mpicture all have high computational complexity, BSVP and Mpicture in particular. Compared with PSVP, BSVP and Mpicture, the computational complexity of the scheme of the invention is 29% to 57% lower.

Table 1. Random access cost and computational complexity of the MMVC scheme of the invention compared with other schemes

Coding scheme   E(X)   Cost relative to MMVC   Computational complexity   Complexity relative to MMVC
PSVP            11.0   364%                    58                         141%
BSVP            7.5    248%                    83                         202%
Mpicture        6.0    199%                    97                         237%
MMVC            3.02   100%                    41                         100%
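The relative columns of Table 1 and the reduction ranges quoted in the text follow from the raw values by simple arithmetic, which the sketch below reproduces. The dictionary `TABLE_1` and the helper names are introduced here for illustration. Truncating the reductions to whole percent is an assumption chosen because it matches the quoted ranges of 49% to 72% and 29% to 57%.

```python
# (random access cost E(X), computational complexity) as listed in Table 1
TABLE_1 = {
    "PSVP": (11.0, 58),
    "BSVP": (7.5, 83),
    "Mpicture": (6.0, 97),
    "MMVC": (3.02, 41),
}


def percent_of_baseline(value, baseline):
    """Value expressed as a rounded percentage of the baseline value."""
    return round(100 * value / baseline)


def percent_reduction(baseline, other):
    """Whole-percent reduction (truncated) of baseline relative to other."""
    return int(100 * (1 - baseline / other))


base_cost, base_complexity = TABLE_1["MMVC"]
for name, (cost, complexity) in TABLE_1.items():
    print(name,
          percent_of_baseline(cost, base_cost),
          percent_of_baseline(complexity, base_complexity))
```

Applying `percent_reduction` to the MMVC values against each of the other schemes gives the ends of the two ranges: 72 and 49 for random access cost (against PSVP and Mpicture), and 29 and 57 for complexity (against PSVP and Mpicture).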

3) Rate-distortion performance of the multi-mode multi-view video coding scheme

To evaluate the coding efficiency of the MMVC scheme of the invention, multi-view video coding experiments were carried out on the H.264/AVC video coding framework (JM 8.5, Main profile) with quantization parameters QP of 24, 30, 36 and 40. The test material was taken from the multi-view test sequences of the Tanimoto Laboratory and MERL: Xmas (camera spacing 9 mm, strong inter-view correlation), exit (slow motion, large disparity, camera spacing 19.5 cm) and ballroom (intense motion). All three sequences were captured with parallel camera arrays at a resolution of 640×480. Five views and five scenes were selected, with five groups of pictures per scene, and spliced into the Joint sequence shown in Fig. 7, so that the video of each view contains 5×5×7 = 175 frames. Splicing the sequences in this way simulates the scene changes of real video. In this embodiment, MMVC adaptively selects a suitable predictive coding mode from the candidates shown in Figs. 3a, 3b and 3c according to the video content and uses it to encode the Joint sequence.

Figs. 8, 9, 10 and 11 compare the rate-distortion performance of the MMVC of this embodiment with that of the sequential prediction methods PSVP and BSVP and of Mpicture when encoding the Joint test sequence. Figs. 8, 9 and 10 show the average rate-distortion curves for the Xmas, exit and ballroom parts of the Joint sequence, respectively. The overall average rate-distortion curve of the Joint sequence in Fig. 11 shows that the rate-distortion performance of MMVC is essentially on a par with that of BSVP, PSVP and Mpicture.

In summary, compared with the prior art, the advantage of the invention is that it analyzes the temporal and inter-view correlations of the multi-view video signal and dynamically selects a predictive coding mode suited to both the characteristics of the signal currently being encoded and the overall performance requirements of multi-view video coding. This replaces the existing single-mode, computationally complex multi-reference-frame predictive coding methods based on joint temporal and spatial prediction. The result is a substantially lower computational complexity for multi-view video coding and compression and better random access performance for the multi-view video system, while compression performance is maintained.

Obviously, multi-view video predictive coding modes are not limited to the forms used in this embodiment. The invention is therefore not limited to the specific details and examples shown and described here, provided that the spirit and scope of the general concept defined by the claims and their equivalents are not departed from.

Claims (9)

1. A multi-mode multi-view video signal coding and compression method, characterized in that the encoder is organized into four functional modules: a multi-view video predictive coding module, a correlation statistical analysis module, a prediction mode selection module, and a mode update trigger module; for an input multi-view video signal, an initial predictive coding mode is first determined from known information and used by the multi-view video predictive coding module for encoding, after which encoding proceeds by the following steps: ① the prediction mode selection module, based on the correlation characteristics of the multi-view video signal obtained by the correlation statistical analysis module and on the overall performance requirements for multi-view video coding, dynamically selects from the candidate predictive coding modes a mode suited to the characteristics of the multi-view video signal currently being encoded; ② the multi-view video predictive coding module encodes the input multi-view video signal with the selected predictive coding mode and outputs the compressed bitstream; ③ when the mode update trigger condition in the mode update trigger module is satisfied, the correlation statistical analysis module is started again so that the predictive coding mode can be reselected and updated.

2. The multi-mode multi-view video signal coding and compression method according to claim 1, characterized in that the candidate predictive coding modes fall into three classes: Class 1 comprises predictive coding modes for multi-view video signals dominated by temporal correlation, which rely mainly on motion-compensated prediction; Class 2 comprises predictive coding modes for multi-view video signals dominated by inter-view correlation, which rely mainly on disparity-compensated prediction; Class 3 comprises predictive coding modes for multi-view video signals whose temporal and inter-view correlations are balanced, which are joint predictive coding modes exploiting both the temporal and the spatial domains.

3. The multi-mode multi-view video signal coding and compression method according to claim 1, characterized in that the correlation statistical analysis module statistically analyzes the temporal correlation and the inter-view correlation of groups of pictures that have been encoded or are being encoded, and defines a correlation coefficient α that characterizes the relative strength of the temporal and inter-view correlations of the video signal.

4. The multi-mode multi-view video signal coding and compression method according to claim 3, characterized in that, in the correlation statistical analysis module, for groups of pictures that have been encoded or are being encoded, the number $n_I^D$ of intra-coded blocks in frames coded using only disparity-compensated prediction and the number $n_I^P$ of intra-coded blocks in frames coded using only motion-compensated prediction are counted, and the ratio between $n_I^D$ and $n_I^P$ is used to describe the relative strength of the temporal and inter-view correlations of the current multi-view video signal.

5. The multi-mode multi-view video signal coding and compression method according to claim 3, characterized in that, in the correlation statistical analysis module, for groups of pictures that have been encoded or are being encoded, the ratio between the prediction error of frames coded using only disparity-compensated prediction and the prediction error of frames coded using only motion-compensated prediction is used to analyze the relative strength of the temporal and inter-view correlations of the current multi-view video signal.

6. The multi-mode multi-view video signal coding and compression method according to claim 2, characterized in that the prediction mode selection module, based on the correlation characteristics of the multi-view video signal obtained by the correlation statistical analysis module and on the overall performance requirements for multi-view video coding, selects the prediction mode as follows:

(1) when the temporal correlation is clearly stronger than the inter-view correlation, it further checks whether the correlation is distributed relatively evenly within the time domain, or whether the correlation with the nearest time instant is clearly stronger than that with the next-nearest instant, and selects a predictive coding mode based mainly on motion-compensated prediction;

(2) when the temporal correlation is clearly weaker than the inter-view correlation, it selects a predictive coding mode based mainly on disparity-compensated prediction;

(3) when the temporal and inter-view correlations are roughly equal, it selects a joint predictive coding mode exploiting both the temporal and the spatial domains.

7. The multi-mode multi-view video signal coding and compression method according to claim 1, characterized in that the mode update trigger module uses a content-based mode update scheme and decides, from the change in the correlation coefficient α obtained by the correlation statistical analysis module, whether to re-invoke the prediction mode selection module to update the predictive coding mode.

8. The multi-mode multi-view video signal coding and compression method according to claim 1, characterized in that the mode update trigger module uses a timed update trigger, periodically starting the correlation statistical analysis module to analyze the temporal and inter-view correlations of the multi-view video signal and invoking the prediction mode selection module to determine the predictive coding mode.

9. The multi-mode multi-view video signal coding and compression method according to claim 1, characterized in that, in all candidate predictive coding modes, the frames at the same time instant as the intra-coded frame and the frames in the same view as the intra-coded frame are encoded before the other frames of the group of pictures, and these first-encoded frames are predicted in the same way in all candidate modes.
CNB2006100528959A 2006-08-11 2006-08-11 Coding and compression method of multi-mode and multi-viewpoint video signal Expired - Fee Related CN100415002C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2006100528959A CN100415002C (en) 2006-08-11 2006-08-11 Coding and compression method of multi-mode and multi-viewpoint video signal

Publications (2)

Publication Number Publication Date
CN1913640A CN1913640A (en) 2007-02-14
CN100415002C true CN100415002C (en) 2008-08-27

Family

ID=37722378

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2006100528959A Expired - Fee Related CN100415002C (en) 2006-08-11 2006-08-11 Coding and compression method of multi-mode and multi-viewpoint video signal

Country Status (1)

Country Link
CN (1) CN100415002C (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101291434A (en) * 2007-04-17 2008-10-22 华为技术有限公司 Encoding/decoding method and device for multi-video
US8761265B2 (en) * 2007-04-17 2014-06-24 Thomson Licensing Hypothetical reference decoder for multiview video coding
CN101087400B (en) * 2007-06-26 2011-09-21 中兴通讯股份有限公司 Video frame delay detection method and system
CN101389034B (en) * 2007-09-14 2010-06-09 华为技术有限公司 Image encoding/decoding method, apparatus and an image processing method, system
CN101170702B (en) * 2007-11-23 2010-08-11 四川虹微技术有限公司 Multi-view video coding method
CN101547010B (en) * 2008-03-24 2011-07-06 华为技术有限公司 Coding and decoding system, method and device
JP2011519227A (en) * 2008-04-25 2011-06-30 トムソン ライセンシング Depth signal encoding
JP5566385B2 (en) 2008-08-20 2014-08-06 トムソン ライセンシング Sophisticated depth map
CN102272778B (en) 2009-01-07 2015-05-20 汤姆森特许公司 Joint depth estimation
CN102148952B (en) * 2010-02-05 2013-12-11 鸿富锦精密工业(深圳)有限公司 Video image compression method and playing method thereof
CN101937578B (en) * 2010-09-08 2012-07-04 宁波大学 Method for drawing virtual view color image
CN103596012B (en) * 2013-11-14 2017-05-10 山东电子职业技术学院 Interframe macro block type selecting method used in real-time AVS-based video frame rate transcoding
CN103974070B (en) * 2014-04-25 2017-08-15 广州市香港科大霍英东研究院 Wireless video transmission method and system based on multi-user input and output
CN111669601B (en) * 2020-05-21 2022-02-08 天津大学 Intelligent multi-domain joint prediction coding method and device for 3D video
CN116962715A (en) * 2022-03-31 2023-10-27 华为技术有限公司 Encoding method, apparatus, storage medium, and computer program product

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030202592A1 (en) * 2002-04-20 2003-10-30 Sohn Kwang Hoon Apparatus for encoding a multi-view moving picture
CN1613263A (en) * 2001-11-21 2005-05-04 韩国电子通信研究院 3d stereoscopic/multiview video processing system and its method
WO2006062377A1 (en) * 2004-12-10 2006-06-15 Electronics And Telecommunications Research Institute Apparatus for universal coding for multi-view video

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Encoding and reconstruction of multiview video objects. Ohm, J.-R. IEEE Signal Processing Magazine, Vol. 16, No. 3. 1999 *
Video object extraction and disparity matching in multi-view video coding. Zhu Zhongjie, Jiang Gangyi, Yu Mei, Wu Xunwei. Acta Electronica Sinica, No. 5. 2004 *

Also Published As

Publication number Publication date
CN1913640A (en) 2007-02-14


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: SHANGHAI SILICON INTELLECTUAL PROPERTY EXCHANGE CE

Free format text: FORMER OWNER: NINGBO UNIVERSITY

Effective date: 20120105

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 315211 NINGBO, ZHEJIANG PROVINCE TO: 200030 XUHUI, SHANGHAI

TR01 Transfer of patent right

Effective date of registration: 20120105

Address after: 200030 Shanghai City No. 333 Yishan Road Huixin International Building 1 building 1704

Patentee after: Shanghai Silicon Intellectual Property Exchange Co.,Ltd.

Address before: 315211 Zhejiang Province, Ningbo Jiangbei District Fenghua Road No. 818

Patentee before: Ningbo University

ASS Succession or assignment of patent right

Owner name: SHANGHAI SIPAI KESI TECHNOLOGY CO., LTD.

Free format text: FORMER OWNER: SHANGHAI SILICON INTELLECTUAL PROPERTY EXCHANGE CENTER CO., LTD.

Effective date: 20120217

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 200030 XUHUI, SHANGHAI TO: 201203 PUDONG NEW AREA, SHANGHAI

TR01 Transfer of patent right

Effective date of registration: 20120217

Address after: 201203 Shanghai Chunxiao Road No. 350 South Building Room 207

Patentee after: Shanghai spparks Technology Co.,Ltd.

Address before: 200030 Shanghai City No. 333 Yishan Road Huixin International Building 1 building 1704

Patentee before: Shanghai Silicon Intellectual Property Exchange Co.,Ltd.

ASS Succession or assignment of patent right

Owner name: SHANGHAI GUIZHI INTELLECTUAL PROPERTY SERVICE CO.,

Free format text: FORMER OWNER: SHANGHAI SIPAI KESI TECHNOLOGY CO., LTD.

Effective date: 20120606

C41 Transfer of patent application or patent right or utility model
C56 Change in the name or address of the patentee
CP02 Change in the address of a patent holder

Address after: 200030 Shanghai City No. 333 Yishan Road Huixin International Building 1 building 1706

Patentee after: Shanghai spparks Technology Co.,Ltd.

Address before: 201203 Shanghai Chunxiao Road No. 350 South Building Room 207

Patentee before: Shanghai spparks Technology Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20120606

Address after: 200030 Shanghai City No. 333 Yishan Road Huixin International Building 1 building 1704

Patentee after: Shanghai Guizhi Intellectual Property Service Co.,Ltd.

Address before: 200030 Shanghai City No. 333 Yishan Road Huixin International Building 1 building 1706

Patentee before: Shanghai spparks Technology Co.,Ltd.

DD01 Delivery of document by public notice

Addressee: Shi Lingling

Document name: Notification of Passing Examination on Formalities

TR01 Transfer of patent right

Effective date of registration: 20200121

Address after: 201203 block 22301-1450, building 14, No. 498, GuoShouJing Road, Pudong New Area (Shanghai) pilot Free Trade Zone, Shanghai

Patentee after: Shanghai spparks Technology Co.,Ltd.

Address before: 200030 Shanghai City No. 333 Yishan Road Huixin International Building 1 building 1704

Patentee before: Shanghai Guizhi Intellectual Property Service Co.,Ltd.

TR01 Transfer of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20080827

Termination date: 20200811

CF01 Termination of patent right due to non-payment of annual fee