CN113039804B - Bit stream merging
- Publication number
- CN113039804B (application CN201980074638.0A)
- Authority
- CN
- China
- Prior art keywords
- video
- requirement information
- capability
- capability requirement
- fine granularity
- Legal status: Active
Classifications (all under H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals)
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/46—Embedding additional information in the video signal during the compression process
- H04N19/132—Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
- H04N19/186—Adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
- H04N19/48—Compressed domain processing techniques other than decoding, e.g. modification of transform coefficients, variable length coding [VLC] data or run-length data
- H04N19/70—Syntax aspects related to video coding, e.g. related to compression standards
- H04N19/85—Pre-processing or post-processing specially adapted for video compression
Abstract
A video encoder (2) for providing an encoded video representation (12), wherein the video encoder (2) is configured to provide a video stream (12) comprising encoding parameter information describing a plurality of encoding parameters (20, 22, 26, 30), encoded video content information, and one or more merge identifiers indicating whether the encoded video representation (12) can be merged with another encoded video representation and/or how the encoded video representation (12) can be merged with another encoded video representation.
Description
Technical Field
The present application relates to video encoding/decoding.
Background
Video composition is used in many applications to present a composition of multiple video sources to a user. Common examples are picture-in-picture (PiP) composition and blending an overlay with video content, for example for advertising or user interfaces. Generating such a composition in the pixel domain requires parallel decoding of the input video bitstreams, which is computationally complex and may not even be feasible on devices with a single hardware decoder or otherwise limited resources. For example, in current IPTV system designs, a capable set-top box performs the composition; due to its complexity, distribution and limited lifetime, it is a major cost-of-service factor. Reducing these cost factors has prompted continual efforts to virtualize set-top box functions, such as moving user interface generation to cloud resources. With this approach, the video decoder (the so-called zero client) is the only hardware that remains at the client. The state of the art of such system designs is based on transcoding, i.e., composition in its simplest form: decoding, pixel-domain composition and re-encoding before or during transmission. To reduce the workload of the full decode-encode cycle, performing PiP composition in the transform coefficient domain instead of the pixel domain was proposed first. Since then, a number of techniques have been proposed to fuse or shorten individual composition steps and to apply them to current video codecs. However, transcoding-based methods for general composition remain computationally complex, which compromises the scalability of the system. Depending on the transcoding method, such composition may also degrade rate-distortion (RD) performance.
In addition, there are widespread applications built on tiles, where a tile is a spatial subset of the video plane encoded independently of neighboring tiles. Tile-based streaming systems for 360° video work by dividing the 360° video into tiles that are encoded into sub-bitstreams at various resolutions and combined into a single bitstream at the client side depending on the current user viewing orientation. Another application involving merging of sub-bitstreams is, for example, the combination of traditional video content with banner advertisements. Further, recombination of sub-bitstreams can be a central part of a video conferencing system, where multiple users send their respective video streams to a receiver that eventually merges the streams. Even further, tile-based cloud encoding systems, in which a video is split into tiles and the tile encoding is distributed to separate and independent instances, rely on failsafe checks before merging the resulting tile bitstreams back into a single bitstream. In all these applications, the streams are merged to allow decoding on a single video decoder with a known conformance point. Merging in this context refers to a lightweight bitstream rewriting that does not require complete decoding and encoding of the bitstream or of the entropy-coded data for pixel value reconstruction. However, the techniques for ensuring that bitstream merging succeeds (i.e., that the resulting merged bitstream conforms) originate from many different application scenarios.
For example, conventional codecs support a technique called motion-constrained tile sets (MCTS), in which the encoder constrains inter-picture prediction to stay within the boundaries of tiles or pictures, i.e., without using sample values or syntax element values that do not belong to the same tile or that lie outside the picture boundaries. The technique originates from region-of-interest (ROI) decoding, where a decoder can decode only a specific sub-portion of the bitstream and the coded pictures without encountering unresolved dependencies and thereby avoids drift. Another recent technique in this context is the structure of pictures (SOP) SEI message, which indicates the applied bitstream structure (i.e., picture order and inter-prediction reference structure). This information is provided by summarizing, for each picture between two random access points (RAPs), the picture order count (POC) value, the active sequence parameter set (SPS) identifier, and the reference picture set (RPS) index into the active SPS. Based on this information, a bitstream structure can be identified, which can assist a transcoder, middlebox, media-aware network entity (MANE) or media player in operating on or changing the bitstream, such as adjusting the bitrate, dropping frames, or fast-forwarding.
Although both of the above exemplary signaling techniques are necessary for understanding whether lightweight merging of sub-bitstreams is possible without significant syntax changes or even complete transcoding, they are far from sufficient. In more detail, lightweight merging in this context means that the NAL units of the source bitstreams are interleaved with only small rewriting operations, i.e., the jointly used parameter sets are rewritten with a new picture size and tile structure such that each bitstream to be merged occupies a separate tile region. The next level of merging complexity consists of minor rewriting of slice header elements, ideally without changing the variable-length codes in the slice header. There are further levels of merging complexity, e.g., re-running entropy coding on the slice data to change particular syntax elements that are entropy-coded but can be changed without pixel value reconstruction; compared to full transcoding with decoding and encoding of the video (which is not considered lightweight), this can still be considered beneficial and fairly lightweight.
In the merged bitstream, all slices must reference the same parameter sets. When the parameter sets of the original bitstreams use significantly different settings, lightweight merging may not be possible, because many parameter set syntax elements further affect the slice header and slice payload syntax, and their levels of impact differ. The deeper a syntax element participates in the decoding process, the more complex the merging/rewriting becomes. Some notable general categories of syntax dependencies (of parameter sets and other structures) can be distinguished as follows.
A. Syntax presence indications
B. Value calculation dependencies
C. Slice payload coding tool control
C.a Syntax used early in the decoding process (e.g., coefficient sign hiding, block partitioning restrictions)
C.b Syntax used later in the decoding process (e.g., motion compensation, loop filters) or general decoding process control (reference pictures, bitstream order)
D. Source format parameters
For category A, the parameter sets carry many presence flags for the various tools (e.g., dependent_slice_segments_enabled_flag or output_flag_present_flag). In case these flags differ, the flag can be set to enabled in the joint parameter set and the default value explicitly written into the merged slice headers of those bitstreams that did not include the syntax element before merging, i.e., in this case merging requires changing the parameter set and slice header syntax.
For category B, the signaled values of parameter set syntax elements and parameters of other parameter sets or the slice header may be used in calculations; e.g., in HEVC, the slice quantization parameter (QP) of a slice controls the coarseness of the transform coefficient quantization of the slice's residual signal. The signaling of the slice QP (SliceQpY) in the bitstream depends on QP signaling at the picture parameter set (PPS) level, as follows:
SliceQpY = 26 + init_qp_minus26 (from the PPS) + slice_qp_delta (from the slice header)
Since each slice in the (merged) coded video picture needs to reference the same active PPS, differences between the PPSs of the respective streams to be merged require that the slice headers be adjusted to reflect the new common value of init_qp_minus26. That is, in this case merging requires changing the parameter set and slice header syntax, as for category A.
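To make the category B example concrete, the following sketch (not part of the patent text; Python is used only for illustration) shows how a slice header's slice_qp_delta would be rewritten so that SliceQpY is preserved when the merged slices reference a joint PPS with a new init_qp_minus26 value.

```python
# Hedged sketch: rewriting slice_qp_delta for a joint PPS, assuming the HEVC relation
# SliceQpY = 26 + init_qp_minus26 + slice_qp_delta quoted above.
def rewrite_slice_qp_delta(old_init_qp_minus26: int,
                           old_slice_qp_delta: int,
                           joint_init_qp_minus26: int) -> int:
    slice_qp_y = 26 + old_init_qp_minus26 + old_slice_qp_delta  # original SliceQpY
    # Choose the new delta so SliceQpY stays unchanged under the joint PPS.
    return slice_qp_y - 26 - joint_init_qp_minus26

# Stream B used init_qp_minus26 = 2, the joint PPS uses 0: its deltas shift by +2.
assert rewrite_slice_qp_delta(2, -1, 0) == 1
```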
For category C, other parameter set syntax elements control coding tools that affect the bitstream structure of the slice payload. The subcategories C.a and C.b can be distinguished by how deeply the syntax elements participate in the decoding process, i.e., in which processes between entropy coding and pixel-level reconstruction they are involved, and by the complexity associated with changing them (which is related to that depth).
For example, one element in category C.a is sign_data_hiding_enabled_flag, which controls the derivation of the sign data of coded transform coefficients. Sign data hiding can easily be disabled and the correspondingly inferred sign data explicitly written into the slice payload. Such changes to the slice payload in this category do not require going into the pixel domain through full decoding and re-encoding the pixel-wise merged video. Another example is inferred block partitioning decisions, or any other syntax whose inferred values can easily be written into the bitstream. That is, in this case merging requires changing the parameter set, the slice header syntax, and the slice payload, which requires entropy decoding/encoding.
Subcategory C.b, however, concerns syntax elements related to processes far down the decoding chain; changing them would in any case require carrying out so many complex decoding steps that avoiding the remaining decoding steps is undesirable in terms of implementation and computation. For example, differences in syntax elements associated with motion compensation constraints (e.g., temporal motion-constrained tile set SEI messages or structure of pictures information SEI messages) or various other syntax related to the decoding process make pixel-level transcoding unavoidable. In contrast to subcategory C.a, there are many encoder decisions that affect the slice payload in a way that cannot be changed without full pixel-level transcoding.
For category D, there are parameter set syntax elements (e.g., the chroma subsampling indicated by chroma_format_idc) for which different values make pixel-level transcoding unavoidable when merging sub-bitstreams. That is, in this case merging requires a complete decoding, pixel-level merging and complete encoding process.
The above list is by no means exhaustive, but it is clear that a wide variety of parameters affect the mergeability of sub-bitstreams into a common bitstream in different ways, and that tracking and analysing these parameters is cumbersome.
Disclosure of Invention
It is an object of the present invention to provide a video codec to efficiently merge video bitstreams.
This object is achieved by the subject matter of the claims of the present application.
The basic idea of the invention is to improve the merging of multiple video streams by including one or more merge identifiers. This approach may reduce the load on computing resources and may also speed up the merging process.
According to an embodiment of the application, the video encoder is configured to provide a video stream comprising encoding parameter information describing a plurality of encoding parameters, encoded video content information (i.e., encoded using the encoding parameters defined by the parameter information), and one or more merge identifiers indicating whether the encoded video representation can be merged with another encoded video representation and/or how (e.g., at which complexity) it can be merged. Which complexity to use may be determined based on the parameter values defined by the parameter information. The merge identifier may be a concatenation of the multiple encoding parameters that must be equal in two different encoded video representations in order for them to be mergeable at a predetermined complexity. Alternatively, the merge identifier may be a hash value over the concatenation of those encoding parameters.
According to an embodiment of the application, the merge identifier may indicate a merge identifier type, which represents a "suitable" merge method associated with the complexity of the merge process; in general, it may indicate that merging is performed by parameter set rewriting, by parameter set and slice header rewriting, or by parameter set, slice header and slice payload rewriting. The merge identifier associated with a merge identifier type comprises those encoding parameters that must be equal in two different encoded video representations so that the two representations can be merged at the complexity of the merge process represented by that type. The value of the merge identifier type may indicate the merge process, wherein the video encoder is configured to switch between at least two of the following values of the merge identifier type: a first value, representing a merge process by parameter set rewriting; a second value, representing a merge process by parameter set and slice header rewriting; and a third value, representing a merge process by parameter set, slice header and slice payload rewriting.
According to an embodiment of the application, a plurality of merge identifiers is associated with different complexities of the merge process, e.g., each identifier indicates a parameter set/hash and a type of merge process. The encoder may be configured to check whether the encoding parameters evaluated for providing a merge identifier are the same in all units of the video sequence, and to provide the merge identifier depending on that check.
According to an embodiment of the application, the plurality of encoding parameters may comprise merge-relevant parameters that must be the same in different video streams (i.e., encoded video representations) to allow a merge of lower complexity than merging by full pixel decoding, and the video encoder is configured to provide the one or more merge identifiers based on the merge-relevant parameters; i.e., without these parameters being common between two encoded video representations, full pixel decoding has to be performed and the complexity of the merge process cannot be reduced. The merge-relevant parameters include one or more or all of the following: parameters describing motion constraints at tile boundaries (e.g., the motion-constrained tile set supplemental enhancement information (MCTS SEI) message); information about the group of pictures (GOP) structure (i.e., the mapping of coding order to display order); a random access point indication; temporal layering, e.g., with a structure of pictures (SOP) SEI message; parameters describing the chroma coding format and parameters describing the luma coding format, e.g., a set comprising at least the chroma format and the luma/chroma bit depth; parameters describing advanced motion vector prediction; parameters describing the sample adaptive offset; parameters describing temporal motion vector prediction; and parameters describing the loop filter and other coding parameters, i.e., a set of parameters (including reference picture sets, basic quantization parameters, etc.) to be rewritten in parameter sets, slice headers and slice payloads when merging the two encoded video representations.
According to an embodiment of the present application, the merge identifier associated with a first complexity of the merge process may be determined based on a first set of encoding parameters, while the merge identifier associated with a second, higher complexity is determined based on a second set of encoding parameters that is a proper subset of the first set. A merge identifier associated with a third complexity, higher than the second, may be determined based on a third set of encoding parameters that is a proper subset of the second set. The video encoder is configured to determine the merge identifier associated with the first complexity based on the set of encoding parameters that must be equal in two different video streams (e.g., encoded video representations) to allow merging in the following manner: only the parameter sets applicable to multiple slices are modified, while the slice headers and slice payloads remain unchanged, i.e., a merge process by parameter set rewriting only. The video encoder may be configured to determine the merge identifier associated with the second complexity based on the set of encoding parameters that must be equal in two different video streams to allow merging in the following manner: the parameter sets applicable to multiple slices and also the slice headers are modified, while the slice payloads remain unchanged, i.e., a merge process by parameter set and slice header rewriting. The video encoder may be configured to determine the merge identifier associated with the third complexity based on the set of encoding parameters that must be equal in two different video streams to allow merging in the following manner: the parameter sets applicable to multiple slices as well as the slice headers and slice payloads are modified, but no full pixel decoding and re-encoding is performed, i.e., a merge process by parameter set, slice header and slice payload rewriting.
According to an embodiment of the application, a video merger for providing a merged video representation based on a plurality of encoded video representations (e.g., video streams) is configured to receive a plurality of video streams comprising encoding parameter information describing a plurality of encoding parameters, encoded video content information (i.e., encoded using the encoding parameters defined by the parameter information), and one or more merge identifiers indicating whether an encoded video representation can be merged with another encoded video representation and/or how it can be merged. The video merger is configured to decide on the merge method to use (e.g., merge type; merge process) based on the merge identifiers, i.e., based on a comparison of the merge identifiers of the different video streams, and to select a merge method from a plurality of merge methods. The video merger may be configured to select between at least two of the following merge methods, which have different complexities according to the one or more merge identifiers: a first merge method, in which the video streams are merged by modifying only the parameter sets applicable to multiple slices while leaving the slice headers and slice payloads unchanged; a second merge method, in which the parameter sets applicable to multiple slices and also the slice headers are modified while the slice payloads remain unchanged; and a third merge method, in which the parameter sets applicable to multiple slices as well as the slice headers and slice payloads are modified, but no full pixel decoding and re-encoding is performed.
According to an embodiment of the application, the video merger is configured to compare the merge identifiers of two or more video streams associated with the same given merge method or with the same merge identifier type, and to decide whether to perform the merge using the given merge method depending on the result of the comparison. The video merger may be configured to selectively perform the merge using the given merge method if the comparison indicates that the merge identifiers of the two or more video streams associated with that method are equal. The video merger may be configured to use a merge method of higher complexity than the given merge method if the comparison indicates that the merge identifiers differ, i.e., without further comparison of the encoding parameters themselves. Alternatively, the video merger may be configured to, if the comparison indicates that the merge identifiers are equal, selectively compare the encoding parameters that must be equal in the two or more video streams to allow merging using the given merge method; it then selectively performs the merge using the given merge method if this comparison indicates that the encoding parameters are equal, and uses a merge method of higher complexity than the given merge method if the comparison indicates that the encoding parameters include differences.
According to an embodiment of the present application, the video merger may be configured to compare merge identifiers (i.e., hashes) associated with merge methods of different complexity, and to identify the lowest-complexity merge method whose associated merge identifiers are equal in the two or more video streams to be merged. The video merger is configured to compare the sets of encoding parameters (i.e., the individual encoding parameters rather than their hashed versions) that must be equal in the video streams to allow merging using the identified merge method, where different, typically overlapping, sets of encoding parameters are associated with merge methods of different complexity. If the comparison indicates that the encoding parameters of the set associated with the identified merge method are equal across the video streams to be merged, the video merger merges them using the identified merge method; if the comparison reveals differences, it merges them using a merge method of higher complexity than the identified one. The video merger is configured to determine which encoding parameters should be modified in the merge process (i.e., when merging the video streams), e.g., based on one or more differences between the merge identifiers associated with the same merge method or merge identifier type of the different video streams to be merged. A sketch of this selection logic is given below.
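The following sketch (hypothetical data model, not from the patent text) illustrates the described selection: the merger compares, per merge identifier type, the identifier values of all input streams and picks the lowest-complexity merge method whose identifiers match, falling back to full transcoding otherwise.

```python
# Hedged sketch: streams are assumed to expose .merge_id, a dict mapping the merge
# identifier type (0, 1, 2) to the corresponding hash value carried in the bitstream.
def select_merge_method(streams):
    for merge_type in (0, 1, 2):  # 0: parameter set; 1: +slice header; 2: +slice payload
        if len({s.merge_id[merge_type] for s in streams}) == 1:
            return merge_type          # all identifiers of this type are equal
    return "full transcoding"          # decode, pixel-domain merge, re-encode
```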
According to an embodiment of the application, the video merger is configured to obtain joint encoding parameters or joint encoding parameter sets associated with the slices of all video streams to be merged, based on the encoding parameters of those streams, e.g., the sequence parameter set (SPS) and the picture parameter set (PPS). For example, where the values of all encoding parameters of one encoded video representation and another are the same, the joint encoding parameters are included in the merged video stream; where there are differences between the encoding parameters of the encoded video representations, the encoding parameters are updated by copying the common parameters, or are updated based on a primary encoded video representation (i.e., one encoded video representation is the primary one and the others are sub-representations); some encoding parameters (e.g., the total picture size) may be adapted according to the combination of the video streams. The video merger is configured to adapt the encoding parameters associated with the respective video slices (e.g., defined in the slice header), for example when a merge method of higher than the lowest complexity is used, in order to obtain modified slices to be included in the merged video stream. The adapted encoding parameters include parameters representing the picture size of the merged encoded video representations, where the picture size is calculated based on the picture sizes of the encoded video representations to be merged (i.e., in the respective dimensions and according to their spatial arrangement).
According to an embodiment of the present application, a video encoder for providing an encoded video representation (i.e., a video stream) may be configured to provide coarse-granularity capability requirement information, e.g., level information such as level 3, level 4 or level 5, which describes the compatibility of a video stream with a video decoder, e.g., the decodability of the video stream by a video decoder having one of a plurality of predetermined capability levels. The video encoder is further configured to provide fine-granularity capability requirement information, e.g., merge level limitation information, describing how large a fraction of the allowable capability requirement (i.e., decoder capability) associated with one of the predetermined capability levels is required to decode the encoded video representation, and/or how large a fraction of the allowable capability requirement the encoded video representation (i.e., sub-bitstream) contributes to a merged video stream into which it is merged, such that the capability requirement of the merged bitstream is consistent with one of the predetermined capability levels, i.e., is smaller than or equal to the allowable capability requirement (the "allowable capability requirement of the merged bitstream consistent with one of the predetermined capability levels" corresponding to the "level limit of the merged bitstream"). The video encoder may provide the fine-granularity capability requirement information as a ratio or percentage value that references one of the predetermined capability levels, or as reference information plus fraction information, where the reference information describes which of the predetermined capability levels the fraction information refers to, so that the fine-granularity capability requirement information as a whole describes a fraction of one of the predetermined capability levels.
According to an embodiment of the present application, a video merger for providing a merged video representation based on a plurality of encoded video representations (i.e., video streams) may be configured to receive a plurality of video streams comprising encoding parameter information describing a plurality of encoding parameters, encoded video content information (e.g., encoded using the encoding parameters defined by the parameter information), coarse-granularity capability requirement information (e.g., level information such as level 3, level 4 or level 5, describing the compatibility of the video streams with a video decoder, i.e., their decodability) and fine-granularity capability requirement information (e.g., merge level limitation information), wherein the video merger is configured to merge two or more video streams according to the coarse-granularity and fine-granularity capability requirement information. The video merger may be configured to decide which video streams can be included in the merged video stream according to the fine-granularity capability requirement information without violating the allowable capability requirement (i.e., such that the capability requirement of the merged video stream is consistent with one of the predetermined capability levels). The video merger may be configured to determine whether a valid merged video stream can be obtained by merging two or more video streams according to the fine-granularity capability requirement information. The video merger is configured to aggregate the fine-granularity capability requirement information of the plurality of video streams to be merged, for example to determine which video streams may be included, or to decide whether a valid merged video stream is obtainable.
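As an illustration of how a merger might aggregate the fine-granularity information, the sketch below (a hypothetical representation; the patent does not prescribe a data format) sums the per-stream fractions of the allowable capability requirement and checks them against the level limit.

```python
# Hedged sketch: each entry is the fraction of the allowable capability requirement of
# the referenced predetermined level that one sub-bitstream contributes.
def fits_level(level_fractions) -> bool:
    return sum(level_fractions) <= 1.0

assert fits_level([0.5, 0.25, 0.25])   # together these exactly fill the level budget
assert not fits_level([0.6, 0.5])      # merging these would violate the level limit
```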
According to an embodiment of the present application, a video decoder for decoding a provided video representation is configured to receive a video representation comprising a plurality of sub-video streams (i.e., a plurality of sub-bitstreams of an entire bitstream), the sub-video streams comprising encoding parameter information describing a plurality of encoding parameters, encoded video content information, coarse-granularity capability requirement information describing the compatibility of the video streams with video decoders having one of a plurality of predetermined capability levels (the individual information being carried, for example, in SEI messages), and fine-granularity capability requirement information, which describes whether the combined capability requirement of the plurality of sub-video streams to be merged complies with a predetermined limit (i.e., a level-specific limit of the decoder). The video decoder may be further configured to parse the received coarse-granularity and fine-granularity capability requirement information to obtain an indication of the capability level and of the fraction of the allowable capability requirement.
Drawings
Preferred embodiments of the present application are described below with reference to the accompanying drawings, in which:
Fig. 1 shows a block diagram of an apparatus for providing an encoded video representation as an example of a video encoder, in which the bit stream merging concept according to an embodiment of the application may be implemented;
Fig. 2 shows a schematic diagram illustrating an example of a bitstream structure according to an embodiment of the application;
FIG. 3 shows a block diagram of an apparatus for providing an encoded video representation as another example of a video encoder, in which the bit stream merging concept according to an embodiment of the present application may be implemented;
FIG. 4 shows a schematic diagram of an example of coding parameters according to an embodiment of the application;
FIGS. 5a, 5b show detailed Sequence Parameter Set (SPS) examples indicated in FIG. 4;
Fig. 6a, 6b show detailed Picture Parameter Set (PPS) examples indicated in fig. 4;
Fig. 7a to 7d show detailed slice header examples indicated in fig. 4;
Fig. 8 shows a detailed structure of pictures (SOP) example of the Supplemental Enhancement Information (SEI) messages indicated in fig. 4;
Fig. 9 shows a detailed motion-constrained tile set (MCTS) example of the SEI messages indicated in fig. 4;
FIG. 10 illustrates a block diagram of an apparatus for providing a merged video representation as an example of a video merger in which the bit stream merging concepts according to embodiments of the present application may be implemented;
FIG. 11 shows a schematic diagram illustrating a determination of combining complexity according to an embodiment of the application;
FIG. 12 shows a schematic diagram of a bitstream structure of a plurality of video representations to be combined and a bitstream structure of a combined video representation according to the bitstream combining concept of the present application;
FIG. 13 illustrates a block diagram of an apparatus for providing a merged video representation as another example of a video merger in which the bit stream merging concepts according to embodiments of the present application may be implemented;
Fig. 14 shows a block diagram of an apparatus for providing an encoded video representation that provides capability requirement information for merged video representations, as an example of a video encoder in which an embodiment of the application may be implemented; and
Fig. 15 shows a block diagram of an apparatus for providing a merged video representation that provides capability requirement information for merged video representations, as an example of a video merger in which an embodiment of the application may be implemented.
Detailed Description
The following description sets forth specific details, such as particular embodiments, procedures and techniques, for purposes of explanation and not limitation. It will be understood by those skilled in the art that other embodiments may be used apart from these specific details. For example, while the following description uses non-limiting example applications for ease of explanation, the techniques may be applied to any type of video codec. In some instances, detailed descriptions of well-known methods, nodes, interfaces, circuits, and devices are omitted so as not to obscure the description with unnecessary detail.
In the following description, the same or equivalent elements or elements having the same or equivalent functions are denoted by the same or equivalent reference numerals.
The invention herein aims to provide future video codecs such as VVC (Versatile Video Coding) with means to place an indication in each sub-bitstream that allows identifying sub-bitstreams that can be merged together into a conforming bitstream at a given level of complexity, or that cannot. The indication, hereinafter referred to as the "merge identifier", also provides information about the appropriate merge method through an indication referred to as the "merge identifier type". Provided that two bitstreams carry a "merge identifier" with the same value, the sub-bitstreams can be merged into a new joint bitstream at the given level of merge complexity (associated with the merge method).
Fig. 1 shows a video encoder 2 for providing an encoded video representation, i.e., a video stream 12, based on a provided (input) video stream. It comprises an encoder core 4, which includes an encoding parameter determination component 14, and a merge identifier provider 6. The input video stream and the video stream 12 each have a bitstream structure, shown in simplified form in fig. 2. The bitstream consists of a plurality of network abstraction layer (NAL) units, each NAL unit containing various parameters and/or data, such as a sequence parameter set (SPS) 20, a picture parameter set (PPS) 22, an instantaneous decoder refresh (IDR) picture 24, supplemental enhancement information (SEI) 26, and a plurality of slices 28. The SEI 26 comprises various messages, e.g., structure of pictures, motion-constrained tile set, etc. A slice 28 comprises a header 30 and a payload 32. The encoding parameter determination component 14 determines the encoding parameters based on the SPS 20, PPS 22, SEI 26 and slice headers 30. According to the present application, the IDR 24 is not essential for determining the encoding parameters, but it may optionally be included. The merge identifier provider 6 provides a merge identifier indicating whether the encoded video stream can be merged with another encoded video stream and/or how (at which complexity) it can be merged. It determines a merge identifier indicating a merge identifier type, which represents those encoding parameters that must be equal in two different encoded video representations, or, generally, indicates a "suitable" merge method, e.g., merging by parameter set rewriting, by parameter set and slice header rewriting, or by parameter set, slice header and slice payload rewriting; the merge identifier associated with a merge identifier type comprises those encoding parameters that must be equal in two different encoded video representations so that they can be merged at the complexity of the merge process represented by that type.
As a result, the encoded video stream includes encoding parameter information describing a plurality of encoding parameters, encoded video content information, and one or more merge identifiers. In this embodiment, the merge identifier is determined based on the encoding parameters. However, the merge identifier value may also be set at the discretion of the encoder operator, with little guarantee of collision avoidance (which may be sufficient for a closed system). A third-party entity such as DVB (Digital Video Broadcasting) or ATSC (Advanced Television Systems Committee) may define the merge identifier values to be used within its system.
The merge identifier type is described below, considering another embodiment of the present invention with reference to figs. 3 to 9.
Fig. 3 shows an encoder 2 (2a) comprising an encoder core 4 and a merge identifier provider 6a, and indicates the data flow in the encoder 2a. The merge identifier provider 6a includes a first hash component 16a and a second hash component 16b that provide hash values as merge identifiers to indicate the merge type. That is, a merge identifier value may be formed as the concatenation of the encoded values of a defined set of syntax elements (encoding parameters) of the bitstream, hereinafter referred to as the hash set. The merge identifier value may also be formed by feeding this concatenation of encoded values into a well-known hash function (e.g., MD5, SHA-3) or any other suitable function. As shown in fig. 3, the input video stream includes input video information 10 and is processed by the encoder core 4. The encoder core 4 encodes the input video content and stores the encoded video content information in the payload 32. The encoding parameter determination component 14 receives parameter information including the SPS, PPS, slice headers and SEI messages. The parameter information is stored in the corresponding units and received at the merge identifier provider 6a, i.e., at the first hash component 16a and the second hash component 16b. The first hash component 16a generates a hash value based on the encoding parameters, e.g., indicating merge identifier type 2, and the second hash component 16b generates a hash value based on the encoding parameters, e.g., indicating merge identifier type 1.
The content of the hash set, i.e., which syntax element (merge parameter) values are concatenated to form the merge identifier value, determines the quality of the mergeability indication with respect to the syntax categories described above.
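A minimal sketch of the identifier formation described above follows (Python for illustration; the element names are placeholders, not actual VVC/HEVC syntax element names): the encoded values of the hash set are concatenated and fed into a well-known hash function, here MD5.

```python
import hashlib

def merge_identifier(params: dict, hash_set: tuple) -> str:
    # Concatenate the encoded values of the syntax elements in the hash set and hash them.
    concatenation = b"".join(str(params[name]).encode() for name in hash_set)
    return hashlib.md5(concatenation).hexdigest()

# Two sub-bitstreams are merge-compatible at the complexity associated with this hash
# set exactly when their merge identifier values match.
a = merge_identifier({"chroma_format": "4:2:0", "bit_depth": 10},
                     ("chroma_format", "bit_depth"))
b = merge_identifier({"chroma_format": "4:2:0", "bit_depth": 10},
                     ("chroma_format", "bit_depth"))
assert a == b
```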
For example, the merge identifier type indicates the suitable merge method with respect to the merge identifier, i.e., different levels of mergeability corresponding to the syntax elements incorporated in the hash set:
Type 0 – merge identifier for merging by parameter set rewriting
Type 1 – merge identifier for merging by parameter set and slice header rewriting
Type 2 – merge identifier for merging by parameter set, slice header and slice payload rewriting
For example, given two input sub-bitstreams to be merged in the context of the present application, a device can compare the values of the merge identifier and merge identifier type and conclude whether the method associated with that merge identifier type is applicable to both sub-bitstreams.
The table below gives the mapping between the syntax element categories and the associated merging methods.
| Syntax category | Merging method |
| --- | --- |
| A | 0 |
| B | 1 |
| C.a | 2 |
| C.b, D | Full transcoding |
As mentioned above, the syntax categories are by no means exhaustive, but it is clear that a wide variety of parameters affect the mergeability of sub-bitstreams into a common bitstream in different ways, and that tracking and analysing these parameters is cumbersome. In addition, the syntax categories and the merging methods (types) do not correspond exactly; for example, some parameters fall into category B but are not required for merging method 1.
Merge identifier values according to two or more of the above identifier type values are generated and written into the bitstream to allow a device to easily identify the applicable merging methods. Merging method (type) 0 corresponds to the first value of the merge identifier type, merging method (type) 1 to the second value, and merging method (type) 2 to the third value.
The following exemplary syntax elements should be incorporated into the hash set.
- Temporal motion-constrained tile set SEI message, indicating motion constraints at tile and picture boundaries (merging methods 0, 1, 2; syntax category C.b)
- Structure of pictures information SEI message, which defines the GOP structure, i.e., mapping of coding order to display order, random access point indication, temporal layering, reference structure (merging methods 0, 1, 2; syntax category C.b)
- Parameter set syntax element values:
  - Reference picture set (merging method 0; syntax category B)
  - Chroma format (merging methods 0, 1, 2; syntax category D)
  - Base QP, chroma QP offset (merging method 0; syntax category B)
  - Bit depth luma/chroma (merging methods 0, 1, 2; syntax category D)
  - HRD parameters:
    - Initial arrival delay (merging method 0; syntax category B)
    - Initial removal delay (merging method 0; syntax category B)
  - Coding tools:
    - Coding block structure (maximum/minimum block size, inferred partitioning) (merging methods 0, 1; syntax category C.a)
    - Transform size (min/max) (merging methods 0, 1; syntax category C.a)
    - PCM block usage (merging methods 0, 1; syntax category C.a)
    - Advanced motion vector prediction (merging methods 0, 1, 2; syntax category C.b)
    - Sample adaptive offset (merging methods 0, 1, 2; syntax category C.b)
    - Temporal motion vector prediction (merging methods 0, 1, 2; syntax category C.b)
    - Intra smoothing (merging methods 0, 1; syntax category C.a)
    - Dependent slices (merging method 0; syntax category A)
    - Sign hiding (merging methods 0, 1; syntax category C.a)
    - Weighted prediction (merging method 0; syntax category A)
    - Transform quantization (transquant) bypass (merging methods 0, 1; syntax category C.a)
    - Entropy coding synchronization (merging methods 0, 1; syntax category C.a)
    - Loop filter (merging methods 0, 1, 2; syntax category C.b)
- Slice header values:
  - Parameter set ID (merging method 0; syntax category C.a)
  - Reference picture set (merging methods 0, 1; syntax category B)
- Use of implicit CTU address signaling (cp. European patent application no. EP 18153516) (merging method 0; syntax category A)
That is, for the first value of the merge identifier type, i.e., type 0, the following syntax elements (parameters) should be incorporated into the hash set: motion constraints at tile and picture boundaries, GOP structure, reference picture set, chroma format, basic quantization parameters and chroma quantization parameters, bit depth luma/chroma, hypothetical reference decoder parameters including the initial arrival delay and the initial removal delay, coding block structure, transform minimum and/or maximum sizes, pulse code modulation block usage, advanced motion vector prediction, sample adaptive offset, temporal motion vector prediction, intra smoothing, dependent slices, sign hiding, weighted prediction, transform quantization bypass, entropy coding synchronization, loop filter, slice header values including the parameter set ID, slice header values including the reference picture set, and implicit coding tree unit (CTU) address signaling.
For the second value of the merge identifier type, i.e., type 1, the hash set comprises the syntax elements (parameters): motion constraints at tile and picture boundaries, GOP structure, chroma format, bit depth luma/chroma, coding block structure, transform minimum and/or maximum sizes, pulse code modulation block usage, advanced motion vector prediction, sample adaptive offset, temporal motion vector prediction, intra smoothing, sign hiding, transform quantization bypass, entropy coding synchronization, loop filter, and slice header values including the reference picture set.
For the third value of the merge identifier type, i.e., type 2, the hash set comprises the syntax elements (parameters): motion constraints at tile and picture boundaries, GOP structure, chroma format, bit depth luma/chroma, advanced motion vector prediction, sample adaptive offset, temporal motion vector prediction, and loop filter.
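The nesting of the three hash sets can be summarized as follows (a sketch with abbreviated, partly hypothetical element names): the type-2 set is a proper subset of the type-1 set, which in turn is a proper subset of the type-0 set, mirroring the subset relations stated for the merge identifiers above.

```python
# Hedged sketch of the hash set nesting; the names abbreviate the elements listed above.
HASH_SET_TYPE2 = {
    "mcts_constraints", "gop_structure", "chroma_format", "bit_depth_luma_chroma",
    "advanced_mvp", "sample_adaptive_offset", "temporal_mvp", "loop_filter",
}
HASH_SET_TYPE1 = HASH_SET_TYPE2 | {
    "coding_block_structure", "transform_sizes", "pcm_block_usage", "intra_smoothing",
    "sign_hiding", "transquant_bypass", "entropy_coding_sync", "sh_reference_picture_set",
}
HASH_SET_TYPE0 = HASH_SET_TYPE1 | {
    "sps_reference_picture_set", "base_qp_chroma_qp_offset", "hrd_initial_arrival_delay",
    "hrd_initial_removal_delay", "dependent_slices", "weighted_prediction",
    "sh_parameter_set_id", "implicit_ctu_address_signaling",
}
assert HASH_SET_TYPE2 < HASH_SET_TYPE1 < HASH_SET_TYPE0  # proper subset chain
```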
Fig. 4 shows a schematic diagram illustrating an example of encoding parameters according to an embodiment of the application. In fig. 4, reference numeral 40 denotes type 0, and syntax elements belonging to type 0 are indicated by dotted lines. Reference numeral 42 denotes type 1, and syntax elements belonging to type 1 are indicated by solid lines. Reference numeral 44 denotes type 2, and syntax elements belonging to type 2 are indicated by dashed lines.
Fig. 5a and 5b are examples of a Sequence Parameter Set (SPS) 20, the syntax elements required for type 0 being indicated by reference numeral 40. In the same manner, the syntax elements required for type 1 are indicated by reference numeral 42, and the syntax elements required for type 2 are indicated by reference numeral 44.
Fig. 6a and 6b are examples of Picture Parameter Sets (PPS) 22, the syntax elements required for type 0 being indicated by reference numeral 40, and the syntax elements required for type 1 being indicated by reference numeral 42.
Fig. 7a to 7d are examples of the slice header 30; only one syntax element of the slice header is necessary for type 0, as indicated by reference numeral 40 in fig. 7c.
Fig. 8 is an example of a structure of pictures (SOP) SEI message 26a; all syntax elements belonging to the SOP are required for type 2.
Fig. 9 is an example of a motion-constrained tile set (MCTS) SEI message 26b; all syntax elements belonging to the MCTS are required for type 2.
As described above, the merge identifier provision 6a of fig. 3 applies a hash function at the hash components 16a and 16b to generate the merge identifier values. Where two or more merge identifier values are generated using a hash function, the hashes are chained in the following manner: for a second merge identifier whose hash set contains additional elements relative to a first merge identifier, the input to the hash function is formed by concatenating the first identifier value (the hash result) with the additional syntax element values, instead of re-concatenating the shared syntax element values themselves.
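A minimal sketch of this chaining, assuming purely for illustration SHA-256 as the hash function and string-serialized syntax element values (the text above fixes neither):

```python
import hashlib

def merge_identifier(element_values, chained_id=b""):
    """Hash the concatenation of an (optional) previously computed merge
    identifier value and the additional syntax element values."""
    h = hashlib.sha256()
    h.update(chained_id)              # hash result stands in for shared elements
    for value in element_values:
        h.update(str(value).encode())  # concatenate the element values
    return h.digest()

# The type-2 identifier is hashed from its own elements; identifiers with
# larger hash sets chain the previous hash result (illustrative values).
id_type2 = merge_identifier(["4:2:0", "ra_gop8", True])
id_type1 = merge_identifier(["ctb64", "sign_hiding_on"], chained_id=id_type2)
id_type0 = merge_identifier(["pps_id=0", "rps_idx=3"], chained_id=id_type1)
```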
In addition, the presence of the merge identifier also provides the following guarantee: the syntax elements incorporated into the hash set have the same value in all access units (AUs) of the coded video sequence (CVS) and/or the bitstream. This guarantee may take the form of constraint flags in the profile/level syntax of the parameter sets.
Considering another embodiment according to the present invention, the merging process is described below with reference to figs. 10 to 13.
Fig. 10 illustrates a video combiner for providing a combined video stream based on a plurality of encoded video representations. The video combiner 50 comprises a receiver 52, a combining method identifier 54 and a combining processor 56, the receiver 52 receiving the input video streams 12 (12a and 12b shown in fig. 12). The combined video stream 60 is sent to a decoder. Where the video combiner 50 is included in the decoder, the combined video stream is sent to user equipment or any other device for displaying the combined video stream.
The merging process is driven by the merge identifier and the merge identifier type described above. The merging process may only need to generate a parameter set plus an interleaving of NAL units, which is the lowest-complexity form of merging and is associated with a merge identifier type value of 0, i.e. the first complexity. Fig. 12 shows an example of the first-complexity merging method. As shown in fig. 12, the parameter sets are merged: SPS1 of video stream 12a and SPS2 of video stream 12b are combined (a merged SPS is generated based on SPS1 and SPS2), and PPS1 of video stream 12a and PPS2 of video stream 12b are combined (a merged PPS is generated based on PPS1 and PPS2). The IDR is optional data, and its description is therefore omitted. In addition, slices 1,1 and 1,2 of video stream 12a and slices 2,1 and 2,2 of video stream 12b are interleaved, as shown in the merged video stream 60. In figs. 10 and 12, two video streams 12a and 12b are input as an example; however, more video streams may be input and merged in the same manner.
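A sketch of this first-complexity merge under the behavior just described; the SubStream container and its fields are hypothetical stand-ins, not an actual codec API, and joint parameter set generation is reduced to copying the first stream's sets (the picture size update is discussed below):

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class SubStream:                      # hypothetical container, not a codec API
    sps: dict
    pps: dict
    slices_per_au: List[List[bytes]]  # slice NAL units grouped by access unit

def merge_type0(streams: List[SubStream]) -> List[Tuple[str, object]]:
    # Generate joint parameter sets, e.g. starting from the first stream's
    # sets; the picture size must then be updated as described below.
    joint_sps, joint_pps = dict(streams[0].sps), dict(streams[0].pps)
    out: List[Tuple[str, object]] = [("SPS", joint_sps), ("PPS", joint_pps)]
    # Interleave the slice NAL units access unit by access unit, untouched.
    for au in zip(*(s.slices_per_au for s in streams)):
        for slices in au:
            out.extend(("SLICE", nal) for nal in slices)
    return out
```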
When needed, the merging process may also include rewriting the slice headers in the bitstream during NAL unit interleaving, which is associated with a merge identifier type value of 1 (i.e. the second complexity). Finally, there are cases in which syntax elements in the slice payload need to be adjusted, which is associated with a merge identifier type value of 2 (i.e. the third complexity) and requires entropy decoding and encoding during NAL unit interleaving. The merge identifier and merge identifier type drive the decision on which of the merging processes to perform and on its details.
The input to the merging process is a list of input sub-bitstreams, which also represents their spatial arrangement. The output of the process is a merged bitstream. In general, in all bitstream merging processes, a parameter set for the new output bitstream needs to be generated, which may be based on the parameter sets of the input sub-bitstreams (e.g., of the first input sub-bitstream). The necessary updates to the parameter set include the picture size: for example, the picture size of the output bitstream is calculated as the sum of the picture sizes of the input bitstreams in the respective dimensions, given their spatial arrangement.
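For instance, for a hypothetical side-by-side arrangement of two 960x1080 sub-pictures:

```python
widths, heights = [960, 960], [1080, 1080]  # two sub-bitstreams, side by side
out_width = sum(widths)    # 1920: summed along the dimension of the arrangement
out_height = max(heights)  # 1080: unchanged in the other dimension
```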
The bitstream merging process requires that all input sub-bitstreams carry the same value for at least one instance of the merge identifier and merge identifier type. In one embodiment, the merging process associated with the minimum value of the merge identifier type for which all sub-bitstreams carry the same merge identifier value is performed.
For example, a difference between the merge identifier values at one merge identifier type value, combined with a match at another, is used in the merging process to determine its details. For example, as shown in fig. 11, when a first merge identifier 70 having a merge identifier type value equal to 0 does not match between the two input sub-bitstreams (values 70a and 70b), but a second merge identifier 80 having a merge identifier type value equal to 1 does match (values 80a and 80b), the difference in the first merge identifier 70 indicates that the (slice-header-related) syntax elements need to be adjusted in all slices.
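A minimal sketch of this selection rule; the merge_ids mapping per stream is a hypothetical stand-in for the signaled identifier values:

```python
def pick_merge_method(streams) -> int:
    """Return the lowest merge identifier type value (0, 1, 2) for which all
    sub-bitstreams carry the same merge identifier value; 3 denotes the
    fallback of full pixel decoding and re-encoding described below."""
    for id_type in (0, 1, 2):                            # ascending complexity
        if len({s.merge_ids[id_type] for s in streams}) == 1:
            return id_type
    return 3
```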
Fig. 13 shows a video combiner 50 (50a) comprising a receiver (not shown), a combining method identifier 54 and a combining processor 56, the combining method identifier comprising a merge identifier comparator 54a and an encoding parameter comparator 54b. Where the input video streams 12 include hash values as merge identifier values, the values of the input video streams are compared at the merge identifier comparator 54a. For example, when both input video streams have the same merge identifier value, the respective encoding parameters of each input video stream are compared at the encoding parameter comparator 54b. A merging method is decided based on the encoding parameter comparison result, and the combining processor 56 merges the input video streams 12 using the decided merging method. In case the merge identifier value (hash value) also indicates the merging method, no individual encoding parameter comparison is required.
Three merging methods, the first-complexity, second-complexity and third-complexity merging methods, are explained above. The fourth merging method merges the video streams using full pixel decoding and pixel re-encoding, and is applied if none of the other three merging methods is applicable.
The process for identifying the merge result level is described below with reference to figs. 14 and 15, according to another embodiment of the invention. Identifying the merge result level means putting information into each sub-bitstream that indicates how much that sub-bitstream contributes to the level restrictions of a merged bitstream comprising it.
Fig. 14 shows an encoder 2 (2b) comprising an encoder core with an encoding parameter determination 14, a merge identifier provision (not shown), and a granularity capability provider 8.
In general, when sub-bitstreams are to be merged into a joint bitstream, an indication of how much each sub-bitstream contributes to the level-specific restrictions of the codec system, to which the potential merged bitstream has to adhere, is crucial to ensure that a legal joint bitstream is created. Although the traditional codec level granularity is quite coarse (e.g. distinguishing main resolutions such as 720p, 1080p or 4K), indicating the contributions to the merged level limits requires a finer granularity: the granularity of the conventional level indication is not sufficient to represent the contribution of an individual sub-bitstream to the merged bitstream. Given that the number of tiles to be merged is not known in advance, a reasonable trade-off between flexibility and bit rate overhead has to be found, but in general the required granularity far exceeds that of the traditional level limits. One example use case is 360-degree video streaming, where the service provider needs the freedom to choose between different tiling structures (e.g. 12, 24 or 96 tiles per 360-degree video), and where each tile stream contributes 1/12, 1/24 or 1/96 of the overall level constraint (e.g. 8K, assuming an equal rate distribution). Going even further, if the rate distribution between tiles is non-uniform (e.g. to achieve constant quality across the video plane), arbitrarily fine granularity may be required.
Such signaling may, for example, take the form of a ratio and/or percentage of the constraints of a signaled level. For example, in a conference scenario with four participants, each participant sends a legal level-3 bitstream that also includes an indication (the level information included in the coarse granularity capability information) stating that the sent bitstream complies with 1/3, i.e. 33%, of the level-5 constraints (this information may be included in the fine granularity capability information). The receiver of several such streams, i.e. the video combiner 50 (50b) shown in fig. 15, thus knows that the three received bitstreams can be merged into a single joint bitstream conforming to level 5.
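The arithmetic behind this example, as a sketch:

```python
# Conference example from above: three received bitstreams, each signalling
# that it consumes 1/3 (33%) of the level-5 limits.
fractions = [1/3, 1/3, 1/3]
assert sum(fractions) <= 1.0  # the joint bitstream still conforms to level 5
```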
In fig. 15, a video combiner is shown comprising a receiver 52 and a combining processor 56, and a plurality of video streams are input into the video combiner. The video combiner may also be included in a video decoder (not shown in the figure); in that case the receiver may operate as a decoder. The decoder receives a bitstream comprising a plurality of sub-bitstreams, a plurality of encoding parameters, coarse granularity capability information, and fine granularity capability information. The coarse granularity capability information and the fine granularity capability information are carried in SEI messages; the decoder parses the received information and interprets the level and fraction as the level limit available to it. The decoder then checks whether the bitstream it encounters actually complies with the indicated limits. The result of this check is provided to the video combiner or the combining processor. Thus, as previously described, the video combiner can determine that multiple sub-bitstreams can be merged into a single joint bitstream.
The granularity capability information may carry the indication of the ratio and/or percentage as a vector of values, each dimension relating to a different aspect of the codec level constraints, e.g. the maximum number of allowed luma samples per second, the maximum picture size, the bit rate, the buffer fullness, the number of tiles, and so on. In addition, the ratio and/or percentage refers to a general codec level of the video bitstream.
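A sketch of how a combiner might check such fraction vectors, with hypothetical dimension names and numbers:

```python
# Hypothetical per-dimension fraction vectors; the dimension names mirror the
# constraint aspects listed above.
DIMENSIONS = ("luma_samples_per_sec", "max_picture_size", "bit_rate",
              "buffer_fullness", "num_tiles")

def mergeable(per_stream_fractions) -> bool:
    """True if the summed fractions stay within the referenced level limit
    (<= 100%) in every constraint dimension."""
    return all(sum(dim) <= 1.0 for dim in zip(*per_stream_fractions))

print(mergeable([(0.3, 0.25, 0.3, 0.2, 0.25)] * 3))  # True: fits the level
```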
Further embodiments and aspects of the invention will be described hereinafter, which may be used alone or in combination with any of the features, functions, and details described herein.
A first aspect relates to a video encoder for providing an encoded video representation, wherein the video encoder is configured to provide a video stream comprising encoding parameter information describing a plurality of encoding parameters, encoded video content information, and one or more merge identifiers indicating whether and/or how the encoded video representation can be merged with another encoded video representation.
According to a second aspect referring to the first aspect, in the video encoder, it is determined which complexity to use based on a parameter value defined by the encoding parameter information.
According to a third aspect referring to the first or second aspect, in the video encoder, the merge identifier is a concatenation of a plurality of coding parameters.
According to a fourth aspect referring to any one of the first to third aspects, in the video encoder, the merge identifier is a hash value of a concatenation of a plurality of encoding parameters.
According to a fifth aspect referring to any one of the first to fourth aspects, in the video encoder, the merge identifier indicates a merge identifier type representing a complexity of a merge process.
According to a sixth aspect with reference to the fifth aspect, in the video encoder, the value of the merge identifier type indicates a merge process, wherein the video encoder is configured to switch between at least two of the following values of the merge identifier type: a first value of a merge identifier type representing a merge procedure overwritten by a parameter set; a second value of the merge identifier type representing a merge procedure by parameter set and stripe header overwrite; and a third value of the merge identifier type, representing a merge procedure through parameter set, stripe header, and stripe payload overwrite.
According to a seventh aspect referring to any one of the first to sixth aspects, in the video encoder, a plurality of merging identifiers are associated with different complexities of the merging process.
According to an eighth aspect referring to any one of the first to seventh aspects, in the video encoder, the encoder is configured to check whether the encoding parameters evaluated for providing the merge identifier are the same in all units of the video sequence, and to provide the merge identifier in accordance with the check.
According to a ninth aspect with reference to any one of the first to eighth aspects, in the video encoder the plurality of encoding parameters comprises merging related parameters, which merging related parameters have to be the same in different video streams to allow merging to have a lower complexity than merging by full pixel decoding, and the video encoder is configured to provide the one or more merging identifiers based on the merging related parameters.
According to a tenth aspect with reference to the ninth aspect, in the video encoder, the merge related parameter comprises one or more or all of the following parameters: parameters describing motion constraints at tile boundaries, information about group of pictures (GOP) structure, parameters describing chroma coding format and parameters describing luma coding format, parameters describing advanced motion vector prediction, parameters describing sample adaptive offset, parameters describing temporal motion vector prediction, and parameters describing loop filters.
According to an eleventh aspect referring to any one of the first to seventh aspects, in the video encoder, a merge identifier associated with a first complexity of the merging process is determined based on a first set of coding parameters, and a merge identifier associated with a second complexity of the merging process, the second complexity being higher than the first complexity, is determined based on a second set of coding parameters, the second set of coding parameters being a proper subset of the first set of coding parameters.
According to a twelfth aspect with reference to the eleventh aspect, in the video encoder, a merge identifier associated with a third complexity of a merge process is determined based on a third set of coding parameters, the third complexity being higher than the second complexity, the third set of coding parameters being a proper subset of the second set of parameters.
According to a thirteenth aspect with reference to the eleventh or twelfth aspect, the video encoder is configured to determine the merge identifier associated with the first complexity of the merging process based on a set of coding parameters that must be equal in two different video streams to allow merging of the video streams in the following way: only the parameter sets applicable to multiple slices are modified, while the slice headers and slice payloads are kept unchanged.
According to a fourteenth aspect with reference to the eleventh or twelfth aspect, the video encoder is configured to determine the merge identifier associated with the first complexity based on one or more or all of the following parameters:
parameters indicating motion constraints at tile and picture boundaries,
parameters describing the GOP structure,
parameters describing the reference picture set,
parameters describing the chroma format,
parameters describing the base quantization parameter and the chroma quantization parameters,
parameters describing the bit depth luma/chroma,
parameters describing hypothetical reference decoder parameters, including parameters related to the initial arrival delay and parameters related to the initial removal delay,
parameters describing the coding block structure,
parameters describing the transform minimum and/or maximum size,
parameters describing the use of pulse code modulation blocks,
parameters describing advanced motion vector prediction,
parameters describing sample adaptive offset,
parameters describing temporal motion vector prediction,
parameters describing intra smoothing,
parameters describing dependent slices,
parameters describing sign hiding,
parameters describing weighted prediction,
parameters describing transform quantization bypass,
parameters describing entropy coding synchronization,
parameters describing the loop filter,
parameters describing slice header values including the parameter set ID,
parameters describing slice header values including the reference picture set, and
parameters describing the use of implicitly coded CTU address signaling.
According to a fifteenth aspect referring to any one of the eleventh to fourteenth aspects, the video encoder is configured to determine the merge identifier associated with the second complexity of the merging process based on a set of coding parameters that must be equal in two different video streams to allow merging of the video streams in the following way: the parameter sets applicable to multiple slices are modified and the slice headers are also modified, while the slice payloads are kept unchanged.
According to a sixteenth aspect with reference to any one of the eleventh to fifteenth aspects, the video encoder is configured to determine the merge identifier associated with the second complexity based on one or more or all of the following parameters:
parameters indicating motion constraints at tile and picture boundaries,
parameters describing the GOP structure,
parameters describing the chroma format,
parameters describing the bit depth luma/chroma,
parameters describing the coding block structure,
parameters describing the transform minimum and/or maximum size,
parameters describing the use of pulse code modulation blocks,
parameters describing advanced motion vector prediction,
parameters describing sample adaptive offset,
parameters describing temporal motion vector prediction,
parameters describing intra smoothing,
parameters describing sign hiding,
parameters describing transform quantization bypass,
parameters describing entropy coding synchronization,
parameters describing the loop filter, and
parameters describing slice header values including the reference picture set.
According to a seventeenth aspect referring to any one of the eleventh to sixteenth aspects, the video encoder is configured to determine the merge identifier associated with the third complexity of the merging process based on a set of coding parameters that must be equal in two different video streams to allow merging of the video streams in the following way: the parameter sets applicable to multiple slices are modified and the slice headers and slice payloads are also modified, but full pixel decoding and pixel re-encoding are not performed.
According to an eighteenth aspect referring to any one of the eleventh to seventeenth aspects, the video encoder is configured to determine a merge identifier associated with the third complexity based on one or more or all of the following parameters:
parameters indicating motion constraints at tile and picture boundaries,
parameters describing the GOP structure,
parameters describing the chroma format,
parameters describing the bit depth luma/chroma,
parameters describing advanced motion vector prediction,
parameters describing sample adaptive offset,
parameters describing temporal motion vector prediction, and
parameters describing the loop filter.
According to a nineteenth aspect referring to any one of the first to eighteenth aspects, the video encoder is configured to: applying a hash function to a concatenation of a second merge identifier and one or more encoding parameters not considered in determining the second merge identifier, in order to obtain a first merge identifier associated with a first complexity of a merge process, wherein the second merge identifier is associated with a second complexity of the merge process, the first complexity being lower than the second complexity.
According to a twentieth aspect referring to any one of the first to nineteenth aspects, the video encoder is configured to: applying a hash function to a concatenation of a third merge identifier and one or more coding parameters not considered in determining the third merge identifier, in order to obtain a second merge identifier associated with a second complexity of the merge process, wherein the third merge identifier is associated with a third complexity of the merge process, the second complexity being lower than the third complexity.
A twenty-first aspect relates to a video combiner for providing a combined video representation based on a plurality of encoded video representations, wherein the video combiner is configured to receive a plurality of video streams comprising encoding parameter information describing a plurality of encoding parameters, encoded video content information, and one or more merge identifiers indicating whether and/or how an encoded video representation can be merged with another encoded video representation; wherein the video combiner is configured to determine which merging method to use based on the one or more merge identifiers.
According to a twenty-second aspect referring to the twenty-first aspect, the video combiner is configured to select a combining method from a plurality of combining methods according to the combining identifier.
According to a twenty-third aspect referring to the twenty-second aspect, the video combiner is configured to select between at least two of the following merging methods according to the one or more merge identifiers: a first merging method, in which the video streams are merged by modifying only the parameter sets applicable to multiple slices while leaving the slice headers and slice payloads unchanged; a second merging method, in which the video streams are merged by modifying the parameter sets applicable to multiple slices and also modifying the slice headers, while keeping the slice payloads unchanged; and a third merging method, in which the video streams are merged by modifying the parameter sets applicable to multiple slices and also modifying the slice headers and slice payloads, but without performing full pixel decoding and pixel re-encoding.
According to a twenty-fourth aspect referring to the twenty-third aspect, the video combiner is configured to: a fourth merging method is selectively used, based on the one or more merging identifiers, that is merging of video streams using full-pixel decoding and pixel re-encoding.
According to a twenty-fifth aspect referring to any one of the twenty-second to twenty-fourth aspects, the video combiner is configured to: the merging identifiers of two or more video streams associated with the same given merging method are compared to decide whether to perform merging using the given merging method based on the result of the comparison.
According to a twenty-sixth aspect referring to the twenty-fifth aspect, the video combiner is configured to: if the comparison indicates that the merge identifiers of the two or more video streams associated with the given merge method are equal, then selectively performing merging using the given merge method.
According to a twenty-seventh aspect referring to the twenty-fifth aspect, the video combiner is configured to: if the comparison indicates that the merge identifiers of the two or more video streams associated with the given merge method are different, then a merge method of higher complexity than the given merge method is used.
According to a twenty-eighth aspect referring to the twenty-seventh aspect, the video combiner is configured to: in the event that the comparison indicates that the merge identifiers of the two or more video streams associated with the given merging method are equal, selectively compare the encoding parameters that must be equal among the two or more video streams to allow merging of the video streams using the given merging method; wherein the video combiner is configured to selectively perform merging using the given merging method if the comparison of the encoding parameters indicates that the encoding parameters are equal; and wherein the video combiner is configured to perform merging using a merging method of higher complexity than the given merging method if the comparison of the encoding parameters indicates that the encoding parameters include differences.
According to a twenty-ninth aspect referring to any one of the twenty-first to twenty-eighth aspects, the video combiner is configured to compare combining identifiers associated with combining methods having different complexity, and wherein the video combiner is configured to combine two or more video streams using a combining method of lowest complexity for which the associated combining identifiers are equal in the two or more video streams to be combined.
According to a thirtieth aspect with reference to any one of the twenty-second to twenty-ninth aspects, the video combiner is configured to compare merge identifiers associated with merging methods having different complexities, and to identify the merging method of lowest complexity for which the associated merge identifiers are equal in the two or more video streams to be merged; wherein the video combiner is configured to compare the sets of coding parameters that must be equal in the two or more video streams to be merged to allow merging using the identified merging method; and wherein the video combiner is configured to merge the two or more video streams using the identified merging method if the comparison indicates that the encoding parameters of the encoding parameter set associated with the identified merging method are equal in the video streams to be merged.
According to a thirty-first aspect referring to any one of the twenty-first to thirtieth aspects, the video combiner is configured to determine which coding parameters should be modified during the merging process based on one or more differences between the merge identifiers of the different video streams to be merged.
According to a thirty-second aspect referring to the thirty-first aspect, the video combiner is configured to determine which coding parameters should be modified in a merging method having a given complexity based on one or more differences between the merge identifiers, associated with merging methods of lower complexity than the given complexity, of the different video streams to be merged.
According to a thirty-third aspect referring to any one of the twenty-first to thirty-second aspects, the video combiner is configured to obtain joint coding parameters associated with slices of all video streams to be combined based on coding parameters of the video streams to be combined, and to include the joint coding parameters in the combined video stream.
According to a thirty-fourth aspect referring to the thirty-third aspect, the video combiner is configured to adapt coding parameters associated with respective video slices to obtain modified slices to be included in the combined video stream.
According to a thirty-fifth aspect referring to the thirty-fourth aspect, in the video combiner, the adapted encoding parameters comprise parameters representing a picture size of the merged encoded video representation, wherein the picture size is calculated based on the picture sizes of the encoded video representations to be merged.
A thirty-sixth aspect relates to a method for providing an encoded video representation, comprising: providing a video stream that comprises encoding parameter information describing a plurality of encoding parameters, encoded video content information, and one or more merge identifiers indicating whether and/or how the encoded video representation can be merged with another encoded video representation.
A thirty-seventh aspect relates to a method for providing a combined video representation based on a plurality of encoded video representations, comprising: receiving a plurality of video streams comprising encoding parameter information describing a plurality of encoding parameters, encoded video content information, and one or more merge identifiers indicating whether and/or how an encoded video representation can be merged with another encoded video representation; and selecting a merging method from a plurality of merging methods according to the merge identifiers.
A thirty-eighth aspect relates to a merging method for merging two or more video streams, comprising: providing common encoding parameter information based on the encoding parameter information of the different video streams while leaving the encoded video content information unchanged; selecting a merging process based on the unchanged encoded video content information; and merging the two or more video streams using the selected merging process.
A thirty-ninth aspect relates to a computer program having a program code for performing any of the methods according to the thirty-sixth to thirty-eighth aspects when run on a computer.
A fortieth aspect relates to a data stream generated by any one of the methods according to the thirty-sixth to thirty-eighth aspects.
Although some aspects have been described in the context of an apparatus or system, it will be clear that these aspects also represent descriptions of corresponding methods in which a block or device corresponds to a method step or a feature of a method step. Similarly, aspects described in the context of method steps also represent descriptions of features of corresponding blocks or items or corresponding devices and/or systems. Some or all of the method steps may be performed by (or using) hardware devices, such as microprocessors, programmable computers or electronic circuits. In some embodiments, one or more of the most important method steps may be performed by such an apparatus.
The data streams of the present invention may be stored on a digital storage medium or may be transmitted on a transmission medium such as a wireless transmission medium or a wired transmission medium (e.g., the internet).
Embodiments of the invention may be implemented in hardware or in software, depending on certain implementation requirements. The implementation may be performed using a digital storage medium (e.g., a floppy disk, a DVD, a Blu-ray disc, a CD, a ROM, a PROM, an EPROM, an EEPROM or a flash memory) having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Thus, the digital storage medium may be computer readable.
Some embodiments according to the invention comprise a data carrier with electronically readable control signals, which are capable of cooperating with a programmable computer system in order to perform one of the methods described herein.
In general, embodiments of the invention may be implemented as a computer program product having a program code operable to perform one of the methods when the computer program product is run on a computer. The program code may for example be stored on a machine readable carrier.
Other embodiments include a computer program stored on a machine-readable carrier for performing one of the methods described herein.
In other words, an embodiment of the inventive method is thus a computer program with a program code for performing one of the methods described herein when the computer program runs on a computer.
Thus, another embodiment of the inventive method is a data carrier (or digital storage medium or computer readable medium) having a computer program recorded thereon for performing one of the methods described herein. The data carrier, digital storage medium or recording medium is typically tangible and/or non-transitory.
Thus, another embodiment of the inventive method is a data stream or signal sequence representing a computer program for performing one of the methods described herein. The data stream or signal sequence may, for example, be configured to be transmitted via a data communication connection (e.g., via the internet).
Another embodiment includes a processing device, such as a computer or programmable logic device, configured or adapted to perform one of the methods described herein.
Another embodiment includes a computer having a computer program installed thereon for performing one of the methods described herein.
Another embodiment according to the invention comprises an apparatus or system configured to transmit a computer program to a receiver (e.g., electronically or optically), the computer program for performing one of the methods described herein. The receiver may be, for example, a computer, mobile device, storage device, etc. The apparatus or system may for example comprise a file server for transmitting the computer program to the receiver.
In some embodiments, a programmable logic device (e.g., a field programmable gate array) may be used to perform some or all of the functions of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor to perform one of the methods described herein. In general, the method is preferably performed by any hardware device.
The apparatus described herein may be implemented using hardware means, or using a computer, or using a combination of hardware means and a computer.
The apparatus described herein or any component of the apparatus described herein may be implemented at least in part in hardware and/or software.
The methods described herein may be performed using hardware devices, or using a computer, or using a combination of hardware devices and computers.
Any of the components of the methods described herein or the apparatus described herein may be performed, at least in part, by hardware and/or by software.
The above-described embodiments are merely illustrative of the principles of the present invention. It should be understood that modifications and variations of the arrangements and details described herein will be apparent to those skilled in the art. It is therefore intended that the invention be limited only by the scope of the appended patent claims and not by the specific details presented by way of description and explanation of the embodiments herein.
Claims (24)
1. A video encoder for providing an encoded video representation, comprising:
A processor;
A memory storing a computer program that, when executed by the processor, causes the video encoder to:
provide coarse granularity capability requirement information describing: compatibility of video streams with video decoders having a capability level of a plurality of predetermined capability levels, and
provide fine granularity capability requirement information describing: which fraction of the allowable capability requirements associated with one of the predetermined capability levels is needed to decode the encoded video representation, to enable a video combiner to combine two or more video streams according to the coarse granularity capability requirement information and the fine granularity capability requirement information.
2. The video encoder of claim 1, wherein the computer program, when executed by the processor, further causes the video encoder to: provide the fine granularity capability requirement information such that the fine granularity capability requirement information includes a ratio value or a percentage value.
3. The video encoder of claim 1, wherein the computer program, when executed by the processor, further causes the video encoder to: provide the fine granularity capability requirement information such that the fine granularity capability requirement information includes a ratio value or a percentage value that references a predetermined capability level described by the coarse granularity capability requirement information.
4. The video encoder of claim 1, wherein the computer program, when executed by the processor, further causes the video encoder to: provide the fine granularity capability requirement information such that the fine granularity capability requirement information includes reference information and fraction information, and
wherein the reference information describes which of the predetermined capability levels the fraction information references.
5. The video encoder of claim 1, wherein the computer program, when executed by the processor, further causes the video encoder to: provide the fine granularity capability requirement information such that the fine granularity capability requirement information describes capability requirements of the encoded video representation in terms of capability requirements with respect to a plurality of criteria.
6. The video encoder of claim 1, wherein the computer program, when executed by the processor, further causes the video encoder to: provide the fine granularity capability requirement information such that the fine granularity capability requirement information includes a plurality of fraction values associated with different criteria.
7. The video encoder of claim 1, wherein the computer program, when executed by the processor, further causes the video encoder to: provide the fine granularity capability requirement information such that the fine granularity capability requirement information includes one or more fraction values describing one or more of the following criteria:
-a fraction of the maximum number of allowed luminance samples per second;
-a fraction of the maximum image size;
-a fraction of the maximum bit rate;
-a fraction of buffer fullness; and
-A fraction of the maximum number of tiles.
8. The video encoder of claim 1, wherein the computer program, when executed by the processor, further causes the video encoder to: provide the fine granularity capability requirement information with such a resolution.
9. A video combiner for providing a combined video representation based on a plurality of encoded video representations, the video combiner comprising:
A processor;
a memory storing a computer program that, when executed by the processor, causes the video combiner to:
receive a plurality of video streams comprising encoding parameter information describing a plurality of encoding parameters, encoded video content information, coarse granularity capability requirement information describing compatibility of the video streams with a video decoder having a capability level of a plurality of predetermined capability levels, and fine granularity capability requirement information describing which fraction of an allowable capability requirement associated with one of the predetermined capability levels is required to decode an encoded video representation, and
merge two or more video streams according to the coarse granularity capability requirement information and the fine granularity capability requirement information.
10. The video combiner of claim 9, wherein the computer program, when executed by the processor, further causes the video combiner to: determine, based on the fine granularity capability requirement information, which video streams can be included in the combined video stream.
11. The video combiner of claim 9, wherein the computer program, when executed by the processor, further causes the video combiner to: decide, according to the fine granularity capability requirement information, whether a valid combined video stream can be obtained by combining two or more video streams.
12. The video combiner of claim 9, wherein the computer program, when executed by the processor, further causes the video combiner to: aggregate the fine granularity capability requirement information of the multiple video streams to be combined.
13. The video combiner of claim 9, wherein the fine granularity capability requirement information describes capability requirements of the encoded video representation in terms of capability requirements with respect to a plurality of criteria; and
Wherein the computer program, when executed by the processor, further causes the video combiner to: it is determined whether the combined capability requirement of the multiple video streams to be combined as described by the fine granularity capability requirement information is within predetermined limits for all criteria.
14. The video combiner of claim 9, wherein the fine granularity capability requirement information comprises a plurality of fraction values relating to different criteria.
15. The video combiner of claim 9, wherein the fine granularity capability requirement information describes one or more fraction values for one or more of the following criteria:
-a fraction of the maximum number of allowed luminance samples per second;
-a fraction of the maximum image size;
-a fraction of the maximum bit rate;
-a fraction of buffer fullness; and
-A fraction of the maximum number of tiles.
16. A video decoder, wherein the video decoder comprises the video combiner of claim 9.
17. A method of providing an encoded video representation, comprising:
Providing coarse granularity capability requirement information describing: compatibility of video streams with video decoders having a capability level of a plurality of predetermined capability levels, and
Providing fine granularity capability requirement information describing: which fraction of the allowable capability requirement associated with one of the predetermined capability levels is needed to decode the encoded video representation to enable a video combiner to combine two or more video streams according to the coarse granularity capability requirement information and the fine granularity capability requirement information.
18. A method for providing a combined video representation based on a plurality of encoded video representations, comprising:
Receiving a plurality of video streams comprising encoding parameter information describing a plurality of encoding parameters, encoded video content information, coarse granularity capability requirement information describing compatibility of the video streams with a video decoder having a capability level of a plurality of predetermined capability levels, and fine granularity capability requirement information describing which fraction of an allowable capability requirement associated with one of the predetermined capability levels is required to decode an encoded video representation, wherein two or more video streams are combined according to the coarse granularity capability requirement information and the fine granularity capability requirement information.
19. A computer readable storage medium storing a computer program for performing the method of claim 17 or 18 when run on a computer.
20. A video decoder for decoding a provided video representation, wherein the video decoder comprises:
A processor;
A memory storing a computer program that, when executed by the processor, causes the video decoder to: receive a video representation comprising a plurality of video streams, the plurality of video streams comprising encoding parameter information, encoded video content information, coarse granularity capability requirement information, and fine granularity capability requirement information, the encoding parameter information describing a plurality of encoding parameters, the coarse granularity capability requirement information describing compatibility of the video streams with a video decoder having a capability level of a plurality of predetermined capability levels, the fine granularity capability requirement information describing which fraction of a permissible capability requirement associated with one of the predetermined capability levels is required to decode the encoded video representation, and
determine whether the combined capability requirements described by the fine granularity capability requirement information of the plurality of video streams to be combined match a predetermined limit to be adhered to.
21. The video decoder of claim 20, wherein the computer program, when executed by the processor, further causes the video decoder to: parse the received coarse granularity capability requirement information and fine granularity capability requirement information to obtain an indication of the capability level and the fraction of the allowable capability requirement.
22. The video decoder of claim 20, wherein the fine granularity capability requirement information describes capability requirements of the encoded video representation in terms of capability requirements with respect to a plurality of criteria; and
wherein the computer program, when executed by the processor, further causes the video decoder to: determine that the combined capability requirements of the plurality of video streams to be combined match when the combined capability requirements of the plurality of video streams to be combined are within the predetermined limits with respect to all criteria.
23. The video decoder of claim 20, wherein the fine granularity capability requirement information includes a plurality of fraction values associated with different criteria.
24. The video decoder of claim 20, wherein the fine granularity capability requirement information includes one or more fraction values describing one or more of the following criteria:
-a fraction of the maximum number of allowed luminance samples per second;
-a fraction of the maximum image size;
-a fraction of the maximum bit rate;
-a fraction of buffer fullness; and
-A fraction of the maximum number of tiles.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410913430.6A CN118694980A (en) | 2018-09-13 | 2019-09-12 | Bit stream merging |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP18194348 | 2018-09-13 | ||
EP18194348.1 | 2018-09-13 | ||
PCT/EP2019/074436 WO2020053369A1 (en) | 2018-09-13 | 2019-09-12 | Bitstream merging |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410913430.6A Division CN118694980A (en) | 2018-09-13 | 2019-09-12 | Bit stream merging |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113039804A CN113039804A (en) | 2021-06-25 |
CN113039804B true CN113039804B (en) | 2024-07-09 |
Family
ID=63579240
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410913430.6A Pending CN118694980A (en) | 2018-09-13 | 2019-09-12 | Bit stream merging |
CN201980074638.0A Active CN113039804B (en) | 2018-09-13 | 2019-09-12 | Bit stream merging |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410913430.6A Pending CN118694980A (en) | 2018-09-13 | 2019-09-12 | Bit stream merging |
Country Status (8)
Country | Link |
---|---|
US (2) | US11800128B2 (en) |
EP (1) | EP3850851A1 (en) |
JP (4) | JP7179163B2 (en) |
KR (2) | KR102572947B1 (en) |
CN (2) | CN118694980A (en) |
BR (1) | BR112021004636A2 (en) |
MX (1) | MX2021002934A (en) |
WO (1) | WO2020053369A1 (en) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3850851A1 (en) * | 2018-09-13 | 2021-07-21 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Bitstream merging |
EP3861754A1 (en) * | 2018-10-02 | 2021-08-11 | Telefonaktiebolaget LM Ericsson (publ) | Picture tile attributes signaled using loop(s) over tiles |
US11381867B2 (en) | 2019-01-08 | 2022-07-05 | Qualcomm Incorporated | Multiple decoder interface for streamed media data |
GB2584295A (en) * | 2019-05-28 | 2020-12-02 | Canon Kk | Method and apparatus for encoding and decoding a video bitstream for merging regions of interest |
CN110248221A (en) * | 2019-06-18 | 2019-09-17 | 北京物资学院 | A kind of video ads dynamic insertion method and device |
US11792433B2 (en) * | 2020-09-28 | 2023-10-17 | Sharp Kabushiki Kaisha | Systems and methods for signaling profile and level information in video coding |
US20220337855A1 (en) * | 2021-04-20 | 2022-10-20 | Samsung Electronics Co., Ltd. | Operation of video decoding engine for evc |
WO2023203423A1 (en) * | 2022-04-20 | 2023-10-26 | Nokia Technologies Oy | Method and apparatus for encoding, decoding, or displaying picture-in-picture |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2012231295A (en) * | 2011-04-26 | 2012-11-22 | Canon Inc | Encoder, encoding method and program |
Family Cites Families (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20050012809A (en) * | 2002-06-18 | 2005-02-02 | 코닌클리케 필립스 일렉트로닉스 엔.브이. | Video encoding method and corresponding encoding and decoding devices |
BR122012013066A2 (en) * | 2007-04-18 | 2015-08-04 | Thomson Licensing | Multi-view video encoding device |
EP2751998A4 (en) * | 2011-08-30 | 2015-08-12 | Intel Corp | Multiview video coding schemes |
US20130188715A1 (en) * | 2012-01-09 | 2013-07-25 | Qualcomm Incorporated | Device and methods for merge list reordering in video coding |
US9838685B2 (en) * | 2012-06-15 | 2017-12-05 | Google Technology Holdings LLC | Method and apparatus for efficient slice header processing |
CN104685886B (en) * | 2012-06-29 | 2018-12-07 | 瑞典爱立信有限公司 | Devices and methods therefor for video processing |
WO2014050425A1 (en) * | 2012-09-28 | 2014-04-03 | シャープ株式会社 | Image decoding device |
US10284908B2 (en) * | 2013-02-26 | 2019-05-07 | Comcast Cable Communications, Llc | Providing multiple data transmissions |
GB2516824A (en) * | 2013-07-23 | 2015-02-11 | Nokia Corp | An apparatus, a method and a computer program for video coding and decoding |
US9712837B2 (en) * | 2014-03-17 | 2017-07-18 | Qualcomm Incorporated | Level definitions for multi-layer video codecs |
US10205949B2 (en) * | 2014-05-21 | 2019-02-12 | Arris Enterprises Llc | Signaling for addition or removal of layers in scalable video |
US10798422B2 (en) * | 2015-10-20 | 2020-10-06 | Intel Corporation | Method and system of video coding with post-processing indication |
KR102515357B1 (en) | 2018-01-25 | 2023-03-29 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Efficient subpicture extraction
CN117692661A (en) * | 2018-04-03 | 2024-03-12 | 华为技术有限公司 | File format indication based on error mitigation in sub-picture code stream view dependent video coding |
EP3850851A1 (en) * | 2018-09-13 | 2021-07-21 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Bitstream merging |
- 2019-09-12 EP EP19766029.3A patent/EP3850851A1/en active Pending
- 2019-09-12 KR KR1020217009141A patent/KR102572947B1/en active IP Right Grant
- 2019-09-12 BR BR112021004636-4A patent/BR112021004636A2/en unknown
- 2019-09-12 JP JP2021513977A patent/JP7179163B2/en active Active
- 2019-09-12 MX MX2021002934A patent/MX2021002934A/en unknown
- 2019-09-12 CN CN202410913430.6A patent/CN118694980A/en active Pending
- 2019-09-12 WO PCT/EP2019/074436 patent/WO2020053369A1/en unknown
- 2019-09-12 CN CN201980074638.0A patent/CN113039804B/en active Active
- 2019-09-12 KR KR1020237029059A patent/KR20230128584A/en active IP Right Grant
- 2021-03-12 US US17/199,984 patent/US11800128B2/en active Active
- 2022-11-15 JP JP2022182498A patent/JP7359926B2/en active Active
- 2023-09-14 US US18/368,061 patent/US20230421793A1/en active Pending
- 2023-09-28 JP JP2023168401A patent/JP7511733B2/en active Active
- 2024-06-25 JP JP2024102170A patent/JP2024123203A/en active Pending
Also Published As
Publication number | Publication date |
---|---|
JP2024123203A (en) | 2024-09-10 |
CN118694980A (en) | 2024-09-24 |
JP2023022096A (en) | 2023-02-14 |
JP7511733B2 (en) | 2024-07-05 |
US20230421793A1 (en) | 2023-12-28 |
US11800128B2 (en) | 2023-10-24 |
BR112021004636A2 (en) | 2021-05-25 |
JP7359926B2 (en) | 2023-10-11 |
MX2021002934A (en) | 2021-07-16 |
US20210203973A1 (en) | 2021-07-01 |
KR20210047935A (en) | 2021-04-30 |
KR102572947B1 (en) | 2023-08-31 |
KR20230128584A (en) | 2023-09-05 |
JP2022501889A (en) | 2022-01-06 |
WO2020053369A1 (en) | 2020-03-19 |
JP2023171865A (en) | 2023-12-05 |
EP3850851A1 (en) | 2021-07-21 |
JP7179163B2 (en) | 2022-11-28 |
CN113039804A (en) | 2021-06-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN113039804B (en) | Bit stream merging | |
US11159802B2 (en) | Signaling and selection for the enhancement of layers in scalable video | |
US11395000B2 (en) | Dependent random access point pictures | |
KR102037158B1 (en) | Video composition | |
RU2653299C2 (en) | Method and device for video coding and decoding | |
KR101944565B1 (en) | Reducing latency in video encoding and decoding | |
JP6312704B2 (en) | Syntax and semantics for buffering information that simplify video splicing | |
US8958486B2 (en) | Simultaneous processing of media and redundancy streams for mitigating impairments | |
CN110708597B (en) | Live broadcast delay monitoring method and device, electronic equipment and readable storage medium | |
CN113170237B (en) | Video encoding and decoding method and apparatus | |
US11336965B2 (en) | Method and apparatus for processing video bitstream, network device, and readable storage medium | |
CN114788290A (en) | System and method for signaling picture timing and decoding unit information in video coding | |
KR101396948B1 (en) | Method and Equipment for hybrid multiview and scalable video coding | |
KR101433168B1 (en) | Method and Equipment for hybrid multiview and scalable video coding | |
EP4222977A1 (en) | A method, an apparatus and a computer program product for video encoding/decoding | |
US20180352240A1 (en) | Generalized Temporal Sub-Layering Frame Work |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
GR01 | Patent grant ||