
CN113039804B - Bit stream merging - Google Patents

Info

Publication number: CN113039804B
Application number: CN201980074638.0A
Authority: CN (China)
Prior art keywords: video, requirement information, capability, capability requirement, fine granularity
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Other versions: CN113039804A (en)
Inventors: Robert Skupin, Yago Sánchez de la Fuente, Cornelius Hellge, Thomas Schierl, Karsten Sühring, Thomas Wiegand
Current Assignee: Fraunhofer Gesellschaft zur Forderung der Angewandten Forschung eV (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Original Assignee: Fraunhofer Gesellschaft zur Forderung der Angewandten Forschung eV
Priority date: (the priority date is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed)
Application filed by Fraunhofer Gesellschaft zur Forderung der Angewandten Forschung eV
Priority to CN202410913430.6A (published as CN118694980A)
Publication of CN113039804A
Application granted; publication of CN113039804B
Legal status: Active; anticipated expiration pending

Classifications

    • H: Electricity
    • H04: Electric communication technique
    • H04N: Pictorial communication, e.g. television
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/146: Data rate or code amount at the encoder output (under H04N19/10, adaptive coding, and H04N19/134, characterised by the element, parameter or criterion affecting or controlling the adaptive coding)
    • H04N19/46: Embedding additional information in the video signal during the compression process
    • H04N19/132: Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking (under H04N19/102, characterised by the element, parameter or selection affected or controlled by the adaptive coding)
    • H04N19/186: Adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component (under H04N19/169)
    • H04N19/48: Compressed domain processing techniques other than decoding, e.g. modification of transform coefficients, variable length coding [VLC] data or run-length data
    • H04N19/70: Characterised by syntax aspects related to video coding, e.g. related to compression standards
    • H04N19/85: Using pre-processing or post-processing specially adapted for video compression

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A video encoder (2) for providing an encoded video representation (12), wherein the video encoder (2) is configured to provide a video stream (12) comprising encoding parameter information describing a plurality of encoding parameters (20, 22, 26, 30), encoded video content information, and one or more merge identifiers indicating whether the encoded video representation (12) can be merged with another encoded video representation and/or how the encoded video representation (12) can be merged with another encoded video representation.

Description

Bit stream merging
Technical Field
The present application relates to video encoding/decoding.
Background
Video composition is used in many applications to present a composition of multiple video sources to a user. Common examples are picture-in-picture (PiP) composition and the blending of an overlay with video content, e.g. for advertising or user interfaces. Generating such a composition in the pixel domain requires parallel decoding of the input video bitstreams, which is computationally complex and may even be infeasible on devices with a single hardware decoder or otherwise limited resources. For example, in current IPTV system designs, a capable set-top box performs the composition and, due to its complexity, distribution and limited lifetime, constitutes a major service-cost factor. Reducing these cost factors has prompted continual efforts to virtualize set-top box functions, such as transferring user interface generation to cloud resources. With this approach, the video decoder (the so-called zero client) is the only hardware that remains at the client. The current state of the art for such system designs is based on transcoding, i.e. composition in its simplest form: decoding, pixel-domain composition and re-encoding before or during transmission. To reduce the workload of the full decoding and encoding cycle, performing PiP composition in the transform coefficient domain instead of the pixel domain was proposed first. Since then, a number of techniques have been proposed to fuse or shorten individual composition steps and to apply them to current video codecs. However, transcoding-based methods for general composition remain computationally complex, which compromises the scalability of the system. Depending on the transcoding method, such composition may also affect rate-distortion (RD) performance.
In addition, there are widespread applications built on tiles, where a tile is a spatial subset of the video plane that is encoded independently of neighbouring tiles. Tile-based streaming systems for 360° video work by dividing the 360° video into tiles that are encoded into sub-bitstreams at various resolutions and merged into a single bitstream on the client side depending on the current user viewing orientation. Another application involving the merging of sub-bitstreams is, for example, the combination of traditional video content with banner advertisements. Moreover, recombining sub-bitstreams can also be a critical part of video conferencing systems, where multiple users send their respective video streams to a receiver that eventually merges the streams. Even further, tile-based cloud encoding systems, in which a video is split into tiles and the tile encoding is distributed to separate and independent instances, rely on fail-safe checks before merging the resulting tile bitstreams back into a single bitstream. In all these applications, the streams are merged to allow decoding on a single video decoder at a known conformance point. Merging in this context refers to a lightweight bitstream rewriting that does not require complete decoding and encoding of the bitstream or of the entropy-coded data into pixel-value reconstructions. However, the techniques for ensuring successful bitstream merging (i.e. conformance of the resulting merged bitstream) originate from many different application scenarios.
For example, conventional codecs support a technique called motion-constrained tile sets (MCTS), in which the encoder constrains inter-picture prediction to remain within the boundaries of tiles or pictures, i.e. not to use sample values or syntax element values that do not belong to the same tile or that lie outside the picture boundaries. The technique originates from region-of-interest (ROI) decoding, where a decoder can decode only a specific sub-portion of the bitstream and of the coded pictures without encountering unresolved dependencies and while avoiding drift. Another, more recent technique in this context is the structure of pictures (SOP) SEI message, which gives an indication of the applied bitstream structure (i.e. picture order and inter-prediction reference structure). This information is provided by summarizing, for each picture between two random access points (RAPs), the picture order count (POC) value, the active sequence parameter set (SPS) identifier and the reference picture set (RPS) index into the active SPS. Based on this information, a bitstream structure can be identified, which can assist a transcoder, a middlebox or media-aware network entity (MANE), or a media player in operating on or changing the bitstream, e.g. adjusting the bitrate, dropping frames or fast-forwarding.
Although both of the above exemplary signaling techniques are necessary for understanding whether lightweight merging of sub-bitstreams is possible without significant syntax changes or even complete transcoding, they are far from sufficient. In more detail, lightweight merging in this context is characterized in that the NAL units of the source bitstreams are interleaved with only minor rewriting operations, i.e. the jointly used parameter sets are rewritten with new picture sizes and tile structures such that each bitstream to be merged is located in a separate tile region. The next level of merging complexity consists of minor rewriting of slice header elements, ideally without changing variable-length codes in the slice header. There are further levels of merging complexity, e.g. re-running entropy coding on the slice data to change particular syntax elements that are entropy-coded but can be changed without pixel-value reconstruction, which can still be considered beneficial and rather lightweight compared to full transcoding with decoding and encoding of the video (which is not considered lightweight).
In the merged bitstream, all slices must reference the same parameter sets. When the parameter sets of the original bitstreams use significantly different settings, lightweight merging may not be possible, because many parameter set syntax elements have a further impact on the slice header and slice payload syntax, and their levels of impact differ. The deeper a syntax element participates in the decoding process, the more complex the merging/rewriting becomes. Some notable general categories of syntax dependencies (of parameter sets and other structures) can be distinguished as follows.
A. Syntax presence indications
B. Value calculation dependencies
C. Slice payload coding tool control
    C.a Syntax used early in the decoding process (e.g. coefficient sign hiding, block partitioning restrictions)
    C.b Syntax used later in the decoding process (e.g. motion compensation, loop filters) or general decoding process control (reference pictures, bitstream order)
D. Source format parameters
For category A, the parameter sets carry many presence flags for the various tools (e.g. dependent_slice_segments_enabled_flag or output_flag_present_flag). In case these flags differ, the flag can be set to enabled in the joint parameter set and the default values can be explicitly written into the merged slice headers of the slices of the bitstream that did not include the syntax elements before merging; i.e., in this case merging requires changing the parameter set and the slice header syntax.
For category B, signaled values of parameter set syntax and parameters of other parameter sets or of the slice header may be used together in calculations. For example, in HEVC, the slice quantization parameter (QP) of a slice controls the coarseness of the transform coefficient quantization of the slice's residual signal. The signaling of the slice QP (SliceQpY) in the bitstream depends on QP signaling at the picture parameter set (PPS) level, as follows:
SliceQpY = 26 + init_qp_minus26 (from the PPS) + slice_qp_delta (from the slice header)
Since every slice in a (merged) coded video picture needs to reference the same active PPS, differences in the PPSs of the respective streams to be merged require the slice headers to be adjusted to reflect the new common value of init_qp_minus26. That is, in this case merging requires changing the parameter set and the slice header syntax, as for category A.
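To make this category B dependency concrete, the following sketch shows the slice header adjustment implied by the formula above; the helper function and its names are illustrative assumptions, not part of the patent or of the HEVC specification text.

```python
# Recompute slice_qp_delta for a slice that is rewritten against a joint PPS,
# keeping SliceQpY = 26 + init_qp_minus26 + slice_qp_delta invariant.

def rewrite_slice_qp_delta(slice_qp_delta: int,
                           old_init_qp_minus26: int,
                           joint_init_qp_minus26: int) -> int:
    slice_qp_y = 26 + old_init_qp_minus26 + slice_qp_delta
    return slice_qp_y - 26 - joint_init_qp_minus26

# Example: stream B used init_qp_minus26 = -4, the joint PPS adopts 0 from
# stream A; a slice with slice_qp_delta = 2 (SliceQpY = 24) becomes -2.
assert rewrite_slice_qp_delta(2, -4, 0) == -2
```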
For category C, other parameter set syntax elements control coding tools that affect the bitstream structure of the slice payload. The subcategories C.a and C.b can be distinguished according to the degree to which the syntax elements participate in the decoding process, i.e. in which processes between entropy coding and pixel-level reconstruction the syntax elements are involved, and according to the complexity associated with changing these syntax elements (which is related to that degree of participation).
For example, one element in category C.a is sign_data_hiding_enabled_flag, which controls the derivation of sign data for coded transform coefficients. Sign data hiding can easily be disabled and the correspondingly inferred sign data explicitly written into the slice payload. Such changes to the slice payload in this category do not require going into the pixel domain through full decoding and re-encoding of the pixel-wise merged video. Another example is inferred block partitioning decisions, or any other syntax whose inferred values can easily be written into the bitstream. That is, in this case merging requires changing the parameter set, the slice header syntax and the slice payload, the latter requiring entropy decoding/encoding.
Subcategory C.b, however, comprises syntax elements related to processes far down the decoding chain, so that in any case many complex decoding processes would have to be carried out, and avoiding the remaining decoding steps is undesirable in terms of implementation and computation. For example, differences in syntax elements associated with motion compensation constraints (e.g. the temporal motion-constrained tile sets SEI message or the structure of pictures information SEI message) or in various other syntax related to the decoding process make pixel-level transcoding unavoidable. In contrast to subcategory C.a, many encoder decisions affect the slice payload in a way that cannot be changed without full pixel-level transcoding.
For category D, there are parameter set syntax elements (e.g. the chroma subsampling indicated by chroma_format_idc) for which differing values make pixel-level transcoding unavoidable when merging the sub-bitstreams. That is, in this case merging requires a complete decoding, pixel-level merging and complete encoding process.
The above list is by no means exhaustive, but it is clear that a wide variety of parameters affects the feasibility of merging sub-bitstreams into a common bitstream in different ways, and that tracking and analysing these parameters is cumbersome.
Disclosure of Invention
It is an object of the present invention to provide a video codec that enables efficient merging of video bitstreams.
This object is achieved by the subject matter of the claims of the present application.
The basic idea of the invention is to improve the merging of multiple video streams by including one or more merge identifiers. This approach can reduce the load on computing resources and can also speed up the merging process.
According to an embodiment of the application, the video encoder is configured to provide a video stream comprising encoding parameter information describing a plurality of encoding parameters, encoded video content information (i.e. encoded using the encoding parameters defined by the parameter information), and one or more merge identifiers indicating whether the encoded video representation can be merged with another encoded video representation and/or how (e.g. at which complexity) the encoded video representation can be merged with the other encoded video representation. Which complexity to use can be determined based on the parameter values defined by the parameter information. The merge identifier may be a concatenation of the multiple encoding parameters that must be equal in two different encoded video representations in order for the two representations to be mergeable at a predetermined complexity. Alternatively, the merge identifier may be a hash value of the concatenation of those encoding parameters.
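A minimal sketch of the hash-based variant is given below; the parameter selection, the serialization and the choice of SHA-256 are assumptions for illustration, not normative syntax.

```python
import hashlib

def merge_identifier(params: dict, hash_set: list) -> str:
    # Concatenate the values of the listed coding parameters and hash them.
    payload = b"".join(str(params[name]).encode("utf-8") for name in hash_set)
    return hashlib.sha256(payload).hexdigest()

hash_set = ["chroma_format_idc", "bit_depth_luma_minus8", "init_qp_minus26"]
stream_a = {"chroma_format_idc": 1, "bit_depth_luma_minus8": 0, "init_qp_minus26": 0}
stream_b = {"chroma_format_idc": 1, "bit_depth_luma_minus8": 0, "init_qp_minus26": 0}

# Equal identifier values indicate mergeability at the complexity level
# associated with this hash set.
assert merge_identifier(stream_a, hash_set) == merge_identifier(stream_b, hash_set)
```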
According to an embodiment of the application, the merge identifier may indicate a merge identifier type, which represents a "suitable" merging method associated with the complexity of the merging process; in general, it may indicate that merging is performed by parameter set rewriting, by parameter set and slice header rewriting, or by parameter set, slice header and slice payload rewriting. The merge identifier is associated with a merge identifier type, and the merge identifier associated with a given type covers those encoding parameters that must be equal in two different encoded video representations so that the two representations can be merged at the complexity of the merging process represented by that type. The value of the merge identifier type may indicate the merging process, wherein the video encoder is configured to switch between at least two of the following values of the merge identifier type: a first value, which represents a merging process by parameter set rewriting; a second value, which represents a merging process by parameter set and slice header rewriting; and a third value, which represents a merging process by parameter set, slice header and slice payload rewriting.
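The three type values could be represented, for instance, as follows; the enum and its member names are hypothetical, chosen only to mirror the description above.

```python
from enum import IntEnum

class MergeIdentifierType(IntEnum):
    PARAMETER_SET_REWRITE = 0      # merging by parameter set rewriting only
    SLICE_HEADER_REWRITE = 1       # plus slice header rewriting
    SLICE_PAYLOAD_REWRITE = 2      # plus slice payload (entropy) rewriting
```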
According to an embodiment of the application, a plurality of merge identifiers is associated with different complexities of the merging process; e.g. each identifier indicates a parameter set/hash and a type of merging process. The encoder may be configured to check whether the encoding parameters evaluated for providing a merge identifier are the same in all units of the video sequence, and to provide the merge identifier in dependence on said check.
According to an embodiment of the application, the plurality of encoding parameters may comprise merge-related parameters that must be the same in different video streams (i.e. encoded video representations) to allow a merging of lower complexity than merging via full pixel decoding, and the video encoder is configured to provide the one or more merge identifiers based on these merge-related parameters; i.e., without parameters common to the two encoded video representations, full pixel decoding has to be performed and the complexity of the merging process cannot be reduced. The merge-related parameters include one or more or all of the following: parameters describing motion constraints at tile boundaries (e.g. the motion-constrained tile sets supplemental enhancement information, MCTS SEI); information about the group of pictures (GOP) structure (i.e. the mapping of coding order to display order), a random access point indication and temporal layering, e.g. in a structure of pictures (SOP) SEI message; parameters describing the chroma coding format and the luma coding format, e.g. a set comprising at least the chroma format and the bit depth luma/chroma; parameters describing advanced motion vector prediction; parameters describing the sample adaptive offset; parameters describing temporal motion vector prediction; and parameters describing the loop filter, as well as other coding parameters, i.e. the set of parameters (including reference picture sets, base quantization parameters, etc.), slice headers and slice payloads to be rewritten when merging the two encoded video representations.
According to an embodiment of the present application, the merge identifier associated with a first complexity of the merging process may be determined based on a first set of coding parameters, while the merge identifier associated with a second complexity of the merging process, the second complexity being higher than the first, is determined based on a second set of coding parameters that is a proper subset of the first set. A merge identifier associated with a third complexity of the merging process, higher than the second, may be determined based on a third set of coding parameters that is a proper subset of the second set. The video encoder is configured to determine the merge identifier associated with the first complexity based on a set of coding parameters, e.g. a first set that must be equal in two different video streams (e.g. encoded video representations) to allow merging of the video streams such that only the parameter sets applicable to multiple slices are modified, while slice headers and slice payloads remain unchanged, i.e. a merging process by parameter set rewriting only. The video encoder may be configured to determine the merge identifier associated with the second complexity based on a set of coding parameters, e.g. a second set that must be equal in two different video streams to allow merging of the video streams such that the parameter sets applicable to multiple slices and also the slice headers are modified, while the slice payloads remain unchanged, i.e. a merging process by parameter set and slice header rewriting. The video encoder may be configured to determine the merge identifier associated with the third complexity based on a set of coding parameters, e.g. a third set that must be equal in two different video streams to allow merging of the video streams such that the parameter sets applicable to multiple slices as well as the slice headers and slice payloads are modified, but no full pixel decoding and pixel re-encoding is performed, i.e. a merging process by parameter set, slice header and slice payload rewriting.
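The proper-subset relation between the three parameter sets can be pictured with the following sketch; the parameter names are illustrative placeholders, loosely based on the hash set listing given later in the description.

```python
# The hash set for the lowest-complexity merge (type 0) is the largest;
# each higher-complexity type checks a proper subset of it.
TYPE2_SET = frozenset({"mcts_constraints", "gop_structure", "chroma_format",
                       "bit_depth_luma_chroma", "amvp", "sao", "tmvp", "loop_filter"})
TYPE1_SET = TYPE2_SET | {"coding_block_structure", "transform_sizes", "pcm_usage",
                         "intra_smoothing", "sign_hiding", "transquant_bypass",
                         "entropy_coding_sync", "slice_header_rps"}
TYPE0_SET = TYPE1_SET | {"reference_picture_set", "base_qp", "hrd_parameters",
                         "dependent_slices", "weighted_prediction", "parameter_set_id"}

assert TYPE2_SET < TYPE1_SET < TYPE0_SET  # strictly nested hash sets
```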
According to an embodiment of the application, a video merger for providing a merged video representation based on a plurality of encoded video representations (e.g. video streams) is configured to receive a plurality of video streams comprising encoding parameter information describing a plurality of encoding parameters, encoded video content information (i.e. encoded using the encoding parameters defined by the parameter information), and one or more merge identifiers indicating whether an encoded video representation can be merged with another encoded video representation and/or how it can be merged. The video merger is configured to decide on the merging method to use (e.g. the merging type or merging process) based on the merge identifiers, i.e. based on a comparison of the merge identifiers of the different video streams. The video merger is configured to select a merging method from a plurality of merging methods based on the merge identifiers; it may be configured to select between at least two of the following merging methods, which have different complexities according to the one or more merge identifiers: a first merging method, in which the video streams are merged such that only the parameter sets applicable to multiple slices are modified, while slice headers and slice payloads remain unchanged; a second merging method, in which the parameter sets applicable to multiple slices and also the slice headers are modified, while the slice payloads remain unchanged; and a third merging method, in which the parameter sets applicable to multiple slices as well as the slice headers and slice payloads are modified, but no full pixel decoding and pixel re-encoding is performed.
According to an embodiment of the application, the video merger is configured to compare merge identifiers of two or more video streams that are associated with the same given merging method, or with the same merge identifier type, and to decide whether to perform the merging using the given merging method depending on the result of the comparison. The video merger may be configured to selectively perform the merging using the given merging method if the comparison indicates that the merge identifiers of the two or more video streams associated with that method are equal. The video merger may be configured to use a merging method of higher complexity than the given one if the comparison indicates that the merge identifiers associated with the given method differ, i.e. without further comparison of the encoding parameters themselves. The video merger may also be configured to selectively compare the encoding parameters that must be equal in the two or more video streams to allow merging with the given method, if the comparison of the merge identifiers indicates equality; it then selectively performs the merging using the given method if the comparison of those encoding parameters indicates that they are equal, and uses a merging method of higher complexity than the given one if the comparison indicates that the encoding parameters include differences.
According to an embodiment of the present application, the video merger may be configured to compare merge identifiers (i.e. hashes) associated with merging methods of different complexities, and to identify the lowest-complexity merging method for which the associated merge identifiers are equal in the two or more video streams to be merged. The video merger is further configured to compare the sets of coding parameters (i.e. the individual coding parameters, rather than their hashed versions) that must be equal in the video streams to be merged to allow merging with the identified method, wherein different, typically overlapping sets of coding parameters are associated with merging methods of different complexity. If the comparison indicates that the coding parameters of the set associated with the identified merging method are equal in the video streams to be merged, the video merger merges the two or more video streams using the identified method; if the comparison indicates that they include differences, it merges the streams using a merging method of higher complexity than the identified one. The video merger is configured to determine which encoding parameters should be modified in the merging process (i.e. when merging the video streams), for example based on one or more differences between the merge identifiers associated with the same merging method or "merge identifier type" of the different video streams to be merged.
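A sketch of this selection logic is shown below; the data layout (each stream as a mapping from merge identifier type to identifier value) is an assumption for illustration.

```python
def select_merge_method(streams):
    # Try the lowest merging complexity first; fall back to full transcoding.
    for merge_type in (0, 1, 2):
        values = {s.get(merge_type) for s in streams}
        if len(values) == 1 and None not in values:
            return merge_type
    return "full transcoding"

a = {0: "9f2c", 1: "77ab", 2: "d41d"}
b = {0: "1e30", 1: "77ab", 2: "d41d"}  # the type-0 identifiers differ
assert select_merge_method([a, b]) == 1  # merge with slice header rewriting
```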
According to an embodiment of the application, the video merger is configured to obtain joint coding parameters or joint coding parameter sets associated with the slices of all video streams to be merged, e.g. the sequence parameter set (SPS) and the picture parameter set (PPS), based on the coding parameters of those streams. For example, where the values of all coding parameters of one encoded video representation and of the other are the same, the joint coding parameters are included in the merged video stream; where there are differences between the coding parameters of the encoded video representations, the coding parameters are updated by copying the common parameters, or are updated based on a primary encoded video representation (i.e. one encoded video representation is the primary one and the others are secondary); and some coding parameters (e.g. the total picture size) may be adapted according to the combination of the video streams. The video merger is configured to adapt the encoding parameters associated with the respective video slices (e.g. defined in the slice header), for example when a merging method of higher than the lowest complexity is used, in order to obtain modified slices to be included in the merged video stream. The adapted encoding parameters include parameters representing the picture size of the merged encoded video representation, wherein the picture size is calculated from the picture sizes of the encoded video representations to be merged (i.e. in the respective dimensions and in the context of their spatial arrangement).
According to an embodiment of the present application, a video encoder for providing an encoded video representation (i.e. a video stream) may be configured to provide coarse granularity capability requirement information, e.g. level information such as level 3, level 4 or level 5, which describes the compatibility of the video stream with a video decoder, e.g. the decodability of the video stream by a video decoder having one of a plurality of predetermined capability levels. The video encoder is further configured to provide fine granularity capability requirement information, e.g. merge level limitation information, describing how large a fraction of the allowable capability requirement (i.e. decoder capability) associated with one of the predetermined capability levels is required to decode the encoded video representation, and/or how large a fraction of the allowable capability requirement the encoded video representation (i.e. the sub-bitstream) contributes to a merged video stream into which it is merged, such that the capability requirement of the merged bitstream remains consistent with one of the predetermined capability levels, i.e. is smaller than or equal to the allowable capability requirement; the "allowable capability requirement of the merged bitstream consistent with one of the predetermined capability levels" corresponds to the "level limit of the merged bitstream". The video encoder may provide the fine granularity capability requirement information as a ratio or percentage value referencing one of the predetermined capability levels. It may also provide the fine granularity capability requirement information as reference information and fraction information, where the reference information describes which of the predetermined capability levels the fraction information refers to, so that the fine granularity capability requirement information as a whole describes a fraction of one of the predetermined capability levels.
According to an embodiment of the present application, a video merger for providing a merged video representation based on a plurality of encoded video representations (i.e. video streams) may be configured to receive a plurality of video streams comprising encoding parameter information describing a plurality of encoding parameters, encoded video content information (e.g. encoded using the encoding parameters defined by the parameter information), coarse granularity capability requirement information (e.g. level information such as level 3, level 4 or level 5, describing the compatibility of the video streams with a video decoder having one of a plurality of predetermined capability levels, i.e. their decodability) and fine granularity capability requirement information (e.g. merge level limitation information), wherein the video merger is configured to merge two or more video streams according to the coarse granularity and fine granularity capability requirement information. The video merger may be configured to decide, according to the fine granularity capability requirement information, which video streams can be included in the merged video stream without violating the allowable capability requirement (i.e. such that the capability requirement of the merged video stream is consistent with one of the predetermined capability levels). The video merger may be configured to determine, according to the fine granularity capability requirement information, whether a valid merged video stream (e.g. one that does not violate the allowable capability requirement) can be obtained by merging two or more video streams. The video merger is configured to aggregate the fine granularity capability requirement information of the plurality of video streams to be merged, for example to determine which video streams can be included, or to decide whether a valid merged video stream is obtainable.
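The following sketch illustrates how such an aggregation of fine granularity capability requirement information could look; the field names and the simple additive model are assumptions for illustration only.

```python
def conforms_to_level(sub_streams, target_level):
    # Fractions referencing a different level are not directly comparable here.
    if any(s["reference_level"] != target_level for s in sub_streams):
        return False
    # The merged stream conforms if the advertised fractions sum to at most 1.
    return sum(s["level_fraction"] for s in sub_streams) <= 1.0

tiles = [{"reference_level": "5.1", "level_fraction": 0.25} for _ in range(4)]
assert conforms_to_level(tiles, "5.1")  # four quarter-level tiles fit level 5.1
```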
According to an embodiment of the present application, a video decoder for decoding a provided video representation is configured to receive a video representation comprising a plurality of sub-video streams (i.e. a plurality of sub-bitstreams of an entire bitstream), the sub-video streams comprising encoding parameter information describing a plurality of encoding parameters, encoded video content information, coarse granularity capability requirement information describing the compatibility of the video streams with video decoders having one of a plurality of predetermined capability levels (the individual information being carried, for example, in SEI messages), and fine granularity capability requirement information, from which it can be derived whether the combined capability requirement of the plurality of sub-video streams to be merged matches a predetermined limit to be complied with (i.e. a level-specific limit of the decoder). The video decoder may further be configured to parse the received coarse granularity and fine granularity capability requirement information to obtain an indication of the capability level and of the fraction of the allowable capability requirement.
Drawings
Preferred embodiments of the present application are described below with reference to the accompanying drawings, in which:
Fig. 1 shows a block diagram of an apparatus for providing an encoded video representation, as an example of a video encoder in which the bitstream merging concept according to an embodiment of the application may be implemented;
Fig. 2 shows a schematic diagram illustrating an example of a bitstream structure according to an embodiment of the application;
Fig. 3 shows a block diagram of an apparatus for providing an encoded video representation, as another example of a video encoder in which the bitstream merging concept according to an embodiment of the application may be implemented;
Fig. 4 shows a schematic diagram of an example of coding parameters according to an embodiment of the application;
Figs. 5a and 5b show detailed examples of the sequence parameter set (SPS) indicated in Fig. 4;
Figs. 6a and 6b show detailed examples of the picture parameter set (PPS) indicated in Fig. 4;
Figs. 7a to 7d show detailed examples of the slice header indicated in Fig. 4;
Fig. 8 shows a detailed example of the structure of pictures (SOP) SEI message indicated in Fig. 4;
Fig. 9 shows a detailed example of the motion-constrained tile sets (MCTS) SEI message indicated in Fig. 4;
Fig. 10 shows a block diagram of an apparatus for providing a merged video representation, as an example of a video merger in which the bitstream merging concept according to embodiments of the present application may be implemented;
Fig. 11 shows a schematic diagram illustrating the determination of the merging complexity according to an embodiment of the application;
Fig. 12 shows a schematic diagram of the bitstream structures of a plurality of video representations to be merged and of the bitstream structure of the merged video representation according to the bitstream merging concept of the present application;
Fig. 13 shows a block diagram of an apparatus for providing a merged video representation, as another example of a video merger in which the bitstream merging concept according to embodiments of the present application may be implemented;
Fig. 14 shows a block diagram of an apparatus for providing an encoded video representation that provides merged video representation capability requirement information, as an example of a video encoder according to an embodiment of the application; and
Fig. 15 shows a block diagram of an apparatus for providing a merged video representation that provides merged video representation capability requirement information, as an example of a video merger according to an embodiment of the application.
Detailed Description
The following description sets forth specific details, such as particular embodiments, procedures and techniques, for purposes of explanation and not limitation. It will be understood by those skilled in the art that other embodiments apart from these specific details may be used. For example, while the following description uses a non-limiting example application, the techniques may be applied to any type of video codec. In some instances, detailed descriptions of well-known methods, nodes, interfaces, circuits and devices are omitted so as not to obscure the description with unnecessary detail.
In the following description, the same or equivalent elements or elements having the same or equivalent functions are denoted by the same or equivalent reference numerals.
The invention herein aims to provide future video codecs such as VVC (Versatile Video Coding) with means to place an indication in each sub-bitstream that allows identifying which sub-bitstreams can be merged together into a conforming bitstream at a given level of complexity, and which cannot. The indication, hereinafter referred to as the "merge identifier", also provides information about the appropriate merging method through an indication referred to as the "merge identifier type". Provided that two bitstreams carry a "merge identifier" with the same value, the sub-bitstreams can be merged into a new joint bitstream at the given level of merging complexity (associated with the merging method).
Fig. 1 shows a video encoder 2 for providing an encoded video representation, i.e. a video stream 12, based on a provided (input) video stream; it comprises an encoder core 4, which includes an encoding parameter determination component 14, and a merge identifier provider 6. The input video stream and the video stream 12 each have a bitstream structure, shown in simplified form in Fig. 2. The bitstream structure is made up of a plurality of network abstraction layer (NAL) units, each NAL unit including various parameters and/or data, such as a sequence parameter set (SPS) 20, a picture parameter set (PPS) 22, an instantaneous decoder refresh (IDR) 24, supplemental enhancement information (SEI) 26 and a plurality of slices 28. The SEI 26 includes various messages, e.g. the structure of pictures, the motion-constrained tile sets, etc. A slice 28 includes a header 30 and a payload 32. The encoding parameter determination component 14 determines the encoding parameters based on the SPS 20, PPS 22, SEI 26 and slice header 30. According to the present application, the IDR 24 is not an essential factor for determining the coding parameters, but it may optionally be included. The merge identifier provider 6 provides a merge identifier indicating whether an encoded video stream can be merged with another encoded video stream and/or how (at which complexity) it can be merged. The merge identifier provider determines a merge identifier indicating a merge identifier type, which represents those encoding parameters that must be equal in two different encoded video representations, or generally indicates a "suitable" merging method, e.g. merging by parameter set rewriting, by parameter set and slice header rewriting, or by parameter set, slice header and slice payload rewriting; the merge identifier associated with a merge identifier type covers those encoding parameters that must be equal in two different encoded video representations so that they can be merged at the complexity of the merging process represented by that type.
As a result, the encoded video stream includes encoding parameter information describing a plurality of encoding parameters, encoded video content information and one or more merge identifiers. In this embodiment, the merge identifier is determined based on the encoding parameters. However, the merge identifier value may also be set at the discretion of the encoder operator, with little formal guarantee of collision avoidance (which may be sufficient for a closed system). A third-party entity such as DVB (Digital Video Broadcasting) or ATSC (Advanced Television Systems Committee) may define the values of the merge identifier to be used within its system.
The merge identifier type is described below, considering another embodiment according to the present invention with reference to Figs. 3 to 9.
Fig. 3 shows an encoder 2 (2a) comprising an encoder core 4 and a merge identifier provider 6a, and indicates the data flow in the encoder 2a. The merge identifier provider 6a includes a first hash component 16a and a second hash component 16b, which provide hash values as merge identifiers to indicate the type of merging. That is, the merge identifier value may be formed from the concatenation of the coded values of a defined set of syntax elements (coding parameters) of the bitstream, hereinafter referred to as the hash set. The merge identifier value may also be formed by feeding this concatenation of coded values into a well-known hash function (e.g. MD5, SHA-3) or any other suitable function. As shown in Fig. 3, the input video stream includes input video information 10 and is processed by the encoder core 4. The encoder core 4 encodes the input video content and stores the encoded video content information in the payload 32. The encoding parameter determination component 14 receives parameter information including the SPS, PPS, slice headers and SEI messages. The parameter information is stored in the corresponding units and received at the merge identifier provider 6a, i.e. at the first hash component 16a and the second hash component 16b, respectively. The first hash component 16a generates a hash value based on the encoding parameters, e.g. indicating merge identifier type 2, and the second hash component 16b generates a hash value based on the encoding parameters, e.g. indicating merge identifier type 1.
The content of the hash set, i.e. which syntax element (merge parameter) values are concatenated to form the merge identifier value, determines the quality of the mergeability indication with respect to the syntax categories described above.
For example, the merge identifier type indicates the suitable merging method for the merge identifier, i.e. different levels of mergeability corresponding to the syntax elements incorporated in the hash set:
Type 0: merge identifier for merging by parameter set rewriting
Type 1: merge identifier for merging by parameter set and slice header rewriting
Type 2: merge identifier for merging by parameter set, slice header and slice payload rewriting
For example, given two input sub-bitstreams to be merged in the context of the present application, a device can compare the values of the merge identifier and of the merge identifier type and draw conclusions about the prospects of using the method associated with the merge identifier type on the two sub-bitstreams.
The table below gives the mapping between the syntax element categories and the associated merging methods.
Syntax category | Merging method
A | 0
B | 1
C.a | 2
C.b, D | full transcoding
As mentioned above, the syntax categories are by no means exhaustive, but it is clear that a wide variety of parameters affects the feasibility of merging sub-bitstreams into a common bitstream in different ways, and that tracking and analysing these parameters is cumbersome. In addition, the syntax categories and the merging methods (types) do not correspond one-to-one; for example, some parameters are required for category B although merging method 1 does not require those same parameters.
Merge identifier values according to two or more of the above identifier type values are generated and written into the bitstream to allow a device to easily identify the applicable merging method. Merging method (type) 0 corresponds to the first value of the merge identifier type, merging method (type) 1 to the second value, and merging method (type) 2 to the third value.
The following exemplary syntax elements should be incorporated into the hash set.
● Temporal motion-constrained tile sets SEI message, indicating motion constraints at tile and picture boundaries (merging methods 0, 1, 2; syntax category C.b)
● Structure of pictures information SEI message, defining the GOP structure, i.e. the mapping of coding order to display order, random access point indication, temporal layering and reference structure (merging methods 0, 1, 2; syntax category C.b)
● Parameter set syntax element values
    ○ Reference picture set (merging method 0; syntax category B)
    ○ Chroma format (merging methods 0, 1, 2; syntax category D)
    ○ Base QP, chroma QP offset (merging method 0; syntax category B)
    ○ Bit depth luma/chroma (merging methods 0, 1, 2; syntax category D)
    ○ HRD parameters
        ■ Initial arrival delay (merging method 0; syntax category B)
        ■ Initial removal delay (merging method 0; syntax category B)
    ○ Coding tools
        ■ Coding block structure (maximum/minimum block size, inferred partitioning) (merging methods 0, 1; syntax category C.a)
        ■ Transform size (min/max) (merging methods 0, 1; syntax category C.a)
        ■ PCM block usage (merging methods 0, 1; syntax category C.a)
        ■ Advanced motion vector prediction (merging methods 0, 1, 2; syntax category C.b)
        ■ Sample adaptive offset (merging method 0; syntax category C.b)
        ■ Temporal motion vector prediction (merging methods 0, 1, 2; syntax category C.b)
        ■ Intra smoothing (merging methods 0, 1; syntax category C.a)
        ■ Dependent slices (merging method 0; syntax category A)
        ■ Sign hiding (merging methods 0, 1; syntax category C.a)
        ■ Weighted prediction (merging method 0; syntax category A)
        ■ Transform quantization (transquant) bypass (merging methods 0, 1; syntax category C.a)
        ■ Entropy coding synchronization (merging methods 0, 1; syntax category C.a)
        ■ Loop filter (merging methods 0, 1, 2; syntax category C.b)
● Slice header values
    ○ Parameter set ID (merging method 0; syntax category C.a)
    ○ Reference picture set (merging methods 0, 1; syntax category B)
● Use of implicit CTU address signaling (cf. European patent application no. EP 18153516) (merging method 0; syntax category A)
That is, for the first value of the merge identifier type, i.e. type 0, the following syntax elements (parameters) should be incorporated into the hash set: motion constraints at tile and picture boundaries, GOP structure, reference picture set, chroma format, base quantization parameter and chroma quantization parameter offset, bit depth luma/chroma, hypothetical reference decoder parameters (including the initial arrival delay and the initial removal delay), coding block structure, transform minimum and/or maximum size, pulse code modulation block usage, advanced motion vector prediction, sample adaptive offset, temporal motion vector prediction, intra smoothing, dependent slices, sign hiding, weighted prediction, transform quantization bypass, entropy coding synchronization, loop filter, slice header values including the parameter set ID, slice header values including the reference picture set, and implicit coding tree unit address signaling.
For the second value of the merge identifier type, i.e. type 1, the hash set comprises the syntax elements (parameters): motion constraints at tile and picture boundaries, GOP structure, chroma format, bit depth luma/chroma, coding block structure, transform minimum and/or maximum size, pulse code modulation block usage, advanced motion vector prediction, sample adaptive offset, temporal motion vector prediction, intra smoothing, sign hiding, transform quantization bypass, entropy coding synchronization, loop filter, and slice header values including the reference picture set.
For the third value of the merge identifier type, i.e. type 2, the hash set comprises the syntax elements (parameters): motion constraints at tile and picture boundaries, GOP structure, chroma format, bit depth luma/chroma, advanced motion vector prediction, sample adaptive offset, temporal motion vector prediction, and loop filter.
Fig. 4 shows a schematic diagram illustrating an example of coding parameters according to an embodiment of the application. In Fig. 4, reference numeral 40 denotes type 0, and syntax elements belonging to type 0 are indicated by dotted lines. Reference numeral 42 denotes type 1, and syntax elements belonging to type 1 are indicated by solid lines. Reference numeral 44 denotes type 2, and syntax elements belonging to type 2 are indicated by dashed lines.
Figs. 5a and 5b are examples of a sequence parameter set (SPS) 20; the syntax elements required for type 0 are indicated by reference numeral 40. In the same manner, the syntax elements required for type 1 are indicated by reference numeral 42, and those required for type 2 by reference numeral 44.
Figs. 6a and 6b are examples of a picture parameter set (PPS) 22; the syntax elements required for type 0 are indicated by reference numeral 40, and those required for type 1 by reference numeral 42.
Figs. 7a to 7d are examples of the slice header 30; only one syntax element of the slice header is required for type 0, as indicated by reference numeral 40 in Fig. 7c.
Fig. 8 is an example of a structure of pictures (SOP) SEI message 26a; all syntax elements belonging to the SOP message are required for type 2.
Fig. 9 is an example of a motion-constrained tile sets (MCTS) SEI message 26b; all syntax elements belonging to the MCTS message are required for type 2.
As described above, the merge identifier provider 6a of Fig. 3 uses a hash function in the hash components 16a and 16b to generate the merge identifier values. When two or more merge identifier values are generated using a hash function, the hashes are chained in the following manner: the hash function input for the second merge identifier, whose hash set contains additional elements relative to the first merge identifier, uses the first identifier value (the hash result) instead of the respective syntax element values in the concatenated input to the hash function.
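A sketch of this chaining is given below, under the assumption that the previously computed identifier value is simply concatenated with the additional syntax element values before hashing.

```python
import hashlib

def chained_merge_identifier(previous_id: str, additional_values: list) -> str:
    # Reuse the earlier hash result in place of re-serializing its elements.
    payload = previous_id.encode("utf-8") + b"".join(
        str(v).encode("utf-8") for v in additional_values)
    return hashlib.sha256(payload).hexdigest()
```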
In addition, the presence of the merge identifier also provides the guarantee that the syntax elements incorporated in the hash set have the same value in all access units (AUs) of the coded video sequence (CVS) and/or the bitstream. This guarantee may take the form of constraint flags in the profile/level syntax of the parameter sets.
The merging process is described below, considering another embodiment according to the present invention with reference to Figs. 10 to 13.
Fig. 10 illustrates a video merger for providing a merged video stream based on a plurality of encoded video representations. The video merger 50 includes a receiver 52, a merging method identifier 54 and a merging processor 56, the receiver 52 receiving the input video streams 12 (12a and 12b, shown in Fig. 12). The merged video stream 60 is sent to a decoder. In case the video merger 50 is included in the decoder, the merged video stream is sent to the user equipment or any other device for displaying the merged video stream.
The merging process is driven by the merge identifier and the merge identifier type described above. The merging process may require only the generation of a parameter set plus an interleaving of NAL units, which is the least complex form of merging and is associated with a merge identifier type value of 0, i.e., the first complexity. Fig. 12 shows an example of the first-complexity merging method. As shown in fig. 12, the parameter sets are merged: a merged SPS is generated based on SPS1 of video stream 12a and SPS2 of video stream 12b, and a merged PPS is generated based on PPS1 of video stream 12a and PPS2 of video stream 12b. The IDR data is optional, and its description is therefore omitted. In addition, slices 1,1 and 1,2 of video stream 12a and slices 2,1 and 2,2 of video stream 12b are interleaved, as shown by the merged video stream 60. In figs. 10 and 12, two video streams 12a and 12b are input as an example; however, more video streams may be input and merged in the same manner.
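A minimal sketch of this first-complexity process, using hypothetical data structures (real NAL unit handling is substantially more involved), could look as follows: joint parameter sets are generated and the slice NAL units of the inputs are interleaved access unit by access unit.

```python
from dataclasses import dataclass

@dataclass
class SubBitstream:
    sps: dict            # sequence parameter set (simplified as a dict)
    pps: dict            # picture parameter set
    access_units: list   # per access unit: a list of slice NAL units

def merge_first_complexity(streams):
    # Joint parameter sets, based on those of the first input stream;
    # here the inputs are assumed to be packed side by side, so the
    # output picture width is the sum of the input picture widths.
    joint_sps = dict(streams[0].sps)
    joint_sps["pic_width"] = sum(s.sps["pic_width"] for s in streams)
    joint_pps = dict(streams[0].pps)

    merged = [("SPS", joint_sps), ("PPS", joint_pps)]
    # Interleave slice NAL units access unit by access unit, leaving
    # slice headers and payloads untouched (type-0 merging).
    for au in zip(*(s.access_units for s in streams)):
        for slices in au:
            merged.extend(slices)
    return merged
```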
When needed, the merging process may also include overwriting the slice headers in the bitstream during NAL unit interleaving, which is associated with a merge identifier type value of 1 (i.e., the second complexity). Finally, there are cases where syntax elements in the slice payload need to be adjusted, which is associated with a merge identifier type value of 2 (i.e., the third complexity) and requires entropy decoding and encoding during NAL unit interleaving. The merge identifier and merge identifier type drive the decision on which of the merging processes to perform and its details.
The input to the merging process is a list of input sub-bitstreams, which also represents their spatial arrangement. The output of the process is a merged bitstream. In general, in all bitstream merging processes, a parameter set for the new output bitstream needs to be generated, which may be based on the parameter set of an input sub-bitstream (e.g., the first input sub-bitstream). The necessary updates to the parameter set include the picture size: for example, the picture size of the output bitstream is calculated as the sum of the picture sizes of the input bitstreams in the respective dimensions, according to their spatial arrangement.
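For illustration, assuming a simple grid arrangement of the sub-bitstreams (a hypothetical model, with sizes in luma samples), the picture size update could be computed as follows:

```python
def merged_picture_size(grid):
    # grid[r][c] = (width, height) of the sub-bitstream placed at row r,
    # column c; rows are assumed to have uniform height, columns uniform width.
    out_width = sum(w for w, _ in grid[0])        # sum across the first row
    out_height = sum(row[0][1] for row in grid)   # sum down the first column
    return out_width, out_height

# Two 960x1080 sub-streams side by side yield a 1920x1080 output picture.
print(merged_picture_size([[(960, 1080), (960, 1080)]]))   # (1920, 1080)
# A 2x2 arrangement of 960x540 tiles yields 1920x1080 as well.
print(merged_picture_size([[(960, 540), (960, 540)],
                           [(960, 540), (960, 540)]]))     # (1920, 1080)
```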
The bitstream merging process requires that all input sub-bitstreams carry the same value for at least one instance of the merge identifier and merge identifier type. In one embodiment, the merging procedure performed is the one associated with the lowest merge identifier type value for which all sub-bitstreams carry the same merge identifier value.
For example, differences between the merge identifier values of a certain merge identifier type are used to determine the details of the merging process associated with the merge identifier type whose values do match. For example, as shown in fig. 11, when a first merge identifier 70 having a merge identifier type value equal to 0 does not match between the two input sub-bitstreams (70a and 70b), but a second merge identifier 80 having a merge identifier type value equal to 1 matches between the two sub-bitstreams (80a and 80b), the difference in the first merge identifier 70 indicates that adjusted (slice-header-related) syntax elements are required in all slices.
Fig. 13 shows a video combiner 50 (50a) comprising a receiver (not shown), a combining method identifier, and a combining processor 56; the combining method identifier comprises a merge identifier comparator 54a and a coding parameter comparator 54b. Where the input video streams 12 include hash values as the merge identifier values, the values of each input video stream are compared at the merge identifier comparator 54a. For example, when both input video streams have the same merge identifier value, the respective encoding parameters of each input video stream are compared at the coding parameter comparator 54b. A merging method is decided based on the encoding parameter comparison result, and the combining processor 56 merges the input video streams 12 using the decided merging method. In case the merge identifier value (hash value) also indicates the merging method, no individual coding parameter comparison is required.
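The resulting decision logic can be summarized in a short sketch (the per-type dictionary of hash values is an assumed representation, not a normative interface):

```python
def select_merging_method(streams):
    # Return 0, 1, or 2 for the lightest applicable merging process,
    # or 3 for the fallback of full pixel decoding and re-encoding.
    for id_type in (0, 1, 2):   # from lowest to highest merging complexity
        ids = {s.merge_ids[id_type] for s in streams}
        if len(ids) == 1:
            # Optionally compare the underlying coding parameters as well,
            # e.g. to rule out hash collisions, before committing.
            return id_type
    return 3   # fourth method: full pixel decoding and pixel re-encoding
```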
Three merging methods, the first-complexity, second-complexity, and third-complexity merging methods, are explained above. A fourth merging method merges the video streams using full pixel decoding and pixel re-encoding. If none of the first three merging methods is applicable, the fourth merging method is applied.
The process for identifying the merge result level is described below with reference to another embodiment according to the present invention, using figs. 14 and 15. Identifying the merge result level means placing information into the sub-bitstream that indicates how much the sub-bitstream contributes to the level restriction of a merged bitstream comprising that sub-bitstream.
Fig. 14 shows an encoder 2 (2b), the encoder 2 (2b) comprising an encoder core with an encoding parameter determiner 14, a merge identifier provider (not shown), and a granularity capability provider 8.
In general, when sub-bitstreams are to be merged into a joint bitstream, an indication of how much each sub-bitstream contributes to the level-specific restrictions of the codec system, to which the potential merged bitstream has to adhere, is crucial to ensure that a legal joint bitstream is created. Traditionally, codec level granularity is quite coarse (e.g., distinguishing main resolutions such as 720p, 1080p, or 4K), and the granularity of such a conventional level indication is not sufficient to represent the contribution of an individual sub-bitstream to the merged bitstream; indicating merge-level limits therefore requires a finer granularity. Given that the number of tiles to be merged is not known in advance, a reasonable trade-off needs to be found between flexibility and bit rate overhead, but in general the required granularity far exceeds the traditional level-limit granularity. One example use case is a 360-degree video streaming service, where the service provider needs the freedom to choose between different tiling structures (e.g., 12, 24, or 96 tiles per 360-degree video), in which case each tile stream contributes 1/12, 1/24, or 1/96 of the overall level constraint (e.g., 8K, assuming equal rate distribution). Furthermore, if the rate distribution between tiles is non-uniform (e.g., to achieve constant quality across the video plane), arbitrarily fine granularity may be required.
For example, such signaling may take the form of a ratio and/or percentage relative to a level that is otherwise signaled. In a conference scenario with four participants, each participant would send a legal level 3 bitstream that also includes an indication, i.e., level information included in the coarse granularity capability information, stating that the sent bitstream complies with 1/3 (33%) of the level 5 constraints; this fraction may be included in the fine granularity capability information. The receiver of a plurality of such streams, i.e., the video combiner 50 (50b) shown in fig. 15, can thereby know that the three received bitstreams may be combined into a single joint bitstream complying with level 5, for example.
In fig. 15, a video combiner is shown comprising a receiver 52 and a combining processor 56, into which a plurality of video streams are input. The video combiner may also be included in a video decoder; the video decoder is not shown in the figure, but the receiver may operate as a decoder. Such a decoder receives a bitstream comprising a plurality of sub-bitstreams, a plurality of encoding parameters, coarse granularity capability information, and fine granularity capability information. The coarse granularity capability information and the fine granularity capability information are carried in an SEI message; the decoder parses the received information and interprets the level and the fraction as the level limit available to the decoder. The decoder then checks whether the bitstream it encounters actually complies with the indicated limits. The result of the check is provided to the video combiner or the combining processor. Thus, as previously described, the video combiner can know that multiple sub-bitstreams can be combined into a single joint bitstream.
The fine granularity capability information may provide the indication of the ratio and/or percentage as a vector of values, each dimension relating to a different aspect of the codec level constraints, e.g., the maximum number of allowed luminance samples per second, the maximum picture size, the bit rate, the buffer fullness, the number of tiles, etc. The ratio and/or percentage refers to a general codec level of the video bitstream.
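A conceivable form of the corresponding check, assuming that the advertised fractions of the sub-bitstreams simply add up per constraint dimension (the dimension names are illustrative, not normative), is sketched below:

```python
LEVEL_DIMENSIONS = ("luma_samples_per_sec", "picture_size",
                    "bit_rate", "buffer_fullness", "num_tiles")

def can_merge(fraction_vectors):
    # Each input is a dict mapping a level-constraint dimension to the
    # fraction of the referenced level's limit the sub-bitstream consumes.
    # Merging is taken to be legal if no dimension exceeds 100% in total.
    return all(sum(v[d] for v in fraction_vectors) <= 1.0
               for d in LEVEL_DIMENSIONS)

# Conference example: three incoming level 3 streams, each advertising
# 1/3 of the level 5 limits, fit jointly under level 5.
third = {d: 1 / 3 for d in LEVEL_DIMENSIONS}
print(can_merge([third, third, third]))   # True
```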
Further embodiments and aspects of the invention will be described hereinafter, which may be used alone or in combination with any of the features, functions, and details described herein.
A first aspect relates to a video encoder for providing an encoded video representation, wherein the video encoder is configured to provide a video stream comprising encoding parameter information describing a plurality of encoding parameters, encoded video content information, and one or more merge identifiers indicating whether the encoded video representation may be merged with another encoded video representation and/or how the encoded video representation may be merged with another encoded video representation.
According to a second aspect referring to the first aspect, in the video encoder, which complexity of the merging process to use is determined based on parameter values defined by the encoding parameter information.
According to a third aspect referring to the first or second aspect, in the video encoder, the merge identifier is a concatenation of a plurality of coding parameters.
According to a fourth aspect referring to any one of the first to third aspects, in the video encoder, the merge identifier is a hash value of a concatenation of a plurality of encoding parameters.
According to a fifth aspect referring to any one of the first to fourth aspects, in the video encoder, the merge identifier indicates a merge identifier type representing a complexity of a merge process.
According to a sixth aspect with reference to the fifth aspect, in the video encoder, the value of the merge identifier type indicates a merging process, wherein the video encoder is configured to switch between at least two of the following values of the merge identifier type: a first value of the merge identifier type, representing a merging process by parameter set overwriting; a second value of the merge identifier type, representing a merging process by parameter set and slice header overwriting; and a third value of the merge identifier type, representing a merging process by parameter set, slice header, and slice payload overwriting.
According to a seventh aspect referring to any one of the first to sixth aspects, in the video encoder, a plurality of merge identifiers are associated with different complexities of the merging process.
According to an eighth aspect referring to any one of the first to seventh aspects, in the video encoder, the encoder is configured to check whether the encoding parameters evaluated for providing the merge identifier are the same in all units of the video sequence, and to provide the merge identifier in accordance with the check.
According to a ninth aspect with reference to any one of the first to eighth aspects, in the video encoder, the plurality of encoding parameters comprises merging-related parameters, which must be the same in different video streams to allow a merging of lower complexity than merging by full pixel decoding, and the video encoder is configured to provide the one or more merge identifiers based on the merging-related parameters.
According to a tenth aspect with reference to the ninth aspect, in the video encoder, the merging-related parameters comprise one or more or all of the following parameters: parameters describing motion constraints at tile boundaries, information about the group of pictures (GOP) structure, parameters describing the chroma coding format and parameters describing the luma coding format, parameters describing advanced motion vector prediction, parameters describing sample adaptive offset, parameters describing temporal motion vector prediction, and parameters describing loop filters.
According to an eleventh aspect referring to any one of the first to seventh aspects, in the video encoder, a merge identifier associated with a first complexity of the merging process is determined based on a first set of coding parameters, and a merge identifier associated with a second complexity of the merging process is determined based on a second set of coding parameters, the second complexity being higher than the first complexity and the second set of coding parameters being a proper subset of the first set of coding parameters.
According to a twelfth aspect with reference to the eleventh aspect, in the video encoder, a merge identifier associated with a third complexity of the merging process is determined based on a third set of coding parameters, the third complexity being higher than the second complexity and the third set of coding parameters being a proper subset of the second set of coding parameters.
According to a thirteenth aspect with reference to the eleventh or twelfth aspect, the video encoder is configured to determine the merge identifier associated with the first complexity of the merging process based on a set of coding parameters that must be equal in two different video streams to allow a merging of the video streams in the following way: only the parameter sets that can be applied to multiple slices are modified, while the slice headers and slice payloads are kept unchanged.
According to a fourteenth aspect with reference to the eleventh or twelfth aspect, the video encoder is configured to determine the merge identifier associated with the first complexity based on one or more or all of the following parameters:
parameters indicating motion constraints at tile and picture boundaries,
parameters defining the GOP structure,
parameters describing the reference picture set,
parameters describing the chroma format,
parameters describing the base quantization parameter and the chroma quantization parameters,
parameters describing the luma/chroma bit depth,
parameters describing the hypothetical reference decoder parameters, including parameters related to initial arrival delays and parameters related to initial removal delays,
parameters describing the coding block structure,
parameters describing the minimum and/or maximum transform size,
parameters describing pulse code modulation block usage,
parameters describing advanced motion vector prediction,
parameters describing sample adaptive offset,
parameters describing temporal motion vector prediction,
parameters describing intra smoothing,
parameters describing dependent slices,
parameters describing sign concealment,
parameters describing weighted prediction,
parameters describing transform quantization bypass,
parameters describing entropy coding synchronization,
parameters describing the loop filter,
parameters describing slice header values including the parameter set ID,
parameters describing slice header values including reference picture sets, and
parameters describing address signaling using implicitly coded transform units.
According to a fifteenth aspect referring to any one of the eleventh to fourteenth aspects, the video encoder is configured to determine the merge identifier associated with the second complexity of the merging process based on a set of coding parameters that must be equal in two different video streams to allow a merging of the video streams in the following way: the parameter sets that can be applied to multiple slices are modified and the slice headers are also modified, while the slice payloads are kept unchanged.
According to a sixteenth aspect with reference to any one of the eleventh to fifteenth aspects, the video encoder is configured to determine the merge identifier associated with the second complexity based on one or more or all of the following parameters:
parameters indicating motion constraints at tile and picture boundaries,
parameters defining the GOP structure,
parameters describing the chroma format,
parameters describing the luma/chroma bit depth,
parameters describing the coding block structure,
parameters describing the minimum and/or maximum transform size,
parameters describing pulse code modulation block usage,
parameters describing advanced motion vector prediction,
parameters describing sample adaptive offset,
parameters describing temporal motion vector prediction,
parameters describing intra smoothing,
parameters describing sign concealment,
parameters describing transform quantization bypass,
parameters describing entropy coding synchronization,
parameters describing the loop filter, and
parameters describing slice header values including reference picture sets.
According to a seventeenth aspect referring to any one of the eleventh to sixteenth aspects, the video encoder is configured to determine the merge identifier associated with the third complexity of the merging process based on a set of coding parameters that must be equal in two different video streams to allow a merging of the video streams in the following way: the parameter sets that can be applied to multiple slices are modified and the slice headers and slice payloads are also modified, but full pixel decoding and pixel re-encoding are not performed.
According to an eighteenth aspect referring to any one of the eleventh to seventeenth aspects, the video encoder is configured to determine a merge identifier associated with the third complexity based on one or more or all of the following parameters:
parameters indicating motion constraints at tile and picture boundaries,
parameters defining the GOP structure,
parameters describing the chroma format,
parameters describing the luma/chroma bit depth,
parameters describing advanced motion vector prediction,
parameters describing sample adaptive offset,
parameters describing temporal motion vector prediction, and
parameters describing the loop filter.
According to a nineteenth aspect referring to any one of the first to eighteenth aspects, the video encoder is configured to: applying a hash function to a concatenation of a second merge identifier and one or more encoding parameters not considered in determining the second merge identifier, in order to obtain a first merge identifier associated with a first complexity of a merge process, wherein the second merge identifier is associated with a second complexity of the merge process, the first complexity being lower than the second complexity.
According to a twentieth aspect referring to any one of the first to nineteenth aspects, the video encoder is configured to: applying a hash function to a concatenation of a third merge identifier and one or more coding parameters not considered in determining the third merge identifier, in order to obtain a second merge identifier associated with a second complexity of the merge process, wherein the third merge identifier is associated with a third complexity of the merge process, the second complexity being lower than the third complexity.
A twenty-first aspect relates to a video combiner for providing a combined video representation based on a plurality of encoded video representations, wherein the video combiner is configured to receive a plurality of video streams comprising encoding parameter information describing a plurality of encoding parameters, encoded video content information, and one or more merge identifiers indicating whether and/or how the encoded video representation can be merged with another encoded video representation; wherein the video combiner is configured to determine the merging method to use based on the merge identifiers.
According to a twenty-second aspect referring to the twenty-first aspect, the video combiner is configured to select a merging method from a plurality of merging methods according to the merge identifiers.
According to a twenty-third aspect referring to the twenty-second aspect, the video combiner is configured to select between at least two of the following merging methods according to the one or more merge identifiers: a first merging method, which merges video streams by modifying only the parameter sets that can be applied to multiple slices while leaving the slice headers and slice payloads unchanged; a second merging method, which merges video streams by modifying the parameter sets that can be applied to multiple slices and also modifying the slice headers, while keeping the slice payloads unchanged; and a third merging method, which merges video streams by modifying the parameter sets that can be applied to multiple slices and also modifying the slice headers and slice payloads, but without performing full pixel decoding and pixel re-encoding.
According to a twenty-fourth aspect referring to the twenty-third aspect, the video combiner is configured to: selectively use, based on the one or more merge identifiers, a fourth merging method, which is merging of the video streams using full pixel decoding and pixel re-encoding.
According to a twenty-fifth aspect referring to any one of the twenty-second to twenty-fourth aspects, the video combiner is configured to: compare the merge identifiers of two or more video streams associated with the same given merging method, and decide, based on the result of the comparison, whether to perform merging using the given merging method.
According to a twenty-sixth aspect referring to the twenty-fifth aspect, the video combiner is configured to: selectively perform merging using the given merging method if the comparison indicates that the merge identifiers of the two or more video streams associated with the given merging method are equal.
According to a twenty-seventh aspect referring to the twenty-fifth aspect, the video combiner is configured to: use a merging method of higher complexity than the given merging method if the comparison indicates that the merge identifiers of the two or more video streams associated with the given merging method are different.
According to a twenty-eighth aspect referring to the twenty-seventh aspect, the video combiner is configured to: in the event that the comparison indicates that the merge identifiers of two or more video streams associated with the given merging method are equal, selectively compare the encoding parameters that must be equal among the two or more video streams to allow merging of the video streams using the given merging method; wherein the video combiner is configured to selectively perform merging using the given merging method if the comparison of the encoding parameters indicates that the encoding parameters are equal, and wherein the video combiner is configured to perform merging using a merging method of higher complexity than the given merging method if the comparison of the encoding parameters indicates that the encoding parameters include differences.
According to a twenty-ninth aspect referring to any one of the twenty-first to twenty-eighth aspects, the video combiner is configured to compare merge identifiers associated with merging methods of different complexities, and wherein the video combiner is configured to merge two or more video streams using the merging method of lowest complexity for which the associated merge identifiers are equal in the two or more video streams to be combined.
According to a thirtieth aspect with reference to any one of the twenty-second to twenty-ninth aspects, the video combiner is configured to compare merge identifiers associated with merging methods of different complexities, and wherein the video combiner is configured to identify the merging method of lowest complexity for which the associated merge identifiers are equal in the two or more video streams to be combined; and wherein the video combiner is configured to compare the sets of coding parameters that must be equal in the two or more video streams to be combined to allow merging using the identified merging method, and wherein the video combiner is configured to: merge the two or more video streams using the identified merging method if the comparison indicates that the encoding parameters of the coding parameter sets associated with the identified merging method are equal in the video streams to be merged.
According to a thirty-first aspect referring to any one of the twenty-first to thirtieth aspects, the video combiner is configured to determine which coding parameters should be modified during the merging process based on one or more differences between the merge identifiers of the different video streams to be combined.
According to a thirty-second aspect referring to the thirty-first aspect, the video combiner is configured to determine which coding parameters should be modified in a merging method having a given complexity based on one or more differences between the merge identifiers that are associated with merging methods of lower complexity than the given complexity in the different video streams to be combined.
According to a thirty-third aspect referring to any one of the twenty-first to thirty-second aspects, the video combiner is configured to obtain joint coding parameters associated with slices of all video streams to be combined based on coding parameters of the video streams to be combined, and to include the joint coding parameters in the combined video stream.
According to a thirty-fourth aspect referring to the thirty-third aspect, the video combiner is configured to adapt coding parameters associated with respective video slices to obtain modified slices to be included in the combined video stream.
According to a thirty-fifth aspect referring to the thirty-fourth aspect, in the video combiner, the adapted encoding parameters comprise parameters representing a picture size of the merged encoded video representation, wherein the picture size is calculated based on the picture sizes of the encoded video representations to be merged.
A thirty-sixth aspect relates to a method for providing an encoded video representation, comprising: providing a video stream comprising encoding parameter information describing a plurality of encoding parameters, encoded video content information, and one or more merge identifiers indicating whether and/or how the encoded video representation can be merged with another encoded video representation.
A thirty-seventh aspect relates to a method for providing a combined video representation based on a plurality of encoded video representations, comprising: receiving a plurality of video streams comprising encoding parameter information describing a plurality of encoding parameters, encoded video content information, and one or more merge identifiers indicating whether an encoded video representation can be merged with another encoded video representation and/or how an encoded video representation can be merged with another encoded video representation; and selecting a merging method from a plurality of merging methods according to the merge identifiers.
A thirty-eighth aspect relates to a merging method for merging two or more video streams, comprising: providing common encoding parameter information based on encoding parameter information of different video streams while leaving encoded video content information unchanged; selecting a merging process based on the unchanged encoded video content information; and merging the two or more video streams using the selected merging process.
A thirty-ninth aspect relates to a computer program having a program code for performing any of the methods according to the thirty-sixth to thirty-eighth aspects when run on a computer.
A fortieth aspect relates to a data stream generated by any one of the methods according to the thirty-sixth to thirty-eighth aspects.
Although some aspects have been described in the context of an apparatus or system, it will be clear that these aspects also represent descriptions of corresponding methods in which a block or device corresponds to a method step or a feature of a method step. Similarly, aspects described in the context of method steps also represent descriptions of features of corresponding blocks or items or corresponding devices and/or systems. Some or all of the method steps may be performed by (or using) hardware devices, such as microprocessors, programmable computers or electronic circuits. In some embodiments, one or more of the most important method steps may be performed by such an apparatus.
The data streams of the present invention may be stored on a digital storage medium or may be transmitted on a transmission medium such as a wireless transmission medium or a wired transmission medium (e.g., the internet).
Embodiments of the invention may be implemented in hardware or in software, depending on certain implementation requirements. The implementation may be performed using a digital storage medium (e.g., floppy disk, DVD, Blu-Ray, CD, ROM, PROM, EPROM, EEPROM, or flash memory) having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Thus, the digital storage medium may be computer readable.
Some embodiments according to the invention comprise a data carrier with electronically readable control signals, which are capable of cooperating with a programmable computer system in order to perform one of the methods described herein.
In general, embodiments of the invention may be implemented as a computer program product having a program code operable to perform one of the methods when the computer program product is run on a computer. The program code may for example be stored on a machine readable carrier.
Other embodiments include a computer program stored on a machine-readable carrier for performing one of the methods described herein.
In other words, an embodiment of the inventive method is thus a computer program with a program code for performing one of the methods described herein when the computer program runs on a computer.
Thus, another embodiment of the inventive method is a data carrier (or digital storage medium or computer readable medium) having a computer program recorded thereon for performing one of the methods described herein. The data carrier, digital storage medium or recording medium is typically tangible and/or non-transitory.
Thus, another embodiment of the inventive method is a data stream or signal sequence representing a computer program for performing one of the methods described herein. The data stream or signal sequence may, for example, be configured to be transmitted via a data communication connection (e.g., via the internet).
Another embodiment includes a processing device, such as a computer or programmable logic device, configured or adapted to perform one of the methods described herein.
Another embodiment includes a computer having a computer program installed thereon for performing one of the methods described herein.
Another embodiment according to the invention comprises an apparatus or system configured to transmit a computer program to a receiver (e.g., electronically or optically), the computer program for performing one of the methods described herein. The receiver may be, for example, a computer, mobile device, storage device, etc. The apparatus or system may for example comprise a file server for transmitting the computer program to the receiver.
In some embodiments, a programmable logic device (e.g., a field programmable gate array) may be used to perform some or all of the functions of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor to perform one of the methods described herein. In general, the method is preferably performed by any hardware device.
The apparatus described herein may be implemented using hardware means, or using a computer, or using a combination of hardware means and a computer.
The apparatus described herein or any component of the apparatus described herein may be implemented at least in part in hardware and/or software.
The methods described herein may be performed using hardware devices, or using a computer, or using a combination of hardware devices and computers.
Any of the components of the methods described herein or the apparatus described herein may be performed, at least in part, by hardware and/or by software.
The above-described embodiments are merely illustrative of the principles of the present invention. It should be understood that modifications and variations of the arrangements and details described herein will be apparent to those skilled in the art. It is therefore intended that the invention be limited only by the scope of the appended patent claims and not by the specific details presented by way of description and explanation of the embodiments herein.

Claims (24)

1. A video encoder for providing an encoded video representation, comprising:
A processor;
A memory storing a computer program that, when executed by the processor, causes the video encoder to:
provide coarse granularity capability requirement information describing: compatibility of a video stream with a video decoder having a capability level of a plurality of predetermined capability levels, and
provide fine granularity capability requirement information describing: which fraction of the allowable capability requirements associated with one of the predetermined capability levels is needed to decode the encoded video representation, to enable a video combiner to combine two or more video streams according to the coarse granularity capability requirement information and the fine granularity capability requirement information.
2. The video encoder of claim 1, wherein the computer program, when executed by the processor, further causes the video encoder to: provide the fine granularity capability requirement information such that the fine granularity capability requirement information includes a ratio value or a percentage value.
3. The video encoder of claim 1, wherein the computer program, when executed by the processor, further causes the video encoder to: provide the fine granularity capability requirement information such that the fine granularity capability requirement information includes a ratio value or a percentage value that references a predetermined capability level described by the coarse granularity capability requirement information.
4. The video encoder of claim 1, wherein the computer program, when executed by the processor, further causes the video encoder to: provide the fine granularity capability requirement information such that the fine granularity capability requirement information includes reference information and fraction information, and
wherein the reference information describes which of the predetermined capability levels the fraction information references.
5. The video encoder of claim 1, wherein the computer program, when executed by the processor, further causes the video encoder to: provide the fine granularity capability requirement information such that the fine granularity capability requirement information describes the capability requirements of the encoded video representation in terms of capability requirements with respect to a plurality of criteria.
6. The video encoder of claim 1, wherein the computer program, when executed by the processor, further causes the video encoder to: provide the fine granularity capability requirement information such that the fine granularity capability requirement information includes a plurality of fraction values associated with different criteria.
7. The video encoder of claim 1, wherein the computer program, when executed by the processor, further causes the video encoder to: provide the fine granularity capability requirement information such that the fine granularity capability requirement information includes one or more fraction values describing one or more of the following criteria:
-a fraction of the maximum number of allowed luminance samples per second;
-a fraction of the maximum image size;
-a fraction of the maximum bit rate;
-a fraction of buffer fullness; and
-A fraction of the maximum number of tiles.
8. The video encoder of claim 1, wherein the computer program, when executed by the processor, further causes the video encoder to: provide the fine granularity capability requirement information with such a resolution.
9. A video combiner for providing a combined video representation based on a plurality of encoded video representations, the video combiner comprising:
A processor;
a memory storing a computer program that, when executed by the processor, causes the video combiner to:
receive a plurality of video streams including encoding parameter information describing a plurality of encoding parameters, encoded video content information, coarse granularity capability requirement information describing compatibility of the video streams with a video decoder having a capability level of a plurality of predetermined capability levels, and fine granularity capability requirement information describing which fraction of the allowable capability requirements associated with one of the predetermined capability levels is required to decode an encoded video representation, and
merge two or more video streams according to the coarse granularity capability requirement information and the fine granularity capability requirement information.
10. The video combiner of claim 9, wherein the computer program, when executed by the processor, further causes the video combiner to: determine, based on the fine granularity capability requirement information, which video streams can be included in the combined video stream.
11. The video combiner of claim 9, wherein the computer program, when executed by the processor, further causes the video combiner to: decide, according to the fine granularity capability requirement information, whether a valid combined video stream can be obtained by combining two or more video streams.
12. The video combiner of claim 9, wherein the computer program, when executed by the processor, further causes the video combiner to: integrate the fine granularity capability requirement information of the plurality of video streams to be combined.
13. The video combiner of claim 9, wherein the fine granularity capability requirement information describes capability requirements of the encoded video representation in terms of capability requirements with respect to a plurality of criteria; and
wherein the computer program, when executed by the processor, further causes the video combiner to: determine whether the combined capability requirements of the plurality of video streams to be combined, as described by the fine granularity capability requirement information, are within predetermined limits for all criteria.
14. The video combiner of claim 9, wherein the fine granularity capability requirement information comprises a plurality of fraction values relating to different criteria.
15. The video combiner of claim 9, wherein the fine granularity capability requirement information describes one or more fraction values for one or more of the following criteria:
-a fraction of the maximum number of allowed luminance samples per second;
-a fraction of the maximum image size;
-a fraction of the maximum bit rate;
-a fraction of buffer fullness; and
-A fraction of the maximum number of tiles.
16. A video decoder, wherein the video decoder comprises the video combiner of claim 9.
17. A method of providing an encoded video representation, comprising:
providing coarse granularity capability requirement information describing: compatibility of a video stream with a video decoder having a capability level of a plurality of predetermined capability levels, and
Providing fine granularity capability requirement information describing: which fraction of the allowable capability requirement associated with one of the predetermined capability levels is needed to decode the encoded video representation to enable a video combiner to combine two or more video streams according to the coarse granularity capability requirement information and the fine granularity capability requirement information.
18. A method for providing a combined video representation based on a plurality of encoded video representations, comprising:
receiving a plurality of video streams comprising encoding parameter information describing a plurality of encoding parameters, encoded video content information, coarse granularity capability requirement information describing compatibility of the video streams with a video decoder having a capability level of a plurality of predetermined capability levels, and fine granularity capability requirement information describing which fraction of the allowable capability requirements associated with one of the predetermined capability levels is required to decode an encoded video representation, wherein two or more video streams are combined according to the coarse granularity capability requirement information and the fine granularity capability requirement information.
19. A computer readable storage medium storing a computer program for performing the method of claim 17 or 18 when run on a computer.
20. A video decoder for decoding a provided video representation, wherein the video decoder comprises:
A processor;
A memory storing a computer program that, when executed by the processor, causes the video decoder to: receive a video representation comprising a plurality of video streams, the plurality of video streams comprising encoding parameter information, encoded video content information, coarse granularity capability requirement information, and fine granularity capability requirement information, the encoding parameter information describing a plurality of encoding parameters, the coarse granularity capability requirement information describing compatibility of the video streams with a video decoder having a capability level of a plurality of predetermined capability levels, the fine granularity capability requirement information describing which fraction of the allowable capability requirement associated with one of the predetermined capability levels is required to decode the encoded video representation, and
determine whether the combined capability requirements described by the fine granularity capability requirement information of the plurality of video streams to be combined match a predetermined limit to be adhered to.
21. The video decoder of claim 20, wherein the computer program, when executed by the processor, further causes the video decoder to: parse the received coarse granularity capability requirement information and fine granularity capability requirement information to obtain an indication of the capability level and the fraction of the allowable capability requirement.
22. The video decoder of claim 20, wherein the fine granularity capability requirement information describes capability requirements of the encoded video representation in terms of capability requirements with respect to a plurality of criteria; and
wherein the computer program, when executed by the processor, further causes the video decoder to: determine that the combined capability requirements of the plurality of video streams to be combined match when the combined capability requirements are within the predetermined limits with respect to all criteria.
23. The video decoder of claim 20, wherein the fine granularity capability requirement information includes a plurality of fraction values associated with different criteria.
24. The video decoder of claim 20, wherein the fine granularity capability requirement information includes one or more fraction values describing one or more of the following criteria:
-a fraction of the maximum number of allowed luminance samples per second;
-a fraction of the maximum image size;
-a fraction of the maximum bit rate;
-a fraction of buffer fullness; and
-A fraction of the maximum number of tiles.
CN201980074638.0A 2018-09-13 2019-09-12 Bit stream merging Active CN113039804B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410913430.6A CN118694980A (en) 2018-09-13 2019-09-12 Bit stream merging

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
EP18194348 2018-09-13
EP18194348.1 2018-09-13
PCT/EP2019/074436 WO2020053369A1 (en) 2018-09-13 2019-09-12 Bitstream merging

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202410913430.6A Division CN118694980A (en) 2018-09-13 2019-09-12 Bit stream merging

Publications (2)

Publication Number Publication Date
CN113039804A CN113039804A (en) 2021-06-25
CN113039804B true CN113039804B (en) 2024-07-09

Family

ID=63579240

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202410913430.6A Pending CN118694980A (en) 2018-09-13 2019-09-12 Bit stream merging
CN201980074638.0A Active CN113039804B (en) 2018-09-13 2019-09-12 Bit stream merging

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202410913430.6A Pending CN118694980A (en) 2018-09-13 2019-09-12 Bit stream merging

Country Status (8)

Country Link
US (2) US11800128B2 (en)
EP (1) EP3850851A1 (en)
JP (4) JP7179163B2 (en)
KR (2) KR102572947B1 (en)
CN (2) CN118694980A (en)
BR (1) BR112021004636A2 (en)
MX (1) MX2021002934A (en)
WO (1) WO2020053369A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3850851A1 (en) * 2018-09-13 2021-07-21 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Bitstream merging
EP3861754A1 (en) * 2018-10-02 2021-08-11 Telefonaktiebolaget LM Ericsson (publ) Picture tile attributes signaled using loop(s) over tiles
US11381867B2 (en) 2019-01-08 2022-07-05 Qualcomm Incorporated Multiple decoder interface for streamed media data
GB2584295A (en) * 2019-05-28 2020-12-02 Canon Kk Method and apparatus for encoding and decoding a video bitstream for merging regions of interest
CN110248221A (en) * 2019-06-18 2019-09-17 北京物资学院 A kind of video ads dynamic insertion method and device
US11792433B2 (en) * 2020-09-28 2023-10-17 Sharp Kabushiki Kaisha Systems and methods for signaling profile and level information in video coding
US20220337855A1 (en) * 2021-04-20 2022-10-20 Samsung Electronics Co., Ltd. Operation of video decoding engine for evc
WO2023203423A1 (en) * 2022-04-20 2023-10-26 Nokia Technologies Oy Method and apparatus for encoding, decoding, or displaying picture-in-picture

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012231295A (en) * 2011-04-26 2012-11-22 Canon Inc Encoder, encoding method and program

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20050012809A (en) * 2002-06-18 2005-02-02 코닌클리케 필립스 일렉트로닉스 엔.브이. Video encoding method and corresponding encoding and decoding devices
BR122012013066A2 (en) * 2007-04-18 2015-08-04 Thomson Licensing Multi-view video encoding device
EP2751998A4 (en) * 2011-08-30 2015-08-12 Intel Corp Multiview video coding schemes
US20130188715A1 (en) * 2012-01-09 2013-07-25 Qualcomm Incorporated Device and methods for merge list reordering in video coding
US9838685B2 (en) * 2012-06-15 2017-12-05 Google Technology Holdings LLC Method and apparatus for efficient slice header processing
CN104685886B (en) * 2012-06-29 2018-12-07 瑞典爱立信有限公司 Devices and methods therefor for video processing
WO2014050425A1 (en) * 2012-09-28 2014-04-03 シャープ株式会社 Image decoding device
US10284908B2 (en) * 2013-02-26 2019-05-07 Comcast Cable Communications, Llc Providing multiple data transmissions
GB2516824A (en) * 2013-07-23 2015-02-11 Nokia Corp An apparatus, a method and a computer program for video coding and decoding
US9712837B2 (en) * 2014-03-17 2017-07-18 Qualcomm Incorporated Level definitions for multi-layer video codecs
US10205949B2 (en) * 2014-05-21 2019-02-12 Arris Enterprises Llc Signaling for addition or removal of layers in scalable video
US10798422B2 (en) * 2015-10-20 2020-10-06 Intel Corporation Method and system of video coding with post-processing indication
KR102515357B1 (en) 2018-01-25 2023-03-29 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. Efficient subpicture extraction
CN117692661A (en) * 2018-04-03 2024-03-12 华为技术有限公司 File format indication based on error mitigation in sub-picture code stream view dependent video coding
EP3850851A1 (en) * 2018-09-13 2021-07-21 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Bitstream merging

Also Published As

Publication number Publication date
JP2024123203A (en) 2024-09-10
CN118694980A (en) 2024-09-24
JP2023022096A (en) 2023-02-14
JP7511733B2 (en) 2024-07-05
US20230421793A1 (en) 2023-12-28
US11800128B2 (en) 2023-10-24
BR112021004636A2 (en) 2021-05-25
JP7359926B2 (en) 2023-10-11
MX2021002934A (en) 2021-07-16
US20210203973A1 (en) 2021-07-01
KR20210047935A (en) 2021-04-30
KR102572947B1 (en) 2023-08-31
KR20230128584A (en) 2023-09-05
JP2022501889A (en) 2022-01-06
WO2020053369A1 (en) 2020-03-19
JP2023171865A (en) 2023-12-05
EP3850851A1 (en) 2021-07-21
JP7179163B2 (en) 2022-11-28
CN113039804A (en) 2021-06-25

Similar Documents

Publication Publication Date Title
CN113039804B (en) Bit stream merging
US11159802B2 (en) Signaling and selection for the enhancement of layers in scalable video
US11395000B2 (en) Dependent random access point pictures
KR102037158B1 (en) Video composition
RU2653299C2 (en) Method and device for video coding and decoding
KR101944565B1 (en) Reducing latency in video encoding and decoding
JP6312704B2 (en) Syntax and semantics for buffering information that simplify video splicing
US8958486B2 (en) Simultaneous processing of media and redundancy streams for mitigating impairments
CN110708597B (en) Live broadcast delay monitoring method and device, electronic equipment and readable storage medium
CN113170237B (en) Video encoding and decoding method and apparatus
US11336965B2 (en) Method and apparatus for processing video bitstream, network device, and readable storage medium
CN114788290A (en) System and method for signaling picture timing and decoding unit information in video coding
KR101396948B1 (en) Method and Equipment for hybrid multiview and scalable video coding
KR101433168B1 (en) Method and Equipment for hybrid multiview and scalable video coding
EP4222977A1 (en) A method, an apparatus and a computer program product for video encoding/decoding
US20180352240A1 (en) Generalized Temporal Sub-Layering Frame Work

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant