JP7325534B2

JP7325534B2 - Method and apparatus for point cloud encoding

Info

Publication number: JP7325534B2
Application number: JP2021562151A
Authority: JP
Inventors: セフン・ヤ; ウェン・ガオ; シャン・ジャン; シャン・リュウ
Original assignee: テンセント・アメリカ・エルエルシー
Priority date: 2020-04-02
Filing date: 2021-03-25
Publication date: 2023-08-14
Anticipated expiration: 2041-03-25
Also published as: US20230015039A1; US20210312667A1; WO2021202220A1; CN113892235A; KR20210152530A; AU2021248462B2; EP3991302A1; US11514612B2; US11741635B2; JP2022531110A; CA3138068A1; CA3138068C; EP3991302A4; AU2021248462A1; SG11202111723VA

Description

INCORPORATION BY REFERENCE
This application claims the benefit of priority to U.S. Patent Application No. 17/203,155, "METHOD AND APPARATUS FOR POINT CLOUD CODING," filed on March 16, 2021, which claims the benefit of priority to U.S. Provisional Application No. 63/004,304, "METHOD AND APPARATUS FOR FLEXIBLE QUAD-TREE AND BINARY-TREE PARTITIONING FOR GEOMETRY CODING," filed on April 2, 2020. The entire disclosures of the prior applications are incorporated herein by reference in their entirety.

The present disclosure describes embodiments generally related to point cloud coding.

The background description provided herein is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent the work is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.

Various technologies have been developed to capture and represent the world, such as objects and environments in the world, in three-dimensional (3D) space. 3D representations of the world can enable more immersive forms of interaction and communication. Point clouds can be used as a 3D representation of the world. A point cloud is a set of points in 3D space, each with associated attributes, for example color, material properties, texture information, intensity attributes, reflectivity attributes, motion-related attributes, modality attributes, and/or various other attributes. Such point clouds can include large amounts of data and can be costly and time-consuming to store and transmit.

Aspects of the disclosure provide methods and apparatuses for point cloud compression and decompression. According to an aspect of the disclosure, a method of point cloud geometry decoding in a point cloud decoder is provided. In the method, first signaling information can be received from a coded bitstream for a point cloud that includes a set of points in a three-dimensional (3D) space. The first signaling information can indicate partition information of the point cloud. Second signaling information can be determined based on the first signaling information indicating a first value. The second signaling information can indicate a partition mode of the set of points in the 3D space. Further, the partition mode of the set of points in the 3D space can be determined based on the second signaling information. The point cloud can subsequently be reconstructed based on the partition mode.

In some embodiments, the partition mode can be determined to be a predefined quad-tree and binary-tree (QtBt) partition based on the second signaling information indicating a second value.

In the method, third signaling information can be received that indicates that the 3D space is an asymmetric cuboid. The dimensions of the 3D space that are signaled along the x, y, and z directions can be determined based on the third signaling information indicating the first value.

In some embodiments, 3-bit signaling information can be determined for each of a plurality of partition levels in the partition mode based on the second signaling information indicating the first value. The 3-bit signaling information for each of the plurality of partition levels can indicate the partition directions along the x, y, and z directions for the respective partition level in the partition mode.
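To illustrate how a 3-bit code per partition level might be interpreted, the following is a minimal sketch. The bit order (x, then y, then z), the function name `classify_partition`, and the rejection of an all-zero code are assumptions made for illustration and are not stated in the text.

```python
def classify_partition(flags):
    """Interpret one level's 3-bit code as (type, split axes, child count).

    flags is an (x, y, z) tuple of 0/1 bits; a set bit means the node is
    split along that direction (assumed bit order, not from the text).
    """
    axes = [name for name, bit in zip("xyz", flags) if bit]
    if not axes:
        # Assumption: a level that splits along no direction is not allowed.
        raise ValueError("at least one direction must be split")
    kinds = {3: "octree", 2: "quadtree", 1: "binary"}
    # Splitting along k directions yields 2**k child nodes.
    return kinds[len(axes)], "".join(axes), 2 ** len(axes)
```

Under these assumptions, `classify_partition((1, 0, 1))` describes a quad-tree split along x and z with four child nodes.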

In some embodiments, the 3-bit signaling information can be determined based on the dimensions of the 3D space.

In the method, the partition mode can be determined based on the first signaling information indicating the second value, in which case the partition mode can include a respective octree partition at each of a plurality of partition levels in the partition mode.
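The decision flow described in this aspect can be sketched as follows. This is only an illustrative reading: the numeric encodings of the "first value" and "second value" (1 and 0), the one-bit width of each flag, and the name `parse_partition_mode` are assumptions, and the predefined QtBt rule itself is not reproduced here, so it is represented by a placeholder string.

```python
from typing import Iterator, List, Tuple, Union

FIRST_VALUE, SECOND_VALUE = 1, 0  # assumed encodings; the text does not fix them


def parse_partition_mode(
    bits: Iterator[int], num_levels: int
) -> Union[str, List[Tuple[int, int, int]]]:
    """Return per-level (x, y, z) split flags, or a marker for predefined QtBt."""
    first = next(bits)  # first signaling information
    if first == SECOND_VALUE:
        # Octree partition at every level: split along x, y, and z.
        return [(1, 1, 1)] * num_levels
    second = next(bits)  # second signaling information, read only in this branch
    if second == SECOND_VALUE:
        return "predefined_qtbt"  # placeholder for the predefined QtBt partition
    # Explicit mode: three bits per partition level, in x, y, z order (assumed).
    return [(next(bits), next(bits), next(bits)) for _ in range(num_levels)]
```

For example, the bit sequence 1, 1, 1, 1, 0, 0, 0, 1 with two levels would give a split along x and y at the first level and along z at the second.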

According to another aspect of the disclosure, a method of point cloud geometry decoding in a point cloud decoder is provided. In the method, first signaling information can be received from a coded bitstream for a point cloud that includes a set of points in a three-dimensional (3D) space. The first signaling information can indicate partition information of the point cloud. A partition mode of the set of points in the 3D space can be determined based on the first signaling information, where the partition mode can include a plurality of partition levels. The point cloud can subsequently be reconstructed based on the partition mode.

In some embodiments, 3-bit signaling information for each of the plurality of partition levels in the partition mode can be determined based on the first signaling information indicating a first value, where the 3-bit signaling information for each of the plurality of partition levels can indicate the partition directions along the x, y, and z directions for the respective partition level in the partition mode.

In some embodiments, the partition mode can be determined to include a respective octree partition in each of the plurality of partition levels in the partition mode based on the first signaling information indicating a second value.

In the method, second signaling information can further be received from the coded bitstream for the point cloud. The second signaling information can indicate that the 3D space is an asymmetric cuboid when the second signaling information is the first value, and that the 3D space is a symmetric cuboid when the second signaling information is the second value.

In some embodiments, based on the first signaling information indicating the second value and the second signaling information indicating the first value, the partition mode can be determined to include a respective octree partition in each of the first partition levels among the plurality of partition levels of the partition mode. A partition type and a partition direction of the last partition level among the plurality of partition levels of the partition mode can be determined according to the following conditions

[conditions not reproduced in this text]

where d_x, d_y, and d_z are the log2 sizes of the 3D space in the x, y, and z directions, respectively.

In the method, second signaling information can be determined based on the first signaling information indicating the first value. The second signaling information can indicate that the 3D space is an asymmetric cuboid when the second signaling information indicates the first value, and that the 3D space is a symmetric cuboid when the second signaling information indicates the second value. Further, the dimensions of the 3D space that are signaled along the x, y, and z directions can be determined based on the second signaling information indicating the first value.
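As a rough illustration of how the cuboid flag could gate the signaled dimensions, consider the sketch below. It assumes that the flag value 1 means "asymmetric", that sizes are carried as log2 values (consistent with d_x, d_y, and d_z above), and that a symmetric cuboid carries a single size shared by all three directions; none of these details, nor the name `parse_bounding_box`, are specified in the text.

```python
def parse_bounding_box(fields):
    """Return (size_x, size_y, size_z) from already-parsed integer fields.

    fields[0] is the cuboid flag (assumed: 1 = asymmetric, 0 = symmetric),
    followed by three log2 sizes (asymmetric) or one log2 size (symmetric).
    """
    it = iter(fields)
    asymmetric = next(it) == 1
    if asymmetric:
        # Separate log2 sizes along x, y, and z.
        d_x, d_y, d_z = next(it), next(it), next(it)
    else:
        # Assumption: one shared log2 size describes all three directions.
        d_x = d_y = d_z = next(it)
    return tuple(1 << d for d in (d_x, d_y, d_z))
```

With these assumptions, `parse_bounding_box([1, 4, 4, 2])` describes a 16 x 16 x 4 space that is shorter in the z direction.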

In some examples, an apparatus for processing point cloud data includes receiving circuitry and processing circuitry that are configured to perform one or more of the methods described above.

Further features, the nature, and various advantages of the disclosed subject matter will be more apparent from the following detailed description and the accompanying drawings.

FIG. 1 is a schematic illustration of a simplified block diagram of a communication system according to an embodiment.
FIG. 2 is a schematic illustration of a simplified block diagram of a streaming system according to an embodiment.
FIG. 3 is a block diagram of an encoder for encoding point cloud frames, according to some embodiments.
FIG. 4 is a block diagram of a decoder for decoding a compressed bitstream corresponding to point cloud frames, according to some embodiments.
FIG. 5 is a schematic illustration of a simplified block diagram of a video decoder according to an embodiment.
FIG. 6 is a schematic illustration of a simplified block diagram of a video encoder according to an embodiment.
FIG. 7 is a block diagram of a decoder for decoding a compressed bitstream corresponding to point cloud frames, according to some embodiments.
FIG. 8 is a block diagram of an encoder for encoding point cloud frames, according to some embodiments.
FIG. 9 illustrates a partition of a cube based on the octree partition technique, according to some embodiments of the disclosure.
FIG. 10 illustrates an example of an octree partition and the octree structure corresponding to the octree partition, according to some embodiments of the disclosure.
FIG. 11 illustrates a point cloud with a shorter bounding box in the z direction, according to some embodiments of the disclosure.
FIG. 12 illustrates partitions of a cube based on the octree partition technique along the x-y, x-z, and y-z axes, according to some embodiments of the disclosure.
FIG. 13 illustrates partitions of a cube based on the binary partition technique along the x, y, and z axes, according to some embodiments of the disclosure.
FIG. 14 is a first flowchart outlining a first process example, according to some embodiments.
FIG. 15 is a second flowchart outlining a second process example, according to some embodiments.
FIG. 16 is a schematic illustration of a computer system according to an embodiment.

Advanced 3D representations of the world enable more immersive forms of interaction and communication, and also allow machines to understand, interpret, and navigate the world. 3D point clouds have emerged as an enabling representation of such information. A number of use cases associated with point cloud data have been identified, and corresponding requirements for point cloud representation and compression have been developed. For example, a 3D point cloud can be used in autonomous driving for object detection and localization. 3D point clouds can also be used in geographic information systems (GIS) for mapping, and in cultural heritage to visualize and archive cultural heritage objects and collections.

A point cloud can generally refer to a set of points in a 3D space, each with associated attributes. The attributes can include color, material properties, texture information, intensity attributes, reflectivity attributes, motion-related attributes, modality attributes, and/or various other attributes. Point clouds can be used to reconstruct an object or a scene as a composition of such points. The points can be captured using multiple cameras, depth sensors, and/or Lidar in various setups, and can be made up of thousands up to billions of points in order to realistically represent reconstructed scenes.

Compression technologies can reduce the amount of data required to represent a point cloud for faster transmission or reduced storage. As such, technologies are needed for lossy compression of point clouds for use in real-time communications and six degrees of freedom (6 DoF) virtual reality. In addition, technologies are sought for lossless point cloud compression in the context of dynamic mapping for autonomous driving, cultural heritage applications, and the like. Thus, ISO/IEC MPEG (JTC 1/SC 29/WG 11) has started working on a standard to address compression of geometry and attributes such as color and reflectance, scalable/progressive coding, coding of sequences of point clouds captured over time, and random access to subsets of a point cloud.

FIG. 1 illustrates a simplified block diagram of a communication system (100) according to an embodiment of the present disclosure. The communication system (100) includes a plurality of terminal devices that can communicate with each other via, for example, a network (150). For example, the communication system (100) includes a pair of terminal devices (110) and (120) interconnected via the network (150). In the FIG. 1 example, the first pair of terminal devices (110) and (120) may perform unidirectional transmission of point cloud data. For example, the terminal device (110) may compress a point cloud (e.g., points representing a structure) that is captured by a sensor (105) connected to the terminal device (110). The compressed point cloud can be transmitted, for example in the form of a bitstream, to the other terminal device (120) via the network (150). The terminal device (120) may receive the compressed point cloud from the network (150), decompress the bitstream to reconstruct the point cloud, and suitably display the reconstructed point cloud. Unidirectional data transmission may be common in media serving applications and the like.

In the FIG. 1 example, the terminal devices (110) and (120) may be illustrated as servers and personal computers, but the principles of the present disclosure are not so limited. Embodiments of the present disclosure find application with laptop computers, tablet computers, smartphones, gaming terminals, media players, and/or dedicated three-dimensional (3D) equipment. The network (150) represents any number of networks that transmit compressed point clouds between the terminal devices (110) and (120). The network (150) can include, for example, wired and/or wireless communication networks. The network (150) may exchange data in circuit-switched and/or packet-switched channels. Representative networks include telecommunications networks, local area networks, wide area networks, and/or the Internet. For the purposes of the present discussion, the architecture and topology of the network (150) may be immaterial to the operation of the present disclosure unless explained herein below.

FIG. 2 illustrates a simplified block diagram of a streaming system (200) according to an embodiment. The FIG. 2 example is an application of the disclosed subject matter to a point cloud. The disclosed subject matter can be equally applicable to other point-cloud-enabled applications, such as 3D telepresence applications and virtual reality applications.

The streaming system (200) may include a capture subsystem (213). The capture subsystem (213) can include a point cloud source (201), for example a light detection and ranging (LIDAR) system, 3D cameras, 3D scanners, or a graphics generation component that generates uncompressed point clouds in software, which produces, for example, point clouds (202) that are uncompressed. In an example, the point clouds (202) include points that are captured by the 3D cameras. The point clouds (202) are depicted as a bold line to emphasize a high data volume when compared to the compressed point clouds (204) (a bitstream of compressed point clouds). The compressed point clouds (204) can be generated by an electronic device (220) that includes an encoder (203) coupled to the point cloud source (201). The encoder (203) can include hardware, software, or a combination thereof to enable or implement aspects of the disclosed subject matter as described in more detail below. The compressed point clouds (204) (or the bitstream of compressed point clouds (204)), depicted as a thin line to emphasize the lower data volume when compared to the stream of point clouds (202), can be stored on a streaming server (205) for future use. One or more streaming client subsystems, such as the client subsystems (206) and (208) in FIG. 2, can access the streaming server (205) to retrieve copies (207) and (209) of the compressed point cloud (204). A client subsystem (206) can include a decoder (210), for example, in an electronic device (230). The decoder (210) decodes the incoming copy (207) of the compressed point clouds and creates an outgoing stream of reconstructed point clouds (211) that can be rendered on a rendering device (212).

It is noted that the electronic devices (220) and (230) can include other components (not shown). For example, the electronic device (220) can include a decoder (not shown), and the electronic device (230) can include an encoder (not shown) as well.

In some streaming systems, the compressed point clouds (204), (207), and (209) (e.g., bitstreams of compressed point clouds) can be compressed according to certain standards. In some examples, video coding standards are used in the compression of point clouds. Examples of those standards include High Efficiency Video Coding (HEVC) and Versatile Video Coding (VVC).

FIG. 3 shows a block diagram of a V-PCC encoder (300) for encoding point cloud frames, according to some embodiments. In some embodiments, the V-PCC encoder (300) can be used in the communication system (100) and the streaming system (200). For example, the encoder (203) can be configured and can operate in a similar manner to the V-PCC encoder (300).

The V-PCC encoder (300) receives point cloud frames as uncompressed inputs and generates a bitstream corresponding to compressed point cloud frames. In some embodiments, the V-PCC encoder (300) may receive the point cloud frames from a point cloud source, such as the point cloud source (201).

In the FIG. 3 example, the V-PCC encoder (300) includes a patch generation module (306), a patch packing module (308), a geometry image generation module (310), a texture image generation module (312), a patch information module (304), an occupancy map module (314), a smoothing module (336), image padding modules (316) and (318), a group dilation module (320), video compression modules (322), (323), and (332), an auxiliary patch information compression module (338), an entropy compression module (334), and a multiplexer (324).

According to an aspect of the disclosure, the V-PCC encoder (300) converts 3D point cloud frames into an image-based representation, along with some metadata (e.g., an occupancy map and patch information) that is used to convert the compressed point cloud back into a decompressed point cloud. In some examples, the V-PCC encoder (300) can convert 3D point cloud frames into geometry images, texture images, and occupancy maps, and then use video coding techniques to encode the geometry images, texture images, and occupancy maps into a bitstream. Generally, a geometry image is a 2D image whose pixels are filled with geometry values associated with the points projected onto the pixels, and a pixel filled with a geometry value can be referred to as a geometry sample. A texture image is a 2D image whose pixels are filled with texture values associated with the points projected onto the pixels, and a pixel filled with a texture value can be referred to as a texture sample. An occupancy map is a 2D image whose pixels are filled with values that indicate whether the pixels are occupied or unoccupied by patches.
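The relationship between the three images can be illustrated with a small sketch. It assumes a single patch projected along the z axis, integer pixel coordinates already inside the image, and a "keep the nearest point" rule when several points fall on the same pixel; the function name `project_patch` and these rules are illustrative assumptions rather than the encoder's actual procedure.

```python
def project_patch(points, width, height):
    """Project (x, y, z, color) points along z into three 2D images.

    Returns (geometry, texture, occupancy), each a height x width grid.
    """
    geometry = [[0] * width for _ in range(height)]
    texture = [[(0, 0, 0)] * width for _ in range(height)]
    occupancy = [[0] * width for _ in range(height)]
    for x, y, z, color in points:
        # Assumption: when points collide on a pixel, keep the smallest depth.
        if not occupancy[y][x] or z < geometry[y][x]:
            geometry[y][x] = z       # geometry sample: depth of the point
            texture[y][x] = color    # texture sample: color of the point
            occupancy[y][x] = 1      # pixel is covered by the patch
    return geometry, texture, occupancy
```

Pixels that receive no point keep occupancy 0, which is how a decoder could tell real samples from filler values in the other two images.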

A patch can generally refer to a contiguous subset of the surface described by the point cloud. In an example, a patch includes points whose surface normal vectors deviate from one another by less than a threshold amount. The patch generation module (306) segments a point cloud into a set of patches, which may or may not overlap, such that each patch can be described by a depth field with respect to a plane in 2D space. In some embodiments, the patch generation module (306) aims at decomposing the point cloud into a minimum number of patches with smooth boundaries, while also minimizing the reconstruction error.
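The normal-deviation criterion can be sketched with a simple greedy grouping. This is a simplification under stated assumptions: normals are unit vectors, each point is compared only against the first normal of a group (the "seed") rather than against every member, and spatial contiguity is ignored. The name `group_by_normal` is hypothetical.

```python
import math


def group_by_normal(normals, max_angle_deg):
    """Greedily group unit normals that lie within max_angle_deg of a seed."""
    cos_limit = math.cos(math.radians(max_angle_deg))
    groups = []  # list of (seed_normal, member_indices)
    for index, normal in enumerate(normals):
        for seed, members in groups:
            # A dot product above cos(threshold) means the angle is below it.
            if sum(a * b for a, b in zip(seed, normal)) > cos_limit:
                members.append(index)
                break
        else:
            # No existing group is close enough: start a new one.
            groups.append((normal, [index]))
    return [members for _, members in groups]
```

A real patch generator would also need to check that grouped points are neighbors on the surface, which this sketch leaves out.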

The patch information module (304) can collect the patch information that indicates the sizes and shapes of the patches. In some examples, the patch information can be packed into an image frame and then encoded by the auxiliary patch information compression module (338) to generate the compressed auxiliary patch information.

The patch packing module (308) is configured to map the extracted patches onto a two-dimensional (2D) grid while minimizing the unused space and guaranteeing that every M×M (e.g., 16×16) block of the grid is associated with a unique patch. Efficient patch packing can directly impact compression efficiency, either by minimizing the unused space or by ensuring temporal consistency.
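One simple way to satisfy the "each M×M block belongs to at most one patch" constraint is a first-fit search over a block grid, sketched below. The raster-scan search order, the function name `pack_patches`, and the treatment of each patch as its bounding rectangle are assumptions; this sketch makes no attempt at minimizing unused space or at temporal consistency.

```python
def pack_patches(patch_sizes, grid_w, grid_h, block=16):
    """Place (width, height) patches on a grid, first fit in raster order.

    Returns (placements, owner): placements maps patch index to its pixel
    position, and owner records which patch, if any, owns each block.
    """
    bw, bh = grid_w // block, grid_h // block
    owner = [[None] * bw for _ in range(bh)]
    placements = {}
    for pid, (w, h) in enumerate(patch_sizes):
        pw, ph = -(-w // block), -(-h // block)  # size in blocks, rounded up
        for by in range(bh - ph + 1):
            for bx in range(bw - pw + 1):
                cells = [(by + j, bx + i) for j in range(ph) for i in range(pw)]
                if all(owner[r][c] is None for r, c in cells):
                    for r, c in cells:
                        owner[r][c] = pid  # each block gets a single owner
                    placements[pid] = (bx * block, by * block)
                    break
            if pid in placements:
                break
        else:
            raise ValueError(f"patch {pid} does not fit on the grid")
    return placements, owner
```

Rounding each patch up to whole blocks is what guarantees that no block is shared between two patches.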

The geometry image generation module (310) can generate 2D geometry images associated with the geometry of the point cloud at given patch locations. The texture image generation module (312) can generate 2D texture images associated with the texture of the point cloud at given patch locations. The geometry image generation module (310) and the texture image generation module (312) exploit the 3D-to-2D mapping computed during the packing process to store the geometry and texture of the point cloud as images. In order to better handle the case of multiple points being projected onto the same sample, each patch is projected onto two images, referred to as layers. In an example, a geometry image is represented by a monochromatic frame of W×H in YUV420 8-bit format. To generate the texture image, the texture generation procedure exploits the reconstructed/smoothed geometry in order to compute the colors to be associated with the re-sampled points.

The occupancy map module (314) can generate an occupancy map that describes padding information at each unit. For example, the occupancy image includes a binary map that indicates, for each cell of the grid, whether the cell belongs to empty space or to the point cloud. In an example, the occupancy map uses binary information describing, for each pixel, whether the pixel is padded or not. In another example, the occupancy map uses binary information describing, for each block of pixels, whether the block of pixels is padded or not.
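The per-pixel and per-block variants are related by a simple reduction, sketched below. The rule that a block counts as occupied when any of its pixels is occupied, and the name `block_occupancy`, are assumptions for illustration.

```python
def block_occupancy(pixel_map, block):
    """Reduce a per-pixel binary occupancy map to a per-block binary map.

    A block is marked 1 if any pixel inside it is occupied (assumed rule).
    """
    height, width = len(pixel_map), len(pixel_map[0])
    return [
        [
            int(any(
                pixel_map[y][x]
                for y in range(by, min(by + block, height))
                for x in range(bx, min(bx + block, width))
            ))
            for bx in range(0, width, block)
        ]
        for by in range(0, height, block)
    ]
```

The block-level map is smaller to code, at the cost of marking some empty pixels inside partly filled blocks as occupied.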

The occupancy map generated by the occupancy map module (314) can be compressed using lossless coding or lossy coding. When lossless coding is used, the entropy compression module (334) is used to compress the occupancy map. When lossy coding is used, the video compression module (332) is used to compress the occupancy map.

It is noted that the patch packing module (308) can leave some empty space between the 2D patches packed in an image frame. The image padding modules (316) and (318) can fill the empty space (referred to as padding) in order to generate an image frame that may be suited for 2D video and image codecs. Image padding is also referred to as background filling, which can fill the unused space with redundant information. In some examples, a good background filling minimally increases the bit rate and does not introduce significant coding distortion around the patch boundaries.
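As a toy illustration of background filling, the sketch below copies into each empty pixel the value of the nearest occupied pixel in the same row. This row-wise nearest-neighbor rule and the name `pad_rows` are assumptions chosen for brevity; the text does not say which filling strategy the padding modules use.

```python
def pad_rows(image, occupancy):
    """Fill unoccupied pixels with the nearest occupied value in the same row."""
    padded = [row[:] for row in image]
    for y, row in enumerate(padded):
        occupied = [x for x in range(len(row)) if occupancy[y][x]]
        if not occupied:
            continue  # nothing to copy from in this row; leave it unchanged
        for x in range(len(row)):
            if not occupancy[y][x]:
                # Ties are resolved toward the leftmost occupied pixel.
                nearest = min(occupied, key=lambda ox: abs(ox - x))
                row[x] = image[y][nearest]
    return padded
```

Repeating neighboring values keeps the filled regions smooth, which is the kind of redundancy a 2D video codec can code cheaply.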

The video compression modules (322), (323), and (332) can encode 2D images, such as the padded geometry images, the padded texture images, and the occupancy maps, based on a suitable video coding standard, such as HEVC or VVC. In an example, the video compression modules (322), (323), and (332) are individual components that operate separately. It is noted that, in another example, the video compression modules (322), (323), and (332) can be implemented as a single component.

In some examples, the smoothing module (336) is configured to generate a smoothed image of the reconstructed geometry image. The smoothed image can be provided to the texture image generation (312). The texture image generation (312) can then adjust the generation of the texture image based on the reconstructed geometry image. For example, when a patch shape (e.g., its geometry) is slightly distorted during encoding and decoding, the distortion can be taken into account when generating the texture image in order to correct for the distortion in the patch shape.

In some embodiments, the group extension (320) is configured to pad pixels around object boundaries with redundant low-frequency content in order to improve the coding gain as well as the visual quality of the reconstructed point cloud.

The multiplexer (324) can multiplex the compressed geometry image, the compressed texture image, the compressed occupancy map, and/or the compressed auxiliary patch information into a compressed bitstream.

FIG. 4 shows a block diagram of a V-PCC decoder (400) for decoding a compressed bitstream corresponding to point cloud frames, according to some embodiments. In some embodiments, the V-PCC decoder (400) can be used in the communication system (100) and the streaming system (200). For example, the decoder (210) can be configured to operate in a similar manner to the V-PCC decoder (400). The V-PCC decoder (400) receives the compressed bitstream and generates a reconstructed point cloud based on the compressed bitstream.

In the example of FIG. 4, the V-PCC decoder (400) includes a demultiplexer (432), video decompression modules (434) and (436), an occupancy map decompression module (438), an auxiliary patch information decompression module (442), a geometry reconstruction module (444), a smoothing module (446), a texture reconstruction module (448), and a color smoothing module (452).

The demultiplexer (432) can receive the compressed bitstream and separate it into a compressed texture image, a compressed geometry image, a compressed occupancy map, and compressed auxiliary patch information.
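The pairing of a multiplexer on the encoder side with a demultiplexer on the decoder side can be sketched as follows. This is a minimal sketch assuming a simple length-prefixed container; the substream names, the fixed ordering, and the 4-byte big-endian length field are hypothetical and do not reflect the actual V-PCC bitstream syntax.

```python
import struct

# Hypothetical fixed order of substreams inside the container.
SUBSTREAMS = ("geometry", "texture", "occupancy", "patch_info")


def multiplex(substreams):
    """Concatenate the substreams, each preceded by a 4-byte big-endian length."""
    out = bytearray()
    for name in SUBSTREAMS:
        payload = substreams[name]
        out += struct.pack(">I", len(payload)) + payload
    return bytes(out)


def demultiplex(bitstream):
    """Split a container produced by multiplex() back into named substreams."""
    result, offset = {}, 0
    for name in SUBSTREAMS:
        (length,) = struct.unpack_from(">I", bitstream, offset)
        offset += 4
        result[name] = bitstream[offset:offset + length]
        offset += length
    return result
```

Each recovered substream would then be handed to its own decompression module, as the following paragraphs describe.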

The video decompression modules (434) and (436) can decode the compressed images according to a suitable standard (e.g., HEVC, VVC, etc.) and output decompressed images. For example, the video decompression module (434) decodes the compressed texture image and outputs the decompressed texture image. The video decompression module (436) decodes the compressed geometry image and outputs the decompressed geometry image.

The occupancy map decompression module (438) can decode the compressed occupancy map according to a suitable standard (e.g., HEVC, VVC, etc.) and output the decompressed occupancy map.

The auxiliary patch information decompression module (442) can decode the compressed auxiliary patch information according to a suitable standard (e.g., HEVC, VVC, etc.) and output the decompressed auxiliary patch information.

The geometry reconstruction module (444) can receive the decompressed geometry image and generate the reconstructed point cloud geometry based on the decompressed occupancy map and the decompressed auxiliary patch information.
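One way to picture this step is the sketch below, which turns occupied pixels of a geometry (depth) image back into 3D points using per-patch placement data. The `Patch` fields (`u0`, `v0`, `normal_axis`, the three offsets) and the orthographic projection model are assumptions introduced for illustration; the source does not specify the layout of the auxiliary patch information.

```python
from dataclasses import dataclass


@dataclass
class Patch:
    """Hypothetical auxiliary patch information for one 2D patch."""
    u0: int                # left edge of the patch in the image
    v0: int                # top edge of the patch in the image
    width: int
    height: int
    normal_axis: int       # 0, 1 or 2: axis along which depth was projected
    depth_offset: int      # 3D offset along the normal axis
    tangent_offset: int    # 3D offset along the first in-plane axis
    bitangent_offset: int  # 3D offset along the second in-plane axis


def reconstruct_geometry(geometry, occupancy, patches):
    """Return a list of (x, y, z) points for every occupied patch pixel."""
    points = []
    for p in patches:
        tangent_axis, bitangent_axis = [a for a in range(3) if a != p.normal_axis]
        for v in range(p.v0, p.v0 + p.height):
            for u in range(p.u0, p.u0 + p.width):
                if not occupancy[v][u]:
                    continue  # padding pixel, carries no point
                point = [0, 0, 0]
                point[p.normal_axis] = p.depth_offset + geometry[v][u]
                point[tangent_axis] = p.tangent_offset + (u - p.u0)
                point[bitangent_axis] = p.bitangent_offset + (v - p.v0)
                points.append(tuple(point))
    return points
```

The occupancy map is what keeps padded background pixels from being turned into spurious points.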

The smoothing module (446) can smooth incongruences at the edges of patches. The smoothing procedure aims to alleviate potential discontinuities that may arise at patch boundaries due to compression artifacts. In some embodiments, a smoothing filter may be applied to the pixels located on the patch boundaries to alleviate the distortions that may be caused by compression/decompression.

The texture reconstruction module (448) can determine texture information for points in the point cloud based on the decompressed texture image and the smoothed geometry.

The color smoothing module (452) can smooth incongruences of coloring. Patches that are not neighbors in 3D space are often packed next to each other in the 2D video. In some examples, pixel values from non-neighboring patches may be mixed up by the block-based video codec. The goal of color smoothing is to reduce the visible artifacts that appear at patch boundaries.

FIG. 5 shows a block diagram of a video decoder (510) according to an embodiment of the present disclosure. The video decoder (510) can be used in the V-PCC decoder (400). For example, the video decompression modules (434) and (436) and the occupancy map decompression module (438) can be configured similarly to the video decoder (510).

The video decoder (510) may include an analyzer (520) to reconstruct symbols (521) from compressed images, such as an encoded video sequence. Categories of those symbols include information used to manage the operation of the video decoder (510). The analyzer (520) may parse/entropy-decode the encoded video sequence that is received. The coding of the encoded video sequence can be in accordance with a video coding technology or standard, and can follow various principles, including variable length coding, Huffman coding, arithmetic coding with or without context sensitivity, and so forth. The analyzer (520) may extract from the encoded video sequence a set of subgroup parameters for at least one of the subgroups of pixels in the video decoder, based on at least one parameter corresponding to the group. Subgroups can include groups of pictures (GOPs), pictures, tiles, slices, macroblocks, coding units (CUs), blocks, transform units (TUs), prediction units (PUs), and so forth. The analyzer (520) may also extract from the encoded video sequence information such as transform coefficients, quantization parameter values, motion vectors, and so forth.

The analyzer (520) may perform an entropy decoding/parsing operation on the video sequence received from a buffer memory, so as to create the symbols (521).

Reconstruction of the symbols (521) can involve multiple different units, depending on the type of the coded video picture or parts thereof (such as inter and intra picture, inter and intra block) and other factors. Which units are involved, and how, can be controlled by the subgroup control information parsed from the encoded video sequence by the analyzer (520). The flow of such subgroup control information between the analyzer (520) and the multiple units below is not depicted for clarity.

Beyond the functional blocks already mentioned, the video decoder (510) can be conceptually subdivided into a number of functional units, as described below. In a practical implementation operating under commercial constraints, many of these units interact closely with each other and can, at least partly, be integrated into each other. However, for the purpose of describing the disclosed subject matter, the conceptual subdivision into the functional units below is appropriate.

A first unit is the scaler/inverse transform unit (551). The scaler/inverse transform unit (551) receives quantized transform coefficients as well as control information, including which transform to use, block size, quantization factor, quantization scaling matrices, and so forth, as symbol(s) (521) from the analyzer (520). The scaler/inverse transform unit (551) can output blocks comprising sample values that can be input into the aggregator (555).
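The two stages implied by the name of this unit, scaling (dequantization) followed by an inverse transform, can be sketched as below. This is a simplified illustration assuming a single uniform quantization step and a floating-point orthonormal inverse DCT; real codecs such as HEVC use integer transform approximations and scaling matrices, and the function names here are hypothetical.

```python
import math


def dequantize(levels, qstep):
    """Scale quantized levels back to transform coefficients (uniform step assumed)."""
    return [[level * qstep for level in row] for row in levels]


def inverse_dct_2d(coeffs):
    """Orthonormal 2D inverse DCT of a square block, written for clarity, not speed."""
    n = len(coeffs)

    def basis(k, i):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        return scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))

    return [
        [
            sum(
                coeffs[ky][kx] * basis(ky, y) * basis(kx, x)
                for ky in range(n)
                for kx in range(n)
            )
            for x in range(n)
        ]
        for y in range(n)
    ]
```

A block whose only non-zero level is the DC coefficient reconstructs to a flat block, which is why smooth regions are cheap to code.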

In some cases, the output samples of the scaler/inverse transform unit (551) can pertain to an intra-coded block, that is, a block that does not use predictive information from previously reconstructed pictures but can use predictive information from previously reconstructed parts of the current picture. Such predictive information can be provided by the intra picture prediction unit (552). In some cases, the intra picture prediction unit (552) generates a block of the same size and shape as the block under reconstruction, using surrounding already-reconstructed information fetched from the current picture buffer (558). The current picture buffer (558) buffers, for example, the partly reconstructed current picture and/or the fully reconstructed current picture. The aggregator (555), in some cases, adds, on a per-sample basis, the prediction information that the intra prediction unit (552) has generated to the output sample information provided by the scaler/inverse transform unit (551).

In other cases, the output samples of the scaler/inverse transform unit (551) can pertain to an inter-coded, and potentially motion-compensated, block. In such a case, the motion compensation prediction unit (553) can access the reference picture memory (557) to fetch the samples used for prediction. After motion compensating the fetched samples in accordance with the symbols (521) pertaining to the block, these samples can be added by the aggregator (555) to the output of the scaler/inverse transform unit (551) (in this case called the residual samples or residual signal) so as to generate output sample information. The addresses within the reference picture memory (557) from which the motion compensation prediction unit (553) fetches prediction samples can be controlled by motion vectors, available to the motion compensation prediction unit (553) in the form of symbols (521) that can have, for example, X, Y, and reference picture components. Motion compensation can also include interpolation of sample values fetched from the reference picture memory (557) when sub-sample exact motion vectors are in use, motion vector prediction mechanisms, and so forth.
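The interaction between motion-compensated fetching and the aggregator can be sketched as follows. This is a minimal sketch assuming integer-pel motion vectors (no sub-sample interpolation), a single reference picture stored as a list of rows, and clipping to the sample bit depth; the function names are hypothetical.

```python
def motion_compensate(reference, x, y, mv, size):
    """Fetch a size x size prediction block displaced by the motion vector mv=(mvx, mvy)."""
    mvx, mvy = mv
    return [
        [reference[y + mvy + j][x + mvx + i] for i in range(size)]
        for j in range(size)
    ]


def aggregate(prediction, residual, bit_depth=8):
    """Add residual samples to prediction samples, sample by sample, with clipping."""
    max_value = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_value) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(prediction, residual)
    ]
```

The same `aggregate` step would apply to an intra-coded block; only the source of the prediction block differs.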

The output samples of the aggregator (555) can be subject to various loop filtering techniques in the loop filter unit (556). Video compression technologies can include in-loop filter technologies that are controlled by parameters included in the encoded video sequence (also referred to as the encoded video bitstream) and made available to the loop filter unit (556) as symbols (521) from the analyzer (520), but that can also be responsive to meta-information obtained during the decoding of previous (in decoding order) parts of the coded picture or encoded video sequence, as well as to previously reconstructed and loop-filtered sample values.

The output of the loop filter unit (556) can be a sample stream that can be output to a rendering device as well as stored in the reference picture memory (557) for use in future inter-picture prediction.

Certain coded pictures, once fully reconstructed, can be used as reference pictures for future prediction. For example, once a coded picture corresponding to the current picture is fully reconstructed and the coded picture has been identified as a reference picture (by, for example, the analyzer (520)), the current picture buffer (558) can become a part of the reference picture memory (557), and a fresh current picture buffer can be reallocated before commencing the reconstruction of the following coded picture.

The video decoder (510) may perform decoding operations according to a predetermined video compression technology in a standard, such as ITU-T Rec. H.265. The encoded video sequence may conform to the syntax specified by the video compression technology or standard being used, in the sense that the encoded video sequence adheres to both the syntax of the video compression technology or standard and the profiles documented in the video compression technology or standard. Specifically, a profile can select certain tools as the only tools available for use under that profile from all the tools available in the video compression technology or standard. Also necessary for compliance can be that the complexity of the encoded video sequence is within the bounds defined by the level of the video compression technology or standard. In some cases, levels restrict the maximum picture size, maximum frame rate, maximum reconstruction sample rate (measured in, for example, megasamples per second), maximum reference picture size, and so on. Limits set by levels can, in some cases, be further restricted through hypothetical reference decoder (HRD) specifications and metadata for HRD buffer management signaled in the encoded video sequence.

FIG. 6 shows a block diagram of a video encoder (603) according to an embodiment of the present disclosure. The video encoder (603) can be used in the V-PCC encoder (300) that compresses point clouds. In one example, the video compression modules (322) and (323) and the video compression module (332) are configured similarly to the encoder (603).

The video encoder (603) may receive images, such as padded geometry images and padded texture images, and generate compressed images.

According to an embodiment, the video encoder (603) may encode and compress the pictures of the source video sequence into an encoded video sequence (compressed images) in real time or under any other time constraints required by the application. Enforcing an appropriate coding speed is one function of the controller (650). In some embodiments, the controller (650) controls other functional units as described below and is functionally coupled to those functional units. The coupling is not depicted for clarity. Parameters set by the controller (650) can include rate-control-related parameters (picture skip, quantizer, lambda value of rate-distortion optimization techniques, and so on), picture size, group of pictures (GOP) layout, maximum motion vector search range, and so forth. The controller (650) can be configured to have other suitable functions that pertain to the video encoder (603) optimized for a certain system design.

In some embodiments, the video encoder (603) is configured to operate in a coding loop. As an oversimplified description, in one example, the coding loop can include a source encoder (630) (e.g., responsible for creating symbols, such as a symbol stream, based on an input picture to be encoded and reference picture(s)) and a (local) decoder (633) embedded in the video encoder (603). The decoder (633) reconstructs the symbols to create sample data in a similar manner as a (remote) decoder would (as any compression between the symbols and the encoded video bitstream is lossless in the video compression technologies considered in the disclosed subject matter). The reconstructed sample stream (sample data) is input to the reference picture memory (634). As the decoding of a symbol stream leads to bit-exact results independent of the decoder location (local or remote), the content in the reference picture memory (634) is also bit-exact between the local encoder and the remote encoder. In other words, the prediction part of an encoder "sees" as reference picture samples exactly the same sample values as a decoder would "see" when using prediction during decoding. This fundamental principle of reference picture synchronicity (and the resulting drift if synchronicity cannot be maintained, for example because of channel errors) is used in some related arts as well.
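The reference picture synchronicity principle can be illustrated with a deliberately tiny model in which each "picture" is a single sample value. The encoder below predicts from its own reconstruction (the role of the embedded local decoder) rather than from the source, so the decoder, which sees only the symbols, stays bit-exact with it. The quantization step `QSTEP` and the one-sample picture model are assumptions for illustration only.

```python
QSTEP = 4  # hypothetical quantization step; quantization is the only lossy stage


def encode(samples):
    """Return (symbols, reconstructions) for a sequence of one-sample 'pictures'."""
    reference, symbols, reconstructed = 0, [], []
    for sample in samples:
        level = round((sample - reference) / QSTEP)  # quantized prediction residual
        symbols.append(level)
        # Local decoder: rebuild the sample exactly as a remote decoder would,
        # and use that reconstruction as the next prediction reference.
        reference = reference + level * QSTEP
        reconstructed.append(reference)
    return symbols, reconstructed


def decode(symbols):
    """Remote decoder: reconstruct samples from the symbols alone."""
    reference, output = 0, []
    for level in symbols:
        reference = reference + level * QSTEP
        output.append(reference)
    return output
```

If the encoder instead predicted from the original samples, the quantization error of each picture would accumulate at the decoder, which is the drift mentioned above.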

The operation of the "local" decoder (633) can be the same as that of a "remote" decoder, such as the video decoder (510), which has already been described in detail above in conjunction with FIG. 5. Briefly referring also to FIG. 5, however, because symbols are available and the encoding/decoding of symbols to an encoded video sequence by the entropy encoder (645) and the analyzer (520) can be lossless, the entropy decoding parts of the video decoder (510), including the analyzer (520), may not be fully implemented in the local decoder (633).

An observation that can be made at this point is that any decoder technology, except the parsing/entropy decoding that is present in a decoder, also necessarily needs to be present, in substantially identical functional form, in a corresponding encoder. For this reason, the disclosed subject matter focuses on decoder operation. The description of encoder technologies can be abbreviated because they are the inverse of the comprehensively described decoder technologies. Only in certain areas is a more detailed description required, and it is provided below.

During operation, in some examples, the source encoder (630) may perform motion-compensated predictive coding, which codes an input picture predictively with reference to one or more previously coded pictures from the video sequence that were designated as "reference pictures." In this manner, the coding engine (632) codes the differences between pixel blocks of the input picture and pixel blocks of reference picture(s) that may be selected as prediction reference(s) for the input picture.

The local video decoder (633) may decode the encoded video data of pictures that may be designated as reference pictures, based on the symbols created by the source encoder (630). The operations of the coding engine (632) may advantageously be lossy processes. When the encoded video data is decoded at a video decoder (not shown in FIG. 6), the reconstructed video sequence typically may be a replica of the source video sequence with some errors. The local video decoder (633) replicates the decoding processes that may be performed by the video decoder on reference pictures and may cause the reconstructed reference pictures to be stored in the reference picture cache (634). In this manner, the video encoder (603) may locally store copies of reconstructed reference pictures that have common content with the reconstructed reference pictures that will be obtained by a far-end video decoder (absent transmission errors).

The predictor (635) may perform prediction searches for the coding engine (632). That is, for a new picture to be encoded, the predictor (635) may search the reference picture memory (634) for sample data (as candidate reference pixel blocks) or for certain metadata, such as reference picture motion vectors and block shapes, that may serve as an appropriate prediction reference for the new picture. The predictor (635) may operate on a sample-block-by-pixel-block basis to find appropriate prediction references. In some cases, as determined by the search results obtained by the predictor (635), an input picture may have prediction references drawn from multiple reference pictures stored in the reference picture memory (634).
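A prediction search of this kind is commonly illustrated with exhaustive block matching. The sketch below assumes a single reference picture, integer-pel displacements, and the sum of absolute differences (SAD) as the matching cost; none of these choices is stated in the source, and practical encoders typically use faster search patterns and rate-aware costs.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def get_block(picture, x, y, size):
    """Extract the size x size block whose top-left corner is (x, y)."""
    return [row[x:x + size] for row in picture[y:y + size]]


def motion_search(current, reference, x, y, size, search_range):
    """Return ((dx, dy), cost) of the best match for the current block at (x, y)."""
    height, width = len(reference), len(reference[0])
    target = get_block(current, x, y, size)
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = x + dx, y + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate block falls outside the reference picture
            cost = sad(target, get_block(reference, rx, ry, size))
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

The `search_range` argument plays the role of the maximum motion vector search range that the controller (650) is described as setting.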

The controller (650) may manage the coding operations of the source encoder (630), including, for example, the setting of parameters and subgroup parameters used for encoding the video data.

The output of all the aforementioned functional units may be subjected to entropy coding in the entropy encoder (645). The entropy encoder (645) translates the symbols generated by the various functional units into an encoded video sequence by losslessly compressing the symbols according to technologies such as Huffman coding, variable length coding, arithmetic coding, and so forth, thereby producing the compressed image (643).

The controller (650) may manage the operation of the video encoder (603). During coding, the controller (650) may assign to each coded picture a certain coded picture type, which may affect the coding techniques that may be applied to the respective picture. For example, pictures are often assigned as one of the following picture types.

An intra picture (I picture) may be one that can be coded and decoded without using any other picture in the sequence as a source of prediction. Some video codecs allow for different types of intra pictures, including, for example, independent decoder refresh ("IDR") pictures. A person skilled in the art is aware of those variants of I pictures and their respective applications and features.

A predictive picture (P picture) may be one that can be coded and decoded using intra prediction or inter prediction that uses at most one motion vector and reference index to predict the sample values of each block.

A bi-directionally predictive picture (B picture) may be one that can be coded and decoded using intra prediction or inter prediction that uses at most two motion vectors and reference indices to predict the sample values of each block. Similarly, multiple-predictive pictures can use more than two reference pictures and associated metadata for the reconstruction of a single block.

Source pictures commonly may be subdivided spatially into a plurality of sample blocks (for example, blocks of 4×4, 8×8, 4×8, or 16×16 samples each) and coded on a block-by-block basis. Blocks may be coded predictively with reference to other (already coded) blocks, as determined by the coding assignment applied to the blocks' respective pictures. For example, blocks of I pictures may be coded non-predictively, or they may be coded predictively with reference to already coded blocks of the same picture (spatial prediction or intra prediction). Pixel blocks of P pictures may be coded predictively, via spatial prediction or via temporal prediction, with reference to one previously coded reference picture. Blocks of B pictures may be coded predictively, via spatial prediction or via temporal prediction, with reference to one or two previously coded reference pictures.

The video encoder (603) may perform coding operations according to a predetermined video coding technology or standard, such as ITU-T Rec. H.265. In its operation, the video encoder (603) may perform various compression operations, including predictive coding operations that exploit temporal and spatial redundancies in the input video sequence. The encoded video data may therefore conform to the syntax specified by the video coding technology or standard being used.

A video may be in the form of a plurality of source pictures (images) in a temporal sequence. Intra-picture prediction (often abbreviated to intra prediction) makes use of spatial correlation in a given picture, and inter-picture prediction makes use of the (temporal or other) correlation between pictures. In an example, a specific picture under encoding/decoding, which is referred to as the current picture, is partitioned into blocks. When a block in the current picture is similar to a reference block in a previously coded and still buffered reference picture in the video, the block in the current picture can be coded by a vector that is referred to as a motion vector. The motion vector points to the reference block in the reference picture and can have a third dimension identifying the reference picture, in case multiple reference pictures are in use.

In some embodiments, a bi-prediction technique can be used in inter-picture prediction. According to the bi-prediction technique, two reference pictures are used, such as a first reference picture and a second reference picture, both of which precede the current picture in the video in decoding order (but may be in the past and future, respectively, in display order). A block in the current picture can be coded by a first motion vector that points to a first reference block in the first reference picture and a second motion vector that points to a second reference block in the second reference picture. The block can be predicted by a combination of the first reference block and the second reference block.
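The "combination" of the two reference blocks is left open in the text. As a minimal sketch, the example below assumes the simplest choice, an equal-weight average with rounding; weighted combinations are also possible, and the function name is hypothetical.

```python
def bi_predict(first_block, second_block):
    """Combine two reference blocks by a rounded, equal-weight average (assumed rule)."""
    return [
        [(a + b + 1) >> 1 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(first_block, second_block)
    ]
```

Each input block would be fetched from its own reference picture using its own motion vector before being combined.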

Further, a merge mode technique can be used in inter-picture prediction to improve coding efficiency.

According to some embodiments of the present disclosure, predictions such as inter-picture prediction and intra-picture prediction are performed in units of blocks. For example, according to the HEVC standard, a picture in a sequence of video pictures is partitioned into coding tree units (CTUs) for compression, and the CTUs in a picture have the same size, such as 64×64 pixels, 32×32 pixels, or 16×16 pixels. In general, a CTU includes three coding tree blocks (CTBs): one luma CTB and two chroma CTBs. Each CTU can be recursively quadtree split into one or more coding units (CUs). For example, a CTU of 64×64 pixels can be split into one CU of 64×64 pixels, four CUs of 32×32 pixels, or 16 CUs of 16×16 pixels. In one example, each CU is analyzed to determine a prediction type for the CU, such as an inter prediction type or an intra prediction type. The CU is split into one or more prediction units (PUs) depending on the temporal and/or spatial predictability. In general, each PU includes a luma prediction block (PB) and two chroma PBs. In one embodiment, a prediction operation in coding (encoding/decoding) is performed in units of prediction blocks. Using a luma prediction block as an example of a prediction block, the prediction block includes a matrix of pixel values (e.g., luma values), such as 8×8 pixels, 16×16 pixels, 8×16 pixels, 16×8 pixels, and the like.

A G-PCC model can compress geometry information and associated attributes, such as color or reflectance, separately. The geometry information, which is the 3D coordinates of the point cloud, can be coded by an octree decomposition of its occupancy information. The attributes, on the other hand, can be compressed based on the reconstructed geometry using prediction and lifting techniques. The octree partitioning process is described, for example, with reference to FIGS. 7 to 13.

FIG. 7 shows a block diagram of a G-PCC decoder (800) that is applied during a G-PCC decomposition process according to one embodiment. The decoder (800) can be configured to receive a compressed bitstream and perform point cloud data decompression to decompress the bitstream and generate decoded point cloud data. In one embodiment, the decoder (800) can include an arithmetic decoding module (810), an inverse quantization module (820), an octree decoding module (830), an LOD generation module (840), an inverse quantization module (850), and an inverse interpolation-based prediction module (860).

As shown, a compressed bitstream (801) can be received at the arithmetic decoding module (810). The arithmetic decoding module (810) is configured to decode the compressed bitstream (801) to obtain quantized prediction residuals (if generated) and occupancy codes (or symbols) of a point cloud. The octree decoding module (830) is configured to generate quantized positions of points in the point cloud according to the occupancy codes. The inverse quantization module (850) is configured to generate reconstructed positions of the points in the point cloud based on the quantized positions provided by the octree decoding module (830).

The LOD generation module (840) is configured to reorganize the points into different LODs based on the reconstructed positions and to determine an LOD-based order. The inverse quantization module (820) is configured to generate reconstructed prediction residuals based on the quantized prediction residuals received from the arithmetic decoding module (810). The inverse interpolation-based prediction module (860) is configured to perform an attribute prediction process to generate reconstructed attributes of the points in the point cloud based on the reconstructed prediction residuals received from the inverse quantization module (820) and the LOD-based order received from the LOD generation module (840).

Further, the reconstructed attributes generated by the inverse interpolation-based prediction module (860), together with the reconstructed positions generated by the inverse quantization module (850), correspond in one example to a decoded point cloud (or reconstructed point cloud) (802) that is output from the decoder (800).

FIG. 8 shows a block diagram of a G-PCC encoder (700) according to one embodiment. The encoder (700) can be configured to receive point cloud data, compress the point cloud data, and generate a bitstream carrying the compressed point cloud data. In one embodiment, the encoder (700) can include a position quantization module (710), a duplicate point removal module (712), an octree encoding module (730), an attribute transfer module (720), a level of detail (LOD) generation module (740), an interpolation-based prediction module (750), a residual quantization module (760), and an arithmetic coding module (770).

As shown, an input point cloud (701) can be received at the encoder (700). The positions (e.g., 3D coordinates) of the point cloud (701) are provided to the position quantization module (710). The position quantization module (710) is configured to quantize the coordinates to generate quantized positions. The duplicate point removal module (712) is configured to receive the quantized positions and perform a filtering process to identify and remove duplicate points. The octree encoding module (730) is configured to receive the filtered positions from the duplicate point removal module (712) and perform an octree-based encoding process to generate a sequence of occupancy codes (or symbols) that describe a 3D grid of voxels. The occupancy codes are provided to the arithmetic coding module (770).
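
The first two encoder stages can be illustrated with a minimal sketch. The function names, the uniform step size, and round-to-nearest quantization are assumptions made for illustration only; the description above does not specify how the position quantization module (710) or the duplicate point removal module (712) is implemented.

```python
def quantize_positions(points, origin, step):
    """Map float (x, y, z) positions to integer voxel coordinates.

    Round-to-nearest with a uniform step is an assumption for this sketch.
    """
    return [
        tuple(int(round((c - o) / step)) for c, o in zip(p, origin))
        for p in points
    ]


def remove_duplicates(voxels):
    """Drop repeated voxel coordinates, keeping first-seen order."""
    seen = set()
    unique = []
    for v in voxels:
        if v not in seen:
            seen.add(v)
            unique.append(v)
    return unique


points = [(0.10, 0.20, 0.30), (0.12, 0.21, 0.29), (1.00, 2.00, 3.00)]
voxels = quantize_positions(points, origin=(0.0, 0.0, 0.0), step=0.1)
unique = remove_duplicates(voxels)  # the first two points share one voxel
```

Here the first two input points fall into the same voxel after quantization, which is the situation in which the attribute transfer process has to choose a single attribute value per voxel.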

The attribute transfer module (720) is configured to receive attributes of the input point cloud and, when multiple attribute values are associated with a voxel, to perform an attribute transfer process that determines an attribute value for that voxel. The attribute transfer process can be performed on the reordered points output from the octree encoding module (730). The attributes after the transfer operation are provided to the interpolation-based prediction module (750). The LOD generation module (740) is configured to operate on the reordered points output from the octree encoding module (730) and to reorganize the points into different LODs. The LOD information is supplied to the interpolation-based prediction module (750).

The interpolation-based prediction module (750) processes the points according to the LOD-based order indicated by the LOD information from the LOD generation module (740) and the transferred attributes received from the attribute transfer module (720), and generates prediction residuals. The residual quantization module (760) is configured to receive the prediction residuals from the interpolation-based prediction module (750) and perform quantization to generate quantized prediction residuals. The quantized prediction residuals are provided to the arithmetic coding module (770). The arithmetic coding module (770) is configured to receive the occupancy codes from the octree encoding module (730), the candidate indices (if used), the quantized prediction residuals from the interpolation-based prediction module (750), and other information, and to perform entropy coding to further compress the received values or information. As a result, a compressed bitstream (702) carrying the compressed information can be generated. The bitstream (702) may be transmitted, or otherwise provided, to a decoder that decodes the compressed bitstream, or may be stored in a storage device.

It is noted that the interpolation-based prediction module (750) and the inverse interpolation-based prediction module (860), which are configured to implement the attribute prediction techniques disclosed herein, can be included in other decoders or encoders that may have structures similar to or different from those shown in FIG. 7 and FIG. 8. In addition, the encoder (700) and the decoder (800) can be included in the same device or, in various examples, in separate devices.

In various embodiments, the encoder (300), the decoder (400), the encoder (700), and/or the decoder (800) can be implemented in hardware, software, or a combination thereof. For example, the encoder (300), the decoder (400), the encoder (700), and/or the decoder (800) can be implemented with processing circuitry such as one or more integrated circuits (ICs) that operate with or without software, such as an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), and the like. In another example, the encoder (300), the decoder (400), the encoder (700), and/or the decoder (800) can be implemented as software or firmware including instructions stored in a non-volatile (or non-transitory) computer-readable storage medium. The instructions, when executed by processing circuitry such as one or more processors, cause the processing circuitry to perform the functions of the encoder (300), the decoder (400), the encoder (700), and/or the decoder (800).

Partitioning a point cloud defined by a 3D cube symmetrically along all axes (e.g., the x, y, and z axes) yields eight sub-cubes, which is known as octree (OT) partitioning in point cloud compression (PCC). OT partitioning is analogous to binary-tree (BT) partitioning in one-dimensional space and quadtree (QT) partitioning in two-dimensional space. The concept of OT partitioning is illustrated in FIG. 9, where a 3D cube (900) drawn in solid lines is partitioned into eight smaller, equal-sized cubes, numbered 0 to 7, drawn in dashed lines.

In an octree partitioning technique (e.g., in TMC13), when the octree geometry codec is used, geometry encoding proceeds as follows. First, a cubic axis-aligned bounding box B can be defined by the two extreme points (0, 0, 0) and (2^d, 2^d, 2^d), where 2^d defines the size of the bounding box B and d can be encoded into the bitstream. All points inside the defined bounding box B can then be compressed.

An octree structure can then be built by recursively subdividing the bounding box B. At each stage, a cube can be subdivided into eight sub-cubes. After k (k ≤ d) iterative subdivisions, the size of a sub-cube is (2^(d-k), 2^(d-k), 2^(d-k)). An 8-bit code, such as an occupancy code, can then be generated by associating a 1-bit value with each sub-cube to indicate whether the sub-cube contains points (i.e., is full and has the value 1) or not (i.e., is empty and has the value 0). Only full sub-cubes with a size greater than 1 (i.e., non-voxels) are subdivided further. The occupancy code of each cube can then be compressed by an arithmetic encoder.
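
The generation of one 8-bit occupancy code can be sketched as follows. This is a minimal illustration, not the TMC13 implementation: the sub-cube index formula 4*x + 2*y + z and the convention that the leftmost character of the code corresponds to sub-cube 0 are assumptions chosen to agree with the numbering used in the figures described in this text.

```python
def child_index(point, origin, half):
    """Index 0-7 of the sub-cube containing `point`.

    Assumed ordering: 4 * x_bit + 2 * y_bit + z_bit.
    """
    x_bit = int(point[0] >= origin[0] + half)
    y_bit = int(point[1] >= origin[1] + half)
    z_bit = int(point[2] >= origin[2] + half)
    return 4 * x_bit + 2 * y_bit + z_bit


def occupancy_code(points, origin, size):
    """8-character bit string; character i is '1' when sub-cube i holds a point."""
    half = size // 2
    occupied = [False] * 8
    for p in points:
        occupied[child_index(p, origin, half)] = True
    return "".join("1" if o else "0" for o in occupied)


# Two points in opposite corners of a cube of size 4 (d = 2).
code = occupancy_code([(0, 0, 0), (3, 3, 3)], origin=(0, 0, 0), size=4)
# code == "10000001": only sub-cubes 0 and 7 are occupied
```

An encoder would repeat this step for every occupied sub-cube whose size is still greater than 1 and pass the resulting codes to the arithmetic encoder.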

The decoding process can start by reading the dimensions of the bounding box B from the bitstream. The same octree structure can then be built by subdividing the bounding box B according to the decoded occupancy codes. An example of two-level OT partitioning and the corresponding occupancy codes is shown in FIG. 10, where shaded cubes and nodes indicate that they are occupied by points.

FIG. 10 shows an example of an octree partition (1010) and an octree structure (1020) corresponding to the octree partition (1010), according to some embodiments of the present disclosure. FIG. 10 shows two levels of partitioning in the octree partition (1010). The octree structure (1020) includes a node (N0) that corresponds to the cubic box of the octree partition (1010). At the first level, the cubic box is partitioned into eight sub-cubic boxes that are numbered 0 to 7 according to the numbering technique shown in FIG. 9. The occupancy code for the partition of node N0 is "10000001" in binary, which indicates that the first sub-cubic box, represented by node N0-0, and the eighth sub-cubic box, represented by node N0-7, contain points of the point cloud, and that the other sub-cubic boxes are empty.

Then, at the second level of partitioning, the first sub-cubic box (represented by node N0-0) and the eighth sub-cubic box (represented by node N0-7) are each further subdivided into eight octants. For example, the first sub-cubic box (represented by node N0-0) is partitioned into eight smaller sub-cubic boxes that are numbered 0 to 7 according to the numbering technique shown in FIG. 9. The occupancy code for the partition of node N0-0 is "00011000" in binary, which indicates that the fourth smaller sub-cubic box (represented by node N0-0-3) and the fifth smaller sub-cubic box (represented by node N0-0-4) contain points of the point cloud, and that the other smaller sub-cubic boxes are empty. At the second level, the eighth sub-cubic box (represented by node N0-7) is similarly partitioned into eight smaller sub-cubic boxes, as shown in FIG. 10.
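
The way a decoder can rebuild the node tree from the two occupancy codes quoted above can be sketched as follows. The helper names are hypothetical, and reading the leftmost bit as sub-box 0 is an assumption (both quoted codes are palindromes, so the text does not settle the bit order). The occupancy code of node N0-7 is not given in the text, so that node is left unexpanded here rather than guessed.

```python
def occupied_children(code):
    """Indices of '1' bits, reading the leftmost bit as child 0 (assumed)."""
    if len(code) != 8 or set(code) - {"0", "1"}:
        raise ValueError("expected an 8-character binary string")
    return [i for i, bit in enumerate(code) if bit == "1"]


def expand(node, codes):
    """List `node` and every occupied descendant, depth first.

    `codes` maps a node name to its occupancy code; a node without an
    entry is treated as a leaf.
    """
    result = [node]
    for i in occupied_children(codes.get(node, "00000000")):
        result.extend(expand(f"{node}-{i}", codes))
    return result


# Occupancy codes quoted in the description of FIG. 10.
fig10_codes = {"N0": "10000001", "N0-0": "00011000"}
nodes = expand("N0", fig10_codes)
# nodes == ["N0", "N0-0", "N0-0-3", "N0-0-4", "N0-7"]
```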

In the example of FIG. 10, the nodes that correspond to non-empty cubic spaces (e.g., a cubic box, sub-cubic boxes, smaller sub-cubic boxes, and the like) are shaded in gray and are referred to as shaded nodes.

In the original TMC13 design, for example, as described above, the bounding box B can be restricted to a cube that has the same size in all dimensions. OT partitioning can therefore be performed on all sub-cubes at each node, with the sub-cubes halved in size in every dimension, and can proceed recursively until the size of the sub-cubes reaches 1. However, partitioning in this manner is not efficient in all cases, especially when the points are non-uniformly distributed in the 3D scene (or 3D space).

One extreme case is a 2D plane in 3D space, where all points lie on the x-y plane of the 3D space and the variation along the z-axis is zero. In such a case, OT partitioning performed on a cube B as the starting point can waste a large number of bits on occupancy information in the z direction, which is redundant and not useful. In practical applications, this worst case may not occur often. However, it is common to have a point cloud with a smaller variance in one direction than in the others. As shown in FIG. 11, the point cloud sequence named "ford_01_vox1mm" in TMC13 can have its principal components in the x and y directions. In fact, much of the point cloud data generated by LiDAR systems can have the same characteristic.

In quadtree and binary-tree (QtBt) partitioning, the bounding box B need not be restricted to a cube. Instead, the bounding box B can be a rectangular cuboid of arbitrary size that better fits the shape of the 3D scene or object. In an implementation, the size of the bounding box B can be expressed in powers of two, for example (2^(d_x), 2^(d_y), 2^(d_z)).
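
As a small illustration of the power-of-two sizes, the exponents d_x, d_y, and d_z of the tightest such box can be computed per axis. The sketch assumes non-negative integer (already quantized) coordinates with the box anchored at the origin; the function name is hypothetical.

```python
def bbox_log2_dims(points):
    """Smallest (d_x, d_y, d_z) such that every coordinate is below 2**d.

    Assumes non-negative integer coordinates and a box anchored at (0, 0, 0).
    """
    return tuple(
        max(p[axis] for p in points).bit_length() for axis in range(3)
    )


# A nearly planar cloud: wide in x and y, thin in z.
dims = bbox_log2_dims([(0, 0, 0), (1000, 700, 3)])
# dims == (10, 10, 2), whereas a cubic box would need d = 10 on every axis
```

For the nearly planar example, a cubic bounding box would spend eight extra levels of z-direction occupancy information that the cuboid avoids.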

Because the bounding box B may not be a perfect cube, in some cases a node may not (or cannot) be partitioned along all directions. If a partition is performed in all three directions, it is a typical OT partition. If a partition is performed in two of the three directions, it is a QT partition in 3D. If a partition is performed in only one direction, it is a BT partition in 3D. Examples of QT and BT in 3D are shown in FIG. 12 and FIG. 13, respectively.

As shown in FIG. 12, a 3D cube 1201 can be partitioned along the x-y axes into four sub-cubes 0, 2, 4, and 6. A 3D cube 1202 can be partitioned along the x-z axes into four sub-cubes 0, 1, 4, and 5. A 3D cube 1203 can be partitioned along the y-z axes into four sub-cubes 0, 1, 2, and 3. In FIG. 13, a 3D cube 1301 can be partitioned along the x-axis into two sub-cubes 0 and 4. A 3D cube 1302 can be partitioned into two sub-cubes 0 and 2. A 3D cube 1303 can be partitioned into two sub-cubes 0 and 1.
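
The sub-cube labels listed for FIG. 12 and FIG. 13 are consistent with an index of the form 4*x + 2*y + z, where each term is present only when the node is split along that axis. This formula is inferred from the labels quoted above rather than stated in the text; under that assumption, a short sketch reproduces every listed label set, and suggests that cubes 1302 and 1303 correspond to splits along the y-axis and the z-axis, respectively.

```python
def child_indices(split_x, split_y, split_z):
    """Sub-cube labels produced when splitting along the selected axes.

    Assumed labeling: 4 * x_bit + 2 * y_bit + z_bit, with the bit fixed
    to 0 on any axis that is not split.
    """
    x_bits = (0, 1) if split_x else (0,)
    y_bits = (0, 1) if split_y else (0,)
    z_bits = (0, 1) if split_z else (0,)
    return [4 * x + 2 * y + z for x in x_bits for y in y_bits for z in z_bits]


qt_xy = child_indices(True, True, False)   # [0, 2, 4, 6], as for cube 1201
bt_x = child_indices(True, False, False)   # [0, 4], as for cube 1301
```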

Two parameters (i.e., K and M) can be applied to define the conditions for implicit QT and BT partitioning in TMC13. The first parameter K (0 ≤ K ≤ max(d_x, d_y, d_z) - min(d_x, d_y, d_z)) defines the maximum number of implicit QT and BT partitions that can be performed before OT partitioning. The second parameter M (0 ≤ M ≤ min(d_x, d_y, d_z)) defines the minimum size of implicit QT and BT partitions and indicates that implicit QT and BT partitions are allowed only when all dimensions are larger than M.

More specifically, the first K partitions can follow the rules in Table I, and the partitions after the first K partitions can follow the rules in Table II. If none of the conditions listed in the tables is met, an OT partition can be performed.

In one embodiment, the bounding box B can have a size of 2^(d_x, d_y, d_z), that is, 2^(d_x) by 2^(d_y) by 2^(d_z). Without loss of generality, the condition 0 < d_x ≤ d_y ≤ d_z can be applied to the bounding box B. Under this condition, at the first K (K ≤ d_z - d_x) depths, implicit BT partitioning can be performed along the z-axis and implicit QT partitioning can be performed along the y-z axes according to Table I. The size of the sub-nodes then becomes 2^(d_x, d_x + δ_y, d_x + δ_z), where the values of δ_y and δ_z (δ_z ≥ δ_y ≥ 0) depend on the value of K. OT partitioning can then be performed d_x - M times, so that the remaining sub-nodes have a size of 2^(M, M + δ_y, M + δ_z). Next, according to Table II, implicit BT partitioning can be performed δ_z - δ_y times along the z-axis, followed by implicit QT partitioning δ_y times along the y-z axes. The remaining nodes then have a size of 2^(M, M, M), so OT partitioning can be performed M times to reach the smallest unit.
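
The order of partitions described above (implicit BT/QT, then OT, then implicit BT/QT, then OT) can be sketched as a small simulation. Tables I and II are not reproduced in this text, so the selection rule used here, splitting only along the axes that currently have the largest log2 size, is an assumption standing in for both tables; the function name is hypothetical.

```python
def partition_schedule(dims, K, M):
    """List the axes split at each depth for log2 sizes dims = (d_x, d_y, d_z)."""
    d = list(dims)
    assert 0 <= K <= max(d) - min(d) and 0 <= M <= min(d)
    steps = []
    while max(d) > 0:
        # Implicit QT/BT is considered during the first K depths and again
        # once the smallest dimension has reached M.
        implicit = len(steps) < K or min(d) == M
        if implicit and min(d) < max(d):
            # Assumed stand-in for Tables I/II: split only the largest axes.
            axes = [i for i in range(3) if d[i] == max(d)]
        else:
            # Octree step: split every axis that can still be halved.
            axes = [i for i in range(3) if d[i] > 0]
        for i in axes:
            d[i] -= 1
        steps.append("".join("xyz"[i] for i in axes))
    return steps


steps = partition_schedule((2, 3, 5), K=2, M=1)
# steps == ["z", "z", "xyz", "yz", "xyz"]
```

In this example the first two depths are BT splits along z, followed by one OT split (d_x - M = 1), one QT split along y-z, and a final OT split (M = 1), which matches the sequence of stages in the paragraph above.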

QtBt partitioning provides implicit rules for how to partition a given cuboid by switching among octree, quadtree, and binary-tree splits at each level of node decomposition. After an initial K levels of decomposition by QtBt partitioning according to one rule (e.g., Table I), another round of QtBt partitioning can be performed according to another rule (e.g., Table II). In this process, if none of the conditions in the rules is met, octree decomposition (or octree partitioning) can be applied.

The implicit rules can affect the effectiveness of QtBt as follows: (1) for point cloud data with a nearly symmetric cuboid bounding box along the x, y, and z dimensions, QtBt partitioning (e.g., implicit QtBt partitioning) has not shown a coding gain over the related method that performs Ot (octree) decomposition at every level; and (2) for point cloud data with a highly asymmetric cuboid bounding box along the x, y, and z dimensions, QtBt partitioning has shown a coding gain by skipping the transmission of unnecessary occupancy information during decomposition.

The current QtBt partitioning is subject to certain limitations. First, QtBt partitioning can always force the use of an asymmetric bounding box, which may be unhelpful or even counterproductive when the point cloud has a nearly symmetric bounding box. Second, Table I together with the parameter K can reduce the larger dimensions according to the rules by performing Qt/Bt partitioning instead of Ot partitioning. However, when the bounding box is symmetric, Table I together with the parameter K may not allow Qt or Bt partitioning at the beginning. Third, Table II can be applied after the K partitions above and is triggered once the smallest dimension of a sub-box reaches M. Table II can thus reduce the larger dimensions according to the rules until all dimensions are equal to M. Fourth, the current implicit rules (or implicit QtBt partitioning) can always require octree decomposition after the first (at most) K levels until the current QtBt partitioning reaches level M. In other words, the current QtBt partitioning may not allow an arbitrary choice among Qt/Bt/Ot partitions between those two levels.

The present disclosure provides a number of methods. The methods provide a simplification of the QtBt design (e.g., implicit QtBt partitioning) in TMC13 for typical use cases, for example based on the discussion above. The methods also allow more flexible partitioning, for example by explicitly signaling the type of node decomposition at each level.

In one embodiment, a first partitioning method (or simplified QtBt partitioning) can be provided. The first partitioning method can be a special case of implicit QtBt partitioning that is applied to datasets with highly asymmetric bounding boxes by setting K = 0 and M = 0. The first partitioning method can simplify the QtBt design (e.g., QtBt partitioning) while still providing the coding benefit for the typical cases described above.

Compared with QtBt partitioning in TMC13, the first partitioning method can include the following features: (1) The implicit enable flag (e.g., implicit_qtbt_enabled_flag) of QtBt partitioning in TMC13 can be removed. (2) An asymmetric bounding box flag (e.g., asymmetric_bbox_enabled_flag) can be introduced to enable the use of an asymmetric bounding box. In one example, the asymmetric bounding box flag is set to a value such as 0 (also referred to as a second value) for symmetric or nearly symmetric bounding box data, and to a value such as 1 (also referred to as a first value) for highly asymmetric bounding box data. (3) When the asymmetric bounding box flag has the first value, the implicit QtBt rules (e.g., Table I and Table II) with K = 0 and M = 0 can be applied when the node decomposition level reaches 0 (or the last level). Otherwise, when the asymmetric bounding box flag has the second value, the first partitioning method can perform octree decomposition (or octree partitioning).

According to the first partitioning method, the implicit QtBt rules shown in Table II can be applied with M = 0, as shown in Table III, to skip the transmission of unnecessary occupancy information along some dimensions.

In one embodiment, a second partitioning method (or explicit QtBt partitioning) can be provided to send explicit signaling of split decisions. The explicit signaling can be provided in contrast to the use of fixed implicit rules in the current QtBt partitioning.

The second partitioning method can include the following features: (1) An explicit QtBt enabled flag (e.g., explicit_qtbt_enabled_flag) can be introduced to enable or disable explicit split-decision signaling, while the asymmetric bounding box flag from the first partitioning method can still be used. (2) When the explicit QtBt enabled flag is set to a value such as 0 (or a second value), the second partitioning method falls back to (or can be equal to) the first partitioning method described above. Accordingly, if the asymmetric bounding box flag (e.g., asymmetric_bbox_enabled_flag) is a value such as 1 (or the first value), the implicit QtBt rules (e.g., Tables I and II) with K = 0 and M = 0 can be applied when the node decomposition level reaches 0 (or the last level). If the asymmetric bounding box flag is the second value, the second partitioning method can perform octree decomposition (or octree partitioning). In one embodiment, when the asymmetric bounding box flag is not used and the explicit QtBt enabled flag is set to the second value (e.g., 0), the second partitioning method can apply octree decomposition (or octree partitioning) at all levels. (4) When the explicit QtBt enabled flag is set to the first value (e.g., 1), instead of always performing octree partitioning until the level reaches 0 (or the last level) as described for the first partitioning method, a 3-bit signal can be sent at each octree level to indicate whether to split along each of the x, y, and z axes. The 3-bit signal can thus indicate whether a Bt, Qt, or Ot partition is applied at each octree level. In some embodiments, the implicit QtBt rules in TMC13 (e.g., Tables I and II) can be applied to determine the 3-bit signal at each octree level.
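The flag combinations described for the second partitioning method can be summarized as a small decision function. This is only a sketch: the function name and the returned labels are hypothetical, and the flag is passed as `None` to model the embodiment in which the asymmetric bounding box flag is not used.

```python
from typing import Optional


def second_method_mode(explicit_qtbt_enabled_flag: int,
                       asymmetric_bbox_enabled_flag: Optional[int] = None) -> str:
    """Return a (hypothetical) label for the behavior selected by the two flags."""
    if explicit_qtbt_enabled_flag == 1:
        # A 3-bit split signal is sent at every level (Bt, Qt, or Ot per level).
        return "explicit_3bit_per_level"
    if asymmetric_bbox_enabled_flag == 1:
        # Fallback to the first method: octree splits, then the implicit
        # QtBt rules (Tables I and II) with K = 0 and M = 0 at the last level.
        return "octree_then_implicit_qtbt_k0_m0"
    # Flag is 0, or not used at all (None): octree decomposition at every level.
    return "octree_all_levels"
```

For instance, `second_method_mode(0, 1)` selects the fallback to the first partitioning method, while `second_method_mode(1, 0)` selects explicit per-level signaling.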

Note that when the explicit QtBt enabled flag is set to the first value in the second partitioning method, Ot, Qt, and Bt partitions are permitted in any order along the way, so the maximum possible total number of splits can be three times the difference between the maximum node depth and the minimum node depth.

In one embodiment of the present disclosure, a third partitioning method (or explicit QtBt type-2 partitioning) can be provided to send explicit signaling of split decisions. The explicit signaling can be provided in contrast to the fixed implicit rules used in the current QtBt partitioning (e.g., Tables I and II). Compared with the current QtBt partitioning, the third partitioning method can include the following features: (1) An explicit QtBt enabled flag (e.g., explicit_qtbt_enabled_flag) can replace the implicit QtBt enabled flag (e.g., implicit_qtbt_enabled_flag) of the QtBt partitioning in TMC13 to enable or disable explicit split-decision signaling. (2) Only when the explicit QtBt enabled flag is a value such as 1 (or the first value) can an asymmetric bounding box flag (e.g., asymmetric_bbox_enabled_flag) be further signaled to enable or disable the use of an asymmetric bounding box. When the asymmetric bounding box flag is the first value, the dimensions (i.e., sizes) of the asymmetric bounding box along x, y, and z can be further signaled, as opposed to only the maximum of the three. Accordingly, the asymmetric bounding box flag can be set to a value such as 0 (e.g., the second value) for symmetric or nearly symmetric bounding box data, and to the first value (e.g., 1) for highly asymmetric bounding box data. (3) When the explicit QtBt enabled flag is set to the first value, a 3-bit signal can be sent at each octree level (or octree partition level) to indicate whether to split along each of the x, y, and z axes. In one embodiment, the implicit QtBt rules in TMC13 (e.g., Tables I and II) can be applied to determine the 3-bit signal at each octree level. In another embodiment, other splitting rules can be applied to determine the 3-bit signal at each octree level. The other splitting rules can facilitate the coding of octree occupancy information and can further take into account characteristics of the data (e.g., the octree occupancy information) or the data acquisition mechanism. (4) When the explicit QtBt enabled flag is set to the second value (e.g., 0), the third partitioning method can apply octree decomposition (or octree partitioning) at all levels.
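The conditional signaling order of the third partitioning method can be illustrated with a short parsing sketch. It assumes the syntax element values have already been entropy-decoded into a sequence of integers, and it assumes that a single size value is read whenever the asymmetric bounding box is not in use; the function name, dictionary keys, and that single-size convention are illustrative and not taken from the disclosure or from TMC13.

```python
from typing import Dict, Iterator


def parse_third_method_header(values: Iterator[int]) -> Dict[str, object]:
    """Consume already-decoded syntax element values in the assumed order."""
    header: Dict[str, object] = {}
    header["explicit_qtbt_enabled_flag"] = next(values)
    header["asymmetric_bbox_enabled_flag"] = 0
    if header["explicit_qtbt_enabled_flag"] == 1:
        # The asymmetric flag is only signaled when explicit QtBt is enabled.
        header["asymmetric_bbox_enabled_flag"] = next(values)
    if header["asymmetric_bbox_enabled_flag"] == 1:
        # Per-axis sizes are signaled for an asymmetric bounding box.
        header["bbox_size"] = (next(values), next(values), next(values))
    else:
        # Assumption: one size (the maximum of the three) covers all axes.
        size = next(values)
        header["bbox_size"] = (size, size, size)
    return header
```

Calling `parse_third_method_header(iter([1, 1, 10, 8, 3]))` reads both flags and then three per-axis sizes, whereas `iter([0, 9])` reads only the first flag and one size.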

Note that when the explicit QtBt enabled flag is set to the first value in the third partitioning method, Ot, Qt, and Bt partitions are permitted in any order along the way, so the maximum possible total number of splits can be three times the difference between the maximum node depth and the minimum node depth.

In one embodiment of the present disclosure, a fourth partitioning method (or flexible QtBt partitioning) can be provided to offer more flexibility in the use of QtBt partitioning through additional signaling of the QtBt partition type as either explicit or implicit. The additional signaling can be provided in contrast to the fixed implicit rules used in the current QtBt partitioning (e.g., Tables I and II).

The fourth partitioning method can include the following: (1) A QtBt enabled flag (e.g., qtbt_enabled_flag) can replace the implicit QtBt enabled flag of the QtBt partitioning in TMC13 to indicate the use of QtBt partitioning with more flexibility. (2) When the QtBt enabled flag is set to a value such as 1, a QtBt type flag (e.g., qtbt_type_flag) can be further signaled. (3) When the QtBt type flag is set to a value such as 0, the current implicit QtBt scheme (e.g., Tables I and II) can be applied. In addition, an asymmetric bounding box flag (e.g., asymmetric_bbox_enabled_flag) can be further signaled to selectively enable or disable the use of an asymmetric bounding box. In one embodiment, when the asymmetric bounding box flag is set to a value such as 1, the dimensions (i.e., sizes) of the asymmetric bounding box along x, y, and z can be signaled, as opposed to only the maximum of the three. In another embodiment, the asymmetric bounding box flag is not signaled and the asymmetric bounding box is always used.

The fourth partitioning method can also include the following: (4) When the QtBt type flag is a value such as 1, a 3-bit signal can be sent at each octree level to indicate whether to split along each of the x, y, and z axes. In one embodiment, the implicit QtBt rules in TMC13 (e.g., Tables I and II) can be applied to determine the 3-bit signal at each octree level. In another embodiment, other splitting rules can be applied to determine the 3-bit signal at each octree level. The other splitting rules can facilitate the coding of octree occupancy information and can further take into account characteristics of the data (e.g., the octree occupancy information) or the data acquisition mechanism. In one embodiment, an asymmetric bounding box flag can be further signaled to selectively enable or disable the use of an asymmetric bounding box. When the asymmetric bounding box flag is a value such as 1, the dimensions (i.e., sizes) of the asymmetric bounding box along x, y, and z can be signaled, as opposed to only the maximum of the three. In another embodiment, the asymmetric bounding box flag is not signaled and the asymmetric bounding box is always used. (5) When the QtBt enabled flag is set to a value such as 0, the fourth partitioning method can apply octree decomposition (or octree partitioning) at all levels.
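The two-level flag hierarchy of the fourth partitioning method (an enable flag, then a type flag) can be sketched as follows. The function name and the returned labels are hypothetical, and raising an error for a missing type flag is a choice made for this illustration only.

```python
from typing import Optional


def fourth_method_mode(qtbt_enabled_flag: int,
                       qtbt_type_flag: Optional[int] = None) -> str:
    """Map the two flags to a (hypothetical) partitioning-mode label."""
    if qtbt_enabled_flag == 0:
        # QtBt is disabled: octree decomposition at all levels.
        return "octree_all_levels"
    if qtbt_type_flag is None:
        # The type flag is expected whenever QtBt is enabled.
        raise ValueError("qtbt_type_flag must be signaled when qtbt_enabled_flag is 1")
    if qtbt_type_flag == 0:
        # Current implicit QtBt scheme (Tables I and II).
        return "implicit_qtbt"
    # A 3-bit split signal is sent at each level.
    return "explicit_3bit_per_level"
```

Because the type flag is only meaningful when QtBt is enabled, `fourth_method_mode(0)` is valid without a second argument, while `fourth_method_mode(1)` is rejected.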

Note that when the explicit QtBt enabled flag is set to a value such as 1 in the fourth partitioning method, Ot, Qt, and Bt partitions are permitted in any order along the way, so the maximum possible total number of splits can be three times the difference between the maximum node depth and the minimum node depth.

The above techniques can be implemented in a video encoder or decoder adapted for point cloud compression/decompression. The encoder/decoder can be implemented in hardware, software, or any combination thereof, and the software, if any, can be stored in one or more non-transitory computer-readable media. For example, each of the methods (or embodiments), encoders, and decoders may be implemented by processing circuitry (e.g., one or more processors or one or more integrated circuits). In one example, the one or more processors execute a program that is stored in a non-transitory computer-readable medium.

FIGS. 14 and 15 are flowcharts outlining a process (1400) and a process (1500) according to embodiments of the present disclosure. The processes (1400) and (1500) can be used during decoding of a point cloud. In various embodiments, the processes (1400) and (1500) can be executed by processing circuitry, such as the processing circuitry in the terminal device (110), the processing circuitry that performs the functions of the encoder (203) and/or the decoder (201), or the processing circuitry that performs the functions of the encoder (300), the decoder (400), the encoder (700), and/or the decoder (800). In some embodiments, the processes (1400) and (1500) can be implemented in software instructions, so that when the processing circuitry executes the software instructions, the processing circuitry performs the processes (1400) and (1500), respectively.

図14に示すように、処理（1400）は（S1401）から開始し、（S1410）に進む。 As shown in FIG. 14, the process (1400) starts from (S1401) and proceeds to (S1410).

（S1410）において、第1のシグナリング情報は、3次元（3D）空間内の点のセットを含む点群のための符号化されたビットストリームから受信することができる。第1のシグナリング情報は、点群の区分情報を示すことができる。 At (S1410), first signaling information may be received from an encoded bitstream for a point cloud that includes a set of points in three-dimensional (3D) space. The first signaling information may indicate segmentation information of the point cloud.

（S1420）において、第2のシグナリング情報は、第1の値を示す第1のシグナリング情報に基づいて決定することができる。第2のシグナリング情報は、3D空間内の点のセットの区分モードを示すことができる。 At (S1420), the second signaling information can be determined based on the first signaling information indicating the first value. The second signaling information may indicate the partitioning mode of the set of points in 3D space.

（S1430）において、3D空間内の点のセットの区分モードは、第2のシグナリング情報に基づいて決定することができる。次いで、処理（1400）は（S1440）に進むことができ、点群はその後、区分モードに基づいて再構築することができる。 At (S1430), a partitioning mode of the set of points in 3D space may be determined based on the second signaling information. The process (1400) can then proceed to (S1440) and the point cloud can then be reconstructed based on the segmentation mode.

いくつかの実施形態では、区分モードは、第2の値である第2のシグナリング情報に基づいて、所定の四分木および二分木（QtBt）区分であると決定することができる。 In some embodiments, the partitioning mode can be determined to be a predetermined quadtree and binary tree (QtBt) partition based on second signaling information that is a second value.

In the process (1400), third signaling information indicating that the 3D space is an asymmetric cuboid can be received. The dimensions of the 3D space signaled along the x, y, and z directions can be determined based on the third signaling information being the first value.

いくつかの実施形態では、第2のシグナリング情報が第1の値であることに基づいて、区分モードにおける複数の区分レベルの各々について3ビットのシグナリング情報を決定することができる。複数の区分レベルの各々の3ビットシグナリング情報は、区分モードにおけるそれぞれの区分レベルのx、y、およびz方向に沿った区分方向を示すことができる。 In some embodiments, 3-bit signaling information can be determined for each of the multiple partitioning levels in the partitioning mode based on the second signaling information being the first value. The 3-bit signaling information for each of the multiple partitioning levels can indicate partitioning directions along the x, y, and z directions for the respective partitioning level in partitioning mode.
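A 3-bit signal of this kind can be interpreted by counting how many axes are flagged: one axis corresponds to a binary-tree (Bt) split, two to a quadtree (Qt) split, and three to an octree (Ot) split. The sketch below assumes a bit layout (bit 2 for x, bit 1 for y, bit 0 for z) and treats the all-zero code as "no split"; neither convention is stated in the text, and the function name is hypothetical.

```python
from typing import Tuple


def classify_split(code: int) -> Tuple[str, Tuple[str, ...]]:
    """Return the partition type and the axes split by a 3-bit code."""
    if not 0 <= code <= 0b111:
        raise ValueError("split code must fit in 3 bits")
    # Assumed bit layout: bit 2 = x, bit 1 = y, bit 0 = z.
    axes = tuple(
        axis
        for axis, mask in (("x", 0b100), ("y", 0b010), ("z", 0b001))
        if code & mask
    )
    # The number of flagged axes selects the partition type.
    kind = {0: "none", 1: "Bt", 2: "Qt", 3: "Ot"}[len(axes)]
    return kind, axes
```

Under these assumptions, `classify_split(0b110)` describes a Qt split along x and y, and `classify_split(0b111)` describes a regular Ot split.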

In the process (1400), the partitioning mode can be determined based on the first signaling information being the second value, in which case the partitioning mode can include a respective octree partition at each of the plurality of partition levels in the partitioning mode.

図15に示すように、処理（1500）は（S1501）から開始し、（S1510）に進む。 As shown in FIG. 15, the process (1500) starts from (S1501) and proceeds to (S1510).

（S1510）において、第1のシグナリング情報は、3次元（3D）空間内の点のセットを含む点群のための符号化されたビットストリームから受信することができる。第1のシグナリング情報は、点群の区分情報を示すことができる。 At (S1510), first signaling information may be received from an encoded bitstream for a point cloud that includes a set of points in three-dimensional (3D) space. The first signaling information may indicate segmentation information of the point cloud.

（S1520）において、3D空間内の点のセットの区分モードは、第1のシグナリング情報に基づいて決定することができ、区分モードは、複数の区分レベルを含むことができる。 At (S1520), a partitioning mode of the set of points in 3D space may be determined based on the first signaling information, and the partitioning mode may include multiple partitioning levels.

（S1530）において、点群は、その後、区分モードに基づいて再構築され得る。 At (S1530), the point cloud may then be reconstructed based on the segmentation mode.

In some embodiments, 3-bit signaling information for each of the plurality of partition levels in the partitioning mode can be determined based on the first signaling information being the first value. The 3-bit signaling information for each of the plurality of partition levels can indicate the partition directions along the x, y, and z directions for the respective partition level in the partitioning mode.
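To illustrate how per-level 3-bit signaling shapes the decomposition, the sketch below tracks the node size (as log2 of the edge length per axis) and the number of child nodes produced at each level. The bit layout (bit 2 for x, bit 1 for y, bit 0 for z), the function name, and the error handling are assumptions made for the example.

```python
from typing import Iterable, List, Tuple


def apply_level_splits(
    log2_size: Tuple[int, int, int], codes: Iterable[int]
) -> List[Tuple[Tuple[int, int, int], int]]:
    """Return (log2 node size, children per node) after each partition level."""
    dx, dy, dz = log2_size
    trace = []
    for code in codes:
        # Assumed bit layout: bit 2 = x, bit 1 = y, bit 0 = z.
        sx, sy, sz = (code >> 2) & 1, (code >> 1) & 1, code & 1
        if (sx and dx == 0) or (sy and dy == 0) or (sz and dz == 0):
            raise ValueError("cannot split an axis that is already one unit long")
        # Splitting an axis halves the node along that axis.
        dx, dy, dz = dx - sx, dy - sy, dz - sz
        # Each split axis doubles the child count: 2 (Bt), 4 (Qt), or 8 (Ot).
        trace.append(((dx, dy, dz), 2 ** (sx + sy + sz)))
    return trace
```

For an 8 x 4 x 2 bounding box, `apply_level_splits((3, 2, 1), [0b100, 0b110, 0b111])` applies a Bt split along x, then a Qt split along x and y, then an Ot split, ending at unit-size nodes.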

いくつかの実施形態では、区分モードは、第1のシグナリング情報が第2の値であることに基づいて、区分モードにおける複数の区分レベルの各々にそれぞれの八分木区分を含むように決定することができる。 In some embodiments, the partitioning mode determines each of the plurality of partitioning levels in the partitioning mode to include a respective octree partition based on the first signaling information being the second value. be able to.

In the process (1500), second signaling information can be further received from the encoded bitstream for the point cloud. The second signaling information can indicate that the 3D space is an asymmetric cuboid when the second signaling information is the first value, and that the 3D space is a symmetric cuboid when the second signaling information is the second value.

In some embodiments, based on the first signaling information indicating the second value and the second signaling information indicating the first value, the partitioning mode can be determined to include a respective octree partition at each of the first partition levels of the plurality of partition levels of the partitioning mode. The partition type and the partition direction of the last partition level of the plurality of partition levels of the partitioning mode are shown in the following table

In the process (1500), second signaling information can be determined based on the first signaling information indicating the first value. The second signaling information can indicate that the 3D space is an asymmetric cuboid when the second signaling information indicates the first value, and that the 3D space is a symmetric cuboid when the second signaling information indicates the second value. Further, the dimensions of the 3D space signaled along the x, y, and z directions can be determined based on the second signaling information indicating the first value.

上述のとおり、上述した技術は、コンピュータ可読命令を使用し、1つまたは複数のコンピュータ可読媒体に物理的に記憶されたコンピュータソフトウェアとして実装することができる。例えば、図16は、開示された主題の特定の実施形態を実装するのに適したコンピュータシステム（1800）を示している。 As noted above, the techniques described above may be implemented as computer software using computer-readable instructions and physically stored on one or more computer-readable media. For example, FIG. 16 illustrates a computer system (1800) suitable for implementing certain embodiments of the disclosed subject matter.

The computer software can be coded using any suitable machine code or computer language that may be subject to assembly, compilation, linking, or like mechanisms to create code comprising instructions that can be executed directly, or through interpretation, micro-code execution, and the like, by one or more computer central processing units (CPUs), graphics processing units (GPUs), and the like.

命令は、例えば、パーソナルコンピュータ、タブレットコンピュータ、サーバ、スマートフォン、ゲーム装置、モノのインターネット装置などを含む様々な型式のコンピュータまたはその構成要素上で実行することができる。 The instructions can be executed on various types of computers or components thereof including, for example, personal computers, tablet computers, servers, smart phones, gaming devices, Internet of Things devices, and the like.

コンピュータシステム（1800）について図16に示される構成要素は、本質的に例示であり、本開示の実施形態を実装するコンピュータソフトウェアの使用または機能の範囲に関していかなる制限を示唆することを意図しない。成分の構成は、コンピュータシステム（1800）の例示的な実施形態に示されている成分のいずれかまたは組み合わせに関する依存関係または要件を有すると解釈されるべきではない。 The components shown in FIG. 16 for computer system (1800) are exemplary in nature and are not intended to suggest any limitation as to the scope of use or functionality of the computer software implementing embodiments of the present disclosure. No configuration of components should be interpreted as having any dependency or requirement relating to any or combination of components illustrated in the exemplary embodiment of computer system (1800).

The computer system (1800) may include certain human interface input devices. Such a human interface input device may be responsive to input by one or more human users through, for example, tactile input (such as keystrokes, swipes, data glove movements), audio input (such as voice, clapping), visual input (such as gestures), and olfactory input (not shown). The human interface devices can also be used to capture certain media not necessarily directly related to conscious input by a human, such as audio (such as speech, music, ambient sound), images (such as scanned images, photographic images obtained from a still image camera), and video (such as two-dimensional video, three-dimensional video including stereoscopic video).

Input human interface devices may include one or more of (only one of each depicted): a keyboard (1801), a mouse (1802), a trackpad (1803), a touch screen (1810), a data glove (not shown), a joystick (1805), a microphone (1806), a scanner (1807), and a camera (1808).

The computer system (1800) may also include certain human interface output devices. Such human interface output devices may stimulate the senses of one or more human users through, for example, tactile output, sound, light, and smell/taste. Such human interface output devices may include tactile output devices (for example, tactile feedback by the touch screen (1810), the data glove (not shown), or the joystick (1805), although there can also be tactile feedback devices that do not serve as input devices), audio output devices (such as speakers (1809) and headphones (not shown)), visual output devices (such as screens (1810), including CRT screens, LCD screens, plasma screens, and OLED screens, each with or without touch-screen input capability and each with or without tactile feedback capability, some of which may be capable of outputting two-dimensional visual output or more than three-dimensional output through means such as stereographic output; virtual-reality glasses (not shown); holographic displays; and smoke tanks (not shown)), and printers (not shown).

The computer system (1800) can also include human-accessible storage devices and their associated media, such as optical media (1821) including a CD/DVD ROM/RW drive (1820) with CD/DVD or similar media, a thumb drive (1822), a removable hard drive or solid-state drive (1823), legacy magnetic media such as tape and floppy disk (not shown), specialized ROM/ASIC/PLD-based devices such as security dongles (not shown), and the like.

当業者はまた、ここで開示される主題に関連して使用される「コンピュータ可読媒体」という用語は、送信媒体、搬送波、または他の一時的な信号を包含しないことを理解するべきである。 Those skilled in the art should also understand that the term "computer-readable medium" as used in connection with the subject matter disclosed herein does not encompass transmission media, carrier waves, or other transitory signals.

The computer system (1800) can also include an interface to one or more communication networks. Networks can be, for example, wireless, wireline, or optical. Networks can further be local, wide-area, metropolitan, vehicular and industrial, real-time, delay-tolerant, and so on. Examples of networks include local area networks such as Ethernet and wireless LANs; cellular networks including GSM, 3G, 4G, 5G, LTE, and the like; TV wireline or wireless wide-area digital networks including cable TV, satellite TV, and terrestrial broadcast TV; and vehicular and industrial networks including CANBus. Certain networks commonly require external network interface adapters attached to certain general-purpose data ports or peripheral buses (1849) (such as the USB ports of the computer system (1800)); others are commonly integrated into the core of the computer system (1800) by attachment to a system bus as described below (for example, an Ethernet interface into a PC computer system or a cellular network interface into a smartphone computer system). Using any of these networks, the computer system (1800) can communicate with other entities. Such communication can be uni-directional receive-only (for example, broadcast TV), uni-directional send-only (for example, CANbus to certain CANbus devices), or bi-directional, for example to other computer systems using local or wide-area digital networks. Certain protocols and protocol stacks can be used on each of those networks and network interfaces, as described above.

前述のヒューマンインターフェース装置、ヒューマンアクセスストレージ装置、およびネットワークインターフェースは、コンピュータシステム（1800）のコア（1840）に取り付けることができる。 The aforementioned human interface devices, human access storage devices, and network interfaces can be attached to the core (1840) of the computer system (1800).

The core (1840) can include one or more central processing units (CPUs) (1841), graphics processing units (GPUs) (1842), specialized programmable processing units in the form of field-programmable gate arrays (FPGAs) (1843), hardware accelerators for certain tasks (1844), and so forth. These devices, along with read-only memory (ROM) (1845), random-access memory (RAM) (1846), and internal mass storage such as internal non-user-accessible hard drives, SSDs, and the like (1847), may be connected through a system bus (1848). In some computer systems, the system bus (1848) can be accessible in the form of one or more physical plugs to enable extension by additional CPUs, GPUs, and the like. Peripheral devices can be attached either directly to the core's system bus (1848) or through a peripheral bus (1849). Architectures for a peripheral bus include PCI, USB, and the like.

The CPUs (1841), GPUs (1842), FPGAs (1843), and accelerators (1844) can execute certain instructions that, in combination, can make up the aforementioned computer code. That computer code can be stored in the ROM (1845) or the RAM (1846). Transitional data can also be stored in the RAM (1846), whereas permanent data can be stored, for example, in the internal mass storage (1847). Fast storage to and retrieval from any of the memory devices can be enabled through the use of cache memory, which can be closely associated with one or more CPUs (1841), GPUs (1842), mass storage (1847), ROM (1845), RAM (1846), and the like.

コンピュータ可読媒体は、様々なコンピュータ実装動作を実行するためのコンピュータコードを有することができる。媒体およびコンピュータコードは、本開示の目的のために特別に設計および構築されたものであってもよく、またはコンピュータソフトウェア技術の当業者に周知で利用可能な型式のものであってもよい。 The computer-readable medium can have computer code for performing various computer-implemented operations. The media and computer code may be those specially designed and constructed for the purposes of the present disclosure, or they may be of the types well known and available to those having skill in the computer software arts.

By way of example and not limitation, the computer system (1800) having the architecture described above, and specifically the core (1840), can provide functionality as a result of one or more processors (including CPUs, GPUs, FPGAs, accelerators, and the like) executing software embodied in one or more tangible, computer-readable media. Such computer-readable media can be media associated with the user-accessible mass storage introduced above, as well as certain storage of the core (1840) that is of a non-transitory nature, such as the core-internal mass storage (1847) or the ROM (1845). The software implementing various embodiments of the present disclosure can be stored in such devices and executed by the core (1840). A computer-readable medium can include one or more memory devices or chips, according to particular needs. The software can cause the core (1840), and specifically the processors therein (including CPUs, GPUs, FPGAs, and the like), to execute particular processes or particular parts of particular processes described herein, including defining data structures stored in the RAM (1846) and modifying such data structures according to the processes defined by the software. In addition or as an alternative, the computer system can provide functionality as a result of logic hardwired or otherwise embodied in a circuit (for example, the accelerator (1844)), which can operate in place of or together with software to execute particular processes or particular parts of particular processes described herein. Reference to software can encompass logic, and vice versa, where appropriate. Reference to a computer-readable medium can encompass a circuit (such as an integrated circuit (IC)) storing software for execution, a circuit embodying logic for execution, or both, where appropriate. The present disclosure encompasses any suitable combination of hardware and software.

While this disclosure has described several exemplary embodiments, there are alterations, permutations, and various substitute equivalents that fall within the scope of the disclosure. It will thus be appreciated that those skilled in the art will be able to devise numerous systems and methods which, although not explicitly shown or described herein, embody the principles of the disclosure and are thus within its spirit and scope.

100 communication system
105 network, sensor
110 terminal device
120 terminal device
150 network
200 streaming system
201 point cloud source, decoder
202 point cloud
203 Encoder
204 point cloud
205 Streaming Server
206 Client Subsystem
207 point cloud
209 point cloud
210 Decoder
211 point cloud
212 rendering device
213 capture subsystem
220 electronic device
230 electronic device
300 encoder
304 patch information module
306 patch generation module
308 patch packing module
310 Geometry Image Generation Module
312 Texture Image Generation Module
314 Occupancy Map Module
316 Image Padding Module
320 group expansion module
322 video compression module
323 video compression module
324 multiplexer
332 video compression module
334 Entropy Compression Module
336 Smoothing Module
338 Auxiliary Patch Information Compression Module
400 Decoder
432 Demultiplexer
434 video decompression module
436 video decompression module
438 Occupancy map decompression module
442 Auxiliary patch information decompression module
444 Geometry Reconstruction Module
446 Smoothing Module
448 Texture Reconstruction Module
452 color smoothing module
510 video decoder
520 parser
521 symbol
551 Inverse Transform Unit
552 intra picture prediction unit, intra prediction unit
553 Motion Compensated Prediction Unit
555 Aggregator
556 loop filter unit
557 Reference Picture Memory
558 image buffer
603 video encoder, encoder
630 Source Encoder
632 encoding engine
633 decoder, local video decoder, local decoder
634 reference picture memory, reference picture cache
635 predictor
643 compressed image
645 Entropy Encoder
650 controller
700 encoder
701 point cloud, input point cloud
702 bitstream, compressed bitstream
710 position quantization module, quantization module
712 duplicated points removal module
720 attribute transfer module
730 octree encoding module
740 LOD generation module
750 prediction module
760 residual quantization module
770 Arithmetic Coding Module
800 Decoder
801 compressed bitstream
802 point cloud
810 Arithmetic Decoding Module
820 Inverse Quantization Module
830 octree decoding module
840 LOD generation module
850 Inverse Quantization Module
860 prediction module
900 3D cube
1010 octree partition
1020 octree structure
1201 3D Cube
1202 3D Cube
1203 3D Cube
1301 3D Cube
1302 3D Cube
1303 3D Cube
1400 process
1500 process
1800 computer system
1801 keyboard
1802 mouse
1803 trackpad
1805 joystick
1806 microphone
1807 Scanner
1808 camera
1809 speaker
1810 screen, touch screen
1821 optical media
1822 thumb drive
1823 solid state drive
1840 core
1841 central processing unit
1842 GPU
1843 FPGA
1844 accelerator, hardware accelerator
1845 ROM
1846 RAM
1847 mass storage
1848 system bus
1849 Peripheral Bus

Claims

1. A method of point cloud geometry decoding in a point cloud decoder, comprising:
receiving, by a processing circuit, first signaling information from an encoded bitstream of a point cloud that includes a set of points in a three-dimensional (3D) space, the first signaling information indicating partitioning information of the point cloud;
determining, by the processing circuit, second signaling information from the encoded bitstream of the point cloud based on the first signaling information indicating a first value, the second signaling information indicating a partitioning mode of the set of points in the 3D space;
determining, by the processing circuit, the partitioning mode of the set of points in the 3D space based on the second signaling information; and
reconstructing, by the processing circuit, the point cloud based on the partitioning mode,
wherein determining the partitioning mode comprises:
receiving, based on the second signaling information indicating the first value, 3-bit signaling information for each of a plurality of partitioning levels in the partitioning mode, the 3-bit signaling information for each of the plurality of partitioning levels indicating partitioning directions along the x, y, and z directions for the respective partitioning level in the partitioning mode.

2. The method of claim 1, wherein determining the partitioning mode further comprises determining, based on the second signaling information indicating a second value, the partitioning mode to be a predefined quad-tree and binary-tree (QtBt) partition.

3. The method of claim 1 or 2, further comprising:
receiving third signaling information indicating that the 3D space is an asymmetric cuboid; and
determining, based on the third signaling information indicating the first value, dimensions of the 3D space that are signaled along the x, y, and z directions.

4. The method of claim 3, wherein the 3-bit signaling information is determined based on the dimensions of the 3D space.

5. The method of any one of claims 1 to 4, further comprising determining the partitioning mode based on the first signaling information indicating a second value, the partitioning mode including a respective octree partition at each of a plurality of partitioning levels in the partitioning mode.

6. A method of point cloud geometry decoding in a point cloud decoder, comprising:
receiving, by a processing circuit, first signaling information from an encoded bitstream of a point cloud that includes a set of points in a three-dimensional (3D) space, the first signaling information indicating partitioning information of the point cloud;
determining, by the processing circuit, a partitioning mode of the set of points in the 3D space based on the first signaling information, the partitioning mode including a plurality of partitioning levels; and
reconstructing, by the processing circuit, the point cloud based on the partitioning mode,
wherein determining the partitioning mode comprises:
receiving, based on the first signaling information indicating a first value, 3-bit signaling information for each of the plurality of partitioning levels in the partitioning mode, the 3-bit signaling information for each of the plurality of partitioning levels indicating partitioning directions along the x, y, and z directions for the respective partitioning level in the partitioning mode.

7. The method of claim 6, wherein the 3-bit signaling information is determined based on the dimensions of the 3D space.

8. The method of claim 6 or 7, wherein determining the partitioning mode further comprises determining, based on the first signaling information indicating a second value, the partitioning mode to include a respective octree partition at each of the plurality of partitioning levels in the partitioning mode.

9. The method of any one of claims 6 to 8, further comprising receiving second signaling information from the encoded bitstream of the point cloud, wherein the second signaling information indicates that the 3D space is an asymmetric cuboid when the second signaling information is a first value, and indicates that the 3D space is a symmetric cuboid when the second signaling information is a second value.

10. The method of claim 9, wherein determining the partitioning mode further comprises:
determining, based on the first signaling information indicating the second value and the second signaling information indicating the first value, the partitioning mode to include a respective octree partition at each of first partitioning levels of the plurality of partitioning levels of the partitioning mode; and
determining a partition type and a partition direction at a last partitioning level of the plurality of partitioning levels of the partitioning mode according to the following condition [condition not reproduced in this text], where d_x, d_y, and d_z are the log2 sizes of the 3D space in the x, y, and z directions, respectively.

11. The method of any one of claims 6 to 10, further comprising:
determining second signaling information based on the first signaling information indicating a first value, wherein the second signaling information indicates that the 3D space is an asymmetric cuboid when the second signaling information indicates the first value, and indicates that the 3D space is a symmetric cuboid when the second signaling information indicates a second value; and
determining, based on the second signaling information indicating the first value, dimensions of the 3D space that are signaled along the x, y, and z directions.

12. An apparatus configured to carry out the method according to any one of claims 1 to 5.

13. An apparatus configured to carry out the method according to any one of claims 6 to 10.

14. A program for causing a computer to execute the method according to any one of claims 1 to 5.

15. A program for causing a computer to execute the method according to any one of claims 6 to 11.
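
The signaling flow recited in the first group of method claims can be illustrated with a minimal Python sketch. It is not the claimed implementation. Several details are assumptions made only for illustration: the first and second signaling information are modeled as one-bit flags, the "first value" is taken to be 1 and the "second value" 0, the number of partitioning levels is assumed to be known to the decoder, and the 3-bit information is read with x as the most significant bit. The names `read_bits`, `partition_type`, and `parse_partition_mode` are hypothetical.

```python
from typing import Iterator, List, Tuple

# Assumed encodings of the claimed "first value" and "second value".
FIRST_VALUE, SECOND_VALUE = 1, 0
OCTREE_MASK = 0b111  # split along x, y, and z


def read_bits(bits: Iterator[int], n: int) -> int:
    """Read n bits, most significant bit first, from an iterator of 0/1 ints."""
    value = 0
    for _ in range(n):
        value = (value << 1) | next(bits)
    return value


def partition_type(mask: int) -> str:
    """Name the split implied by a 3-bit x/y/z direction mask (x = MSB, assumed)."""
    axes = "".join(a for a, bit in zip("xyz", (4, 2, 1)) if mask & bit)
    if not axes:
        return "no split"
    names = {1: "binary-tree", 2: "quad-tree", 3: "octree"}
    return f"{names[len(axes)]} along {axes}"


def parse_partition_mode(bits: Iterator[int], num_levels: int) -> Tuple[str, List[int]]:
    """Return (mode name, per-level 3-bit masks) following the claimed flow."""
    first = read_bits(bits, 1)
    if first == SECOND_VALUE:
        # Octree partition at every partitioning level.
        return "octree", [OCTREE_MASK] * num_levels
    second = read_bits(bits, 1)
    if second == SECOND_VALUE:
        # Predefined QtBt partition: directions follow a predefined rule
        # that is not modeled in this sketch.
        return "predefined QtBt", []
    # Explicit signaling: one 3-bit direction mask per partitioning level.
    return "explicit QtBt", [read_bits(bits, 3) for _ in range(num_levels)]


if __name__ == "__main__":
    stream = iter([1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0])
    mode, masks = parse_partition_mode(stream, num_levels=3)
    print(mode, [partition_type(m) for m in masks])
```

In this sketch a mask with three set bits corresponds to an octree split, two set bits to a quad-tree split, and one set bit to a binary-tree split, which is one way to read "partitioning directions along the x, y, and z directions" for a level.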