JP4515378B2

JP4515378B2 - Image decoding apparatus, program, and computer-readable recording medium

Info

Publication number: JP4515378B2
Application number: JP2005329270A
Authority: JP
Inventors: 喜秀外村; 孝之仲地
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2005-11-14
Filing date: 2005-11-14
Publication date: 2010-07-28
Anticipated expiration: 2025-11-14
Also published as: JP2007142498A

Description

The present invention relates to an image decoding apparatus, a program, and a computer-readable recording medium. In particular, it relates to an image decoding apparatus, a program, and a computer-readable recording medium for restoring a series of images (still or moving) that have been divided and encoded separately, both in multi-view image coding, that is, in systems where the images must be encoded independently of one another, and in network communication systems subject to loss, such as wireless links and the Internet.

In recent years, analog signal systems have been giving way to digital signal systems, and the demand for digital images has increased. Because an uncompressed digital image contains an enormous amount of data, coding techniques that compress images efficiently have become important, and international standard algorithms are widely used.

As an example, MPEG-2 (“Information technology - Generic coding of moving pictures and associated audio information: Video,” ISO/IEC 13818-2, May 1996) is used for digital TV, and it has been decided that H.264 (“Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification (ITU-T Rec. H.264 - ISO/IEC 14496-10 AVC),” Joint Video Team of ISO/IEC MPEG & ITU-T, 2003) will be used for the next-generation DVD standard.

In these image compression algorithms, the encoder first performs motion compensation (MC) to remove temporal redundancy from the signal, then applies a frequency transform such as the discrete cosine transform (DCT) to remove spatial redundancy, and finally performs entropy coding. The decoder decodes by performing these steps in reverse.

However, although recently standardized video compression algorithms achieve higher compression efficiency, their computational cost has grown enormously. For example, H.264 is known to require roughly 5 to 10 times the computation of MPEG-2. This computation is needed for complex inter-frame prediction, optimization of variable macroblock sizes, and the like, and it falls on the encoder side.

In contrast, applications such as sensor cameras and video processing on mobile phones may call for light processing on the encoder side, even at the cost of more computation on the decoder side. Distributed Source Coding (DSC) is one system that meets this demand by shifting the heavy computation from the encoder to the decoder.

As shown in FIG. 7, DSC is a system in which a plurality of correlated information sources are encoded in a distributed manner, without observing one another, and the receiver jointly decodes the separately encoded data. DSC is based on the Slepian-Wolf theorem, established by Slepian and Wolf in the 1970s (D. Slepian and J. K. Wolf, “Noiseless coding of correlated information sources,” IEEE Trans. Inform. Theory, vol. 19, no. 4, pp. 471-480, 1973), and on the Wyner-Ziv theorem, established by Wyner and Ziv (A. Wyner and J. Ziv, “The rate-distortion function for source coding with side information at the decoder,” IEEE Trans. Inform. Theory, vol. 22, no. 1, pp. 1-10, 1976), and it extends those theorems to practical systems. The Slepian-Wolf theorem gives the admissible compression-rate region within which two information sources, encoded separately without observing each other, can be decoded without distortion. The Wyner-Ziv theorem gives the rate-distortion region for the case where distortion arises in one of the two sources. These theorems prove that, within the conditions of the Slepian-Wolf and Wyner-Ziv theorems, the compression limit for separate encoding without mutual observation equals that for encoding with mutual observation. In recent years, research has examined how closely turbo codes can approach the limits given by these theorems (see, for example, Non-Patent Documents 1 and 2).
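
For reference, the Slepian-Wolf region described above is usually written as follows for two correlated sources X and Y encoded separately at rates R_X and R_Y. The symbols are not in the original text; they follow standard textbook notation, with H denoting entropy.

```latex
R_X \ge H(X \mid Y), \qquad
R_Y \ge H(Y \mid X), \qquad
R_X + R_Y \ge H(X, Y)
```

The sum-rate bound H(X, Y) is the same limit that a joint encoder observing both sources would face, which is the equality the paragraph above refers to.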

Various practical DSC constructions have been proposed. For example, as shown in FIG. 8, a method has been proposed that uses turbo codes to encode and decode the Wyner-Ziv frames. This method reduces the processing on the encoder side and improves coding efficiency by creating auxiliary information (side information) from the Key frames (see, for example, Non-Patent Document 3).

However, because the auxiliary information is created by first decoding the Key frame completely and then rebuilding the auxiliary information from it, the amount of processing is large. In addition, when suitable auxiliary information cannot be created, parity bits must be retransmitted, so the method cannot be used in large-scale network distribution systems or in systems that require real-time operation.
Non-Patent Document 1: J. Bajcsy and P. Mitran, “Coding for the Slepian-Wolf problem with turbo codes,” in Proc. IEEE Global Communications Conf., vol. 2, 2001, pp. 1400-1404
Non-Patent Document 2: A. Aaron and B. Girod, “Compression with side information using turbo codes,” in Proc. IEEE Data Compression Conf., 2002, pp. 252-261
Non-Patent Document 3: B. Girod, A. Aaron, S. Rane and D. Rebollo-Monedero, “Distributed video coding,” Proceedings of the IEEE, Special Issue on Video Coding and Delivery, vol. 93, no. 1, pp. 71-83, January 2005

As described above, DSC is expected to be effective as a coding scheme for multi-view video coding and for best-effort communication systems such as IP networks. However, whether a turbo encoder or MPEG-2 is used, creating the auxiliary information requires decoding and partial re-encoding, which increases the amount of computation.

The present invention has been made in view of the above points. Its object is to provide an image decoding apparatus, a program, and a computer-readable recording medium that build a DSC system by creating the auxiliary information with the scalability function, which has attracted attention in recent years, and thereby provide high-quality moving images.

FIG. 1 is a diagram of the principle configuration of the present invention.

The present invention (claim 1) is an image decoding apparatus that decodes a plurality of separately encoded image data sequences, comprising:
first decoding means that splits data sequence A, one of the plurality of data sequences, into two paths, decodes and outputs the first path, and decodes the low-frequency component of the data sequence on the second path as first auxiliary information and its high-frequency component as second auxiliary information;
auxiliary information creating means that performs motion compensation on the first and second auxiliary information and outputs the results as third and fourth auxiliary information, respectively;
second decoding means that decodes data sequence B, a data sequence among the plurality of data sequences other than data sequence A, using the third auxiliary information; and
reconstruction means that judges, using the probability distribution or a correlation value of the signal, whether the fourth auxiliary information is plausible and, if it is judged plausible, adds it to the image decoded by the second decoding means and outputs the result.

The present invention (claim 2) is an image decoding apparatus that decodes a plurality of separately encoded image data sequences, comprising:
first decoding means that splits data sequence A, one of the plurality of data sequences, into two paths, decodes and outputs the first path, and decodes the low-frequency component of the data sequence on the second path as first auxiliary information and its high-frequency component as second auxiliary information;
auxiliary information creating means that performs motion compensation on the first and second auxiliary information and outputs the results as third and fourth auxiliary information, respectively;
high-frequency component estimating means that makes up for the missing high-frequency components of the fourth auxiliary information output from the auxiliary information creating means and outputs the result as fifth auxiliary information;
second decoding means that decodes data sequence B, a data sequence among the plurality of data sequences other than data sequence A, using the third auxiliary information; and
reconstruction means that judges, using the probability distribution or a correlation value of the signal, whether the fifth auxiliary information is plausible and, if it is judged plausible, adds it to the image decoded by the second decoding means and outputs the result.

The present invention (claim 3) is the image decoding apparatus according to claim 1 or 2, wherein data sequence A is data encoded with the international standard JPEG2000.

The present invention (claim 4) is an image decoding program that causes a computer to function as the image decoding apparatus according to any one of claims 1 to 3.

The present invention (claim 5) is a computer-readable recording medium storing a program that causes a computer to function as the image decoding apparatus according to any one of claims 1 to 3.

As described above, the present invention makes it possible to construct a DSC that skillfully exploits the JPEG2000 international standard algorithm. Compared with ordinary JPEG2000, in which the decoder performs no prediction, coding efficiency is higher and improved image quality can be expected.

The structure of this system has the drawback that the auxiliary image lacks high-frequency components. To compensate, a temporal-resolution expansion process is introduced and the high-frequency components are estimated by nonlinear processing, which makes it possible to extend the temporal resolution of a moving image while retaining high-frequency components. By extending the temporal resolution with the multi-component transform mechanism that can be implemented under JPEG2000 Part 2, and by using the scalability function, an efficient DSC can be built on existing systems.

Embodiments of the present invention are described below with reference to the drawings.

FIG. 2 shows an example of a system configuration that can be realized by the present invention.

The figure shows the principle by which the Wyner-Ziv frames and the Key frames are created.

On the encoder side, the Key frames are compressed by compression device 3 into an embedded-coded code stream, for example with a compression algorithm defined in an international standard such as JPEG2000. On the decoder side, the Key frames are decompressed by decompression device 4 using the same algorithm.

In compression device 1 for the Wyner-Ziv frames on the encoder side, channel coding such as an LDPC code or a turbo code is applied, and only the resulting redundant information is transmitted. The compression ratio is therefore determined by how much redundant information is added. In decompression device 2 for the Wyner-Ziv frames on the decoder side, the frames are decompressed with the sum-product algorithm or the like, using auxiliary information a, which auxiliary information creation device 5 (described later) creates from the Key frames. Auxiliary information a is created by motion prediction and filtering from auxiliary information b, which is extracted from the embedded-coded Key frames without any computation. For the Wyner-Ziv frames, reconstruction device 6 then reconstructs the image decompressed with auxiliary information a, using auxiliary information c, which consists of components mutually exclusive with those of auxiliary information a. In this way, the present invention makes it possible to create high-quality images.

[First Embodiment]
FIG. 3 is a basic configuration diagram of the DSC system according to the first embodiment of the present invention.

The figure shows a system in which the encoder side encodes a plurality of data sequences (Key frames and Wyner-Ziv frames), and the decoder side receives the encoded data sequences and decodes the Wyner-Ziv frames using the Key frames.

The system shown in the figure comprises compression devices 1 and 3, decompression devices 7 and 8, and auxiliary pixel creation device 9.

Compression device 1 has an encoding unit 10 for the Wyner-Ziv frames, and compression device 3 has an encoding unit 11 for the Key frames. Decompression device 7 has a decoding unit 12 for the Wyner-Ziv frames, decompression device 8 has a decoding unit 13 for the Key frames and an extraction unit 14, and auxiliary pixel creation device 9 has a partial decoding unit 15 and a motion compensation unit 16.

Each component is described below.

First, the information source is divided into Key frames and Wyner-Ziv frames. The two are compressed separately: the Wyner-Ziv frames by encoding unit 10 in compression device 1, and the Key frames by encoding unit 11 in compression device 3.

Encoding unit 10 for the Wyner-Ziv frames uses an error-correcting code to generate parity bits, and only the parity bits are sent to decompression device 7. This is equivalent to using the error-correcting code to compress the message data k into the redundant data m, so the closer the code rate R is to 1, the higher the compression ratio. Encoding unit 11 for the Key frames performs embedded coding such as JPEG2000 and sends the result to decompression device 8.
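
The relation between the code rate and the compression ratio can be illustrated with a little arithmetic. The sketch below assumes a systematic code in which k is the number of message bits, m is the number of parity bits, and R = k / (k + m); the text does not define R explicitly, so this definition and the function names are assumptions made for illustration.

```python
def code_rate(k: int, m: int) -> float:
    """Rate R = k / (k + m) of a systematic code (assumed definition)."""
    return k / (k + m)


def compression_ratio(k: int, m: int) -> float:
    """Transmitted bits per message bit when only the m parity bits are sent."""
    return m / k


def parity_bits_for_rate(k: int, rate: float) -> int:
    """Number of parity bits m implied by k message bits at code rate R."""
    if not 0.0 < rate <= 1.0:
        raise ValueError("rate must be in (0, 1]")
    return round(k * (1.0 - rate) / rate)


if __name__ == "__main__":
    k = 1000
    for rate in (0.5, 0.8, 0.9):
        m = parity_bits_for_rate(k, rate)
        print(f"R={rate}: send {m} parity bits for {k} message bits "
              f"(ratio {compression_ratio(k, m):.3f})")
```

Under this definition, fewer parity bits are sent as R approaches 1, which matches the statement that the compression ratio rises. The price is weaker error correction, so the auxiliary information at the decoder has to be more accurate.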

In decompression device 8 for the Key frames, the encoded Key frame is split into two paths. On one path, decoding unit 13 decodes the data and outputs the decoded Key frame image. The other path goes to extraction unit 14, which extracts part of the data sequence encoded by encoding unit 11. The extracted data is input to partial decoding unit 15 of auxiliary pixel creation device 9, where the low-frequency components are decoded and sent to motion compensation unit 16. Motion compensation unit 16 performs motion compensation on the received low-frequency components by linear filtering. Here, motion compensation uses the low-frequency components of Key frame (t), Key frame (t-1), and so on, to create the auxiliary information. The auxiliary information is sent to decoding unit 12 of decompression device 7 for the Wyner-Ziv frames. Although motion compensation unit 16 is described here as using a linear filter, the invention is not limited to this example, and any other technique capable of motion compensation may be used.
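
As a rough illustration of creating auxiliary information by linear filtering, the sketch below blends the low-frequency components of two neighbouring Key frames pixel by pixel. This is the simplest possible temporal linear filter and uses no motion vectors. The function name, the `weight` parameter, and the representation of frames as nested lists are assumptions for illustration, not details taken from the patent.

```python
from typing import List

Frame = List[List[float]]  # rows of low-frequency samples (assumed layout)


def interpolate_side_information(prev_low: Frame, next_low: Frame,
                                 weight: float = 0.5) -> Frame:
    """Blend two Key-frame low bands into an estimate of the frame between them.

    weight = 0.5 gives the plain average; other values lean toward
    one of the two Key frames.
    """
    if len(prev_low) != len(next_low) or any(
            len(a) != len(b) for a, b in zip(prev_low, next_low)):
        raise ValueError("frames must have the same shape")
    return [
        [(1.0 - weight) * a + weight * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(prev_low, next_low)
    ]


if __name__ == "__main__":
    key_prev = [[0.0, 2.0], [4.0, 8.0]]   # low band of Key frame (t-1)
    key_next = [[4.0, 6.0], [4.0, 0.0]]   # low band of Key frame (t)
    print(interpolate_side_information(key_prev, key_next))
```

A real system would typically estimate motion before blending. The plain average is used here only because it keeps the example short.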

Decoding unit 12 performs decoding using the auxiliary information output from motion compensation unit 16 and the parity bits generated by encoding unit 10, and outputs the Wyner-Ziv frame.

Through the above operations, the Wyner-Ziv frames obtain the same effect as inter-frame decoding. Improved compression efficiency can be expected and, because the scheme is built on an error-correcting code, so can improved robustness.

[Second Embodiment]
FIG. 4 is a basic configuration diagram of the DSC system according to the second embodiment of the present invention.

The figure shows a system in which the encoder side encodes a plurality of data sequences (Key frames and Wyner-Ziv frames), and on the decoder side the data sequences are received, the Wyner-Ziv frames are decoded from the Key frames by way of auxiliary pixel creation device 105, and auxiliary information is created in reconstruction device 106.

The system shown in the figure comprises compression devices 101 and 102, decompression devices 103 and 104, auxiliary pixel creation device 105, and reconstruction device 106.

Compression device 101 has an encoding unit 20 for the Wyner-Ziv frames, and compression device 102 has an encoding unit 21 for the Key frames. Decompression device 103 has a decoding unit 22 for the Wyner-Ziv frames, and decompression device 104 has a decoding unit 23 for the Key frames and an extraction unit 24. Auxiliary pixel creation device 105 has a partial decoding unit 25 and a motion compensation unit 26. Reconstruction device 106 has an auxiliary information determination unit 27 and a synthesis unit 28.

This embodiment differs from the first embodiment in that it includes reconstruction device 106. In the first embodiment, extracting the low-frequency components from the Key frames reduced the amount of computation and raised coding efficiency, but it could also produce Wyner-Ziv frames lacking high-frequency components. In this embodiment, therefore, the low-frequency components are decoded by decoding unit 22 of decompression device 103 for the Wyner-Ziv frames, and reconstruction device 106 is introduced to supply the missing high-frequency components.

In the first embodiment, partial decoding unit 15 of auxiliary pixel creation device 9 sent only the low-frequency components to motion compensation unit 16. In this embodiment, partial decoding unit 25 of auxiliary pixel creation device 105 can also send the high-frequency components to motion compensation unit 26. For the low-frequency components, motion compensation unit 26 performs motion compensation as in the first embodiment and sends the auxiliary information to decoding unit 22. Motion compensation of the high-frequency components, on the other hand, is harder to make accurate than that of the low-frequency components, and this leads to lower coding efficiency. For this reason, after motion compensation unit 26 has performed motion compensation, auxiliary information determination unit 27 of reconstruction device 106 judges whether the resulting auxiliary information is plausible. The judgment uses the probability distribution of the signal, a correlation value, or the like. If the auxiliary information is judged plausible, synthesis unit 28 of reconstruction device 106 adds it to the signal decoded by decoding unit 22, and the Wyner-Ziv frame is output.
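
The plausibility test and the addition step can be sketched as follows. The text says only that a probability distribution or a correlation value is used, so everything specific here is an assumption: the sketch compares the motion-compensated high band against a second, hypothetical reference estimate (`reference_high`) with a Pearson correlation, and the 0.5 threshold is arbitrary.

```python
import math
from typing import List, Sequence


def normalized_correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length signals (0.0 if either is flat)."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("signals must be non-empty and of equal length")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def reconstruct(decoded_low: Sequence[float],
                high_candidate: Sequence[float],
                reference_high: Sequence[float],
                threshold: float = 0.5) -> List[float]:
    """Add the high band to the decoded signal only if it looks plausible."""
    if normalized_correlation(high_candidate, reference_high) >= threshold:
        return [low + high for low, high in zip(decoded_low, high_candidate)]
    # Implausible high band: fall back to the low-band reconstruction alone.
    return list(decoded_low)


if __name__ == "__main__":
    low = [10.0, 10.0, 10.0]
    high = [1.0, -1.0, 2.0]
    print(reconstruct(low, high, [2.0, -2.0, 4.0]))    # consistent reference
    print(reconstruct(low, high, [-1.0, 1.0, -2.0]))   # contradictory reference
```

Rejecting an implausible high band leaves a smoother image but avoids adding mispredicted detail, which is the trade-off the paragraph above describes.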

As an example of the above, FIG. 5 shows a DSC configuration that uses an LDPC code and JPEG2000.

First, on the encoding side, the Key frames (odd frames) are encoded with JPEG2000. The W-Z frames (even frames), by contrast, undergo Mallat decomposition by the discrete wavelet transform (DWT) and are then post-quantized so that only the low-frequency components of the image remain. The quantized DWT coefficients are Gray-coded (a mapping under which adjacent decimal values also have a Hamming distance of 1 in binary: Gray, F., “Pulse Code Communication,” United States Patent Number 2632058, March 17, 1953) and encoded by a low-density parity-check (LDPC) encoder (R. G. Gallager, “Low density parity check codes,” Research Monograph series, Cambridge, MIT Press, 1963). Only the syndrome bits produced by the LDPC code are transmitted for the W-Z frames.
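
The Gray code property mentioned above is easy to check in code. The sketch below uses the standard binary-reflected Gray code and assumes the quantized coefficients have already been mapped to non-negative integer indices. That mapping and the function names are assumptions, since the text does not say how signed coefficients are handled.

```python
def to_gray(n: int) -> int:
    """Binary-reflected Gray code of a non-negative integer."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n ^ (n >> 1)


def from_gray(g: int) -> int:
    """Inverse of to_gray: XOR together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    for value in range(8):
        print(value, format(value, "03b"), format(to_gray(value), "03b"))
    # Plain binary 3 -> 4 flips three bits (011 -> 100); Gray code flips one.
    print(hamming_distance(3, 4), hamming_distance(to_gray(3), to_gray(4)))
```

Because a prediction that is off by one quantization level then differs from the true value in a single bit, the side information presents fewer bit errors for the LDPC decoder to correct.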

In the configuration of FIG. 5, the wavelet transform in the wavelet transformer (DWT) and the quantization in the quantizer (Q) leave only the low-frequency components, and the Gray code encoder applies Gray coding. Together these improve the efficiency of motion prediction on the decoder side.
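
To make the low-frequency extraction concrete, the sketch below performs one level of a one-dimensional Haar wavelet split and then reconstructs from the low band alone. The Haar filter is a stand-in chosen for brevity: JPEG2000 Part 1 uses the 5/3 and 9/7 wavelet filters, and the patent does not say which filter the W-Z path uses. The function names are illustrative.

```python
from typing import List, Sequence, Tuple


def haar_split(signal: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One-level Haar analysis: pairwise averages (low) and half-differences (high)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    low = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return low, high


def haar_merge(low: Sequence[float], high: Sequence[float]) -> List[float]:
    """Inverse of haar_split."""
    out: List[float] = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out


if __name__ == "__main__":
    row = [4.0, 2.0, 6.0, 6.0]
    low, high = haar_split(row)
    print(low, high)
    # Keeping only the low band yields a smoothed approximation of the row.
    print(haar_merge(low, [0.0] * len(high)))
```

Discarding the high band halves the number of coefficients per level, which is why the low band is both cheaper to protect with parity bits and easier to predict from neighbouring frames.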

In decoding, the Key frames (odd frames) are decoded by the JPEG2000 decoder and then used to create the auxiliary information (side information) needed to decode the W-Z frames (even frames). The auxiliary information is created by taking a prediction signal, generated from the decoded Key frames by linear prediction, and applying to it the same wavelet transform (DWT), quantization (Q), and Gray code conversion (Gray code converter) as in W-Z frame encoding. It is then sent to the LDPC decoder.

The LDPC decoder uses the received syndrome bits to correct errors in the auxiliary information by sum-product decoding. This amounts to the decoder correcting the signal wherever the predicted image was wrong. The signal decoded by sum-product decoding undergoes inverse Gray code conversion, inverse quantization, and the inverse wavelet transform, and is then reconstructed together with the intermediate signal created from the Key frames, which completes the decoding of the W-Z frame.
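
The idea of correcting side information with syndrome bits can be shown with a toy code. The sketch below deliberately swaps in a (7,4) Hamming parity-check matrix and a direct single-error lookup in place of the LDPC code and sum-product decoding described above, because those do not fit in a short example. All names are illustrative.

```python
from typing import List, Sequence

# Parity-check matrix of the (7,4) Hamming code: column j (1-based) is the
# binary representation of j, least significant bit in the first row.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]


def syndrome(bits: Sequence[int]) -> List[int]:
    """Syndrome s = H * bits (mod 2); this is all the encoder transmits."""
    return [sum(h * b for h, b in zip(row, bits)) % 2 for row in H]


def decode_with_side_information(received_syndrome: Sequence[int],
                                 side_info: Sequence[int]) -> List[int]:
    """Recover the source block from its syndrome and a noisy prediction.

    Works when side_info differs from the source in at most one bit.
    """
    diff = [a ^ b for a, b in zip(received_syndrome, syndrome(side_info))]
    position = diff[0] + 2 * diff[1] + 4 * diff[2]  # 0 means no error
    corrected = list(side_info)
    if position:
        corrected[position - 1] ^= 1
    return corrected


if __name__ == "__main__":
    source = [1, 0, 1, 1, 0, 0, 1]          # W-Z bits, known only to the encoder
    sent = syndrome(source)                  # 3 bits transmitted instead of 7
    prediction = list(source)
    prediction[4] ^= 1                       # decoder's prediction has one error
    print(decode_with_side_information(sent, prediction) == source)
```

Here 7 source bits are conveyed with 3 syndrome bits, provided the prediction is wrong in at most one position. An LDPC code with sum-product decoding follows the same principle on much longer blocks and tolerates many more prediction errors.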

The wavelet transform and quantization on the decoder side can be omitted by using the scalability function of JPEG2000. In that case, an intermediate signal is created from the extracted low-frequency components, Gray-coded, and sent to the LDPC decoder. The intermediate image can also be created using the multi-component transform of JPEG2000 Part 2.

[Third Embodiment]
FIG. 6 is a basic configuration diagram of the DSC system according to the third embodiment of the present invention.

The figure shows a system in which the encoder side encodes a plurality of data sequences (Key frames and Wyner-Ziv frames), and the decoder side receives the data sequences and creates the auxiliary information for the Wyner-Ziv frames from the Key frames.

The system shown in the figure comprises compression devices 201 and 202, decompression devices 203 and 204, auxiliary pixel creation device 205, reconstruction device 206, and high-frequency component estimation device 207.

Compression device 201 has an encoding unit 30 for the Wyner-Ziv frames, and compression device 202 has an encoding unit 31 for the Key frames. Decompression device 203 has a decoding unit 32 for the Wyner-Ziv frames, and decompression device 204 has a decoding unit 33 for the Key frames and an extraction unit 34. Auxiliary pixel creation device 205 has a partial decoding unit 35 and a motion compensation unit 36. High-frequency component estimation device 207 has a high-frequency component estimation unit 37. Reconstruction device 206 has an auxiliary information determination unit 38 and a synthesis unit 39.

The configuration shown in FIG. 6 differs from the second embodiment in that it includes high-frequency component estimation device 207. In the second embodiment, the high-frequency components were sent from the Key frames to motion compensation unit 36 separately from the low-frequency components, but motion compensation (filtering and the like) leaves the high-frequency components deficient. This embodiment compensates by introducing high-frequency component estimation device 207. For this purpose, it uses a high-frequency component estimation device that is well matched to encoding unit 31 and performs temporal-resolution expansion. High-frequency component estimation device 207 upsamples the input low-temporal-resolution video along the time axis (increasing the number of samples, that is, the frame rate), removes the unwanted high-frequency components (imaging components) with a filter, and then performs high-frequency prediction along the time axis on the input video using a nonlinear prediction method.
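
The first two steps of the temporal-resolution expansion, upsampling along the time axis and filtering out the imaging components, can be sketched as follows. The sketch doubles the frame rate by zero insertion and applies the three-tap filter (0.5, 1, 0.5), which is equivalent to linear interpolation between frames. The factor of two, the filter taps, and the flat-list frame layout are assumptions, and the nonlinear high-frequency prediction step is not covered because the text gives no details of it.

```python
from typing import List, Sequence


def upsample_frames(frames: Sequence[Sequence[float]]) -> List[List[float]]:
    """Double the frame rate: zero insertion in time, then a (0.5, 1, 0.5) filter."""
    if not frames:
        return []
    width = len(frames[0])
    # Step 1: insert an all-zero frame between consecutive input frames.
    stuffed: List[List[float]] = []
    for frame in frames:
        stuffed.append(list(frame))
        stuffed.append([0.0] * width)
    stuffed.pop()  # no zero frame after the last input frame
    # Step 2: low-pass filter along the time axis to remove imaging components.
    taps = ((-1, 0.5), (0, 1.0), (1, 0.5))
    output: List[List[float]] = []
    for t in range(len(stuffed)):
        acc = [0.0] * width
        for offset, tap in taps:
            s = t + offset
            if 0 <= s < len(stuffed):
                for i in range(width):
                    acc[i] += tap * stuffed[s][i]
        output.append(acc)
    return output


if __name__ == "__main__":
    # Three one-pixel frames; the inserted frames become the midpoints.
    print(upsample_frames([[0.0], [2.0], [4.0]]))
```

Linear interpolation of this kind cannot create temporal detail that the input lacks, which is presumably why the device follows it with a nonlinear prediction step.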

The operations of the first to third embodiments described above can also be implemented as a program, which can be installed and executed on a computer that decodes a plurality of encoded data strings input from the encoder side.

The program thus constructed can also be stored on a hard disk device or on a portable storage medium such as a flexible disk or CD-ROM, and then installed and executed on a computer, or distributed.

The present invention is not limited to the embodiments described above, and various modifications and applications are possible within the scope of the claims.

The present invention is applicable to moving image generation techniques, and in particular to techniques for decoding encoded moving image signals.

FIG. 1 is a diagram showing the principle configuration of the present invention.
FIG. 2 is an example of a system configuration that can be realized by the present invention.
FIG. 3 is a basic configuration diagram of the DSC system in the first embodiment of the present invention.
FIG. 4 is a basic configuration diagram of the DSC system in the second embodiment of the present invention.
FIG. 5 is a configuration example of the DSC system in the second embodiment of the present invention.
FIG. 6 is a basic configuration diagram of the DSC system in the third embodiment of the present invention.
FIG. 7 is a conceptual diagram of Distributed Source Coding (DSC).
FIG. 8 is an example of a system configuration using turbo codes for encoding and decoding.

Explanation of symbols

1 Encoding means, compression device
2 Second decoding means, decompression device
3 Encoding means, compression device
4 First decoding means, decompression device
5 Auxiliary information creating means, auxiliary information creation device
6 Reconstruction device
7 Decompression device
8 Decompression device
9 Auxiliary pixel creation device
10 Encoding unit
11 Encoding unit
12 Decoding unit
13 Decoding unit
14 Extraction unit
15 Partial decoding unit
16 Motion compensation unit
20 Encoding unit
21 Encoding unit
22 Decoding unit
23 Decoding unit
24 Extraction unit
25 Partial decoding unit
26 Motion compensation unit
27 Auxiliary information determination unit
28 Synthesis unit
30 Encoding unit
31 Encoding unit
32 Decoding unit
33 Decoding unit
34 Extraction unit
35 Partial decoding unit
36 Motion compensation unit
37 High-frequency component estimation unit
38 Auxiliary information determination unit
39 Synthesis unit
101 Compression device
102 Compression device
103 Decompression device
104 Decompression device
105 Auxiliary pixel creation device
106 Reconstruction device
201 Compression device
202 Compression device
203 Decompression device
204 Decompression device
205 Auxiliary pixel creation device
206 Reconstruction device
207 High-frequency component estimation device

Claims

An image decoding apparatus for decoding a plurality of separately encoded image data strings, comprising:
first decoding means for splitting a data string A among the plurality of data strings into two paths, decoding and outputting the first path, and decoding, from the data string on the second path, the low-frequency component as first auxiliary information and the high-frequency component as second auxiliary information;
auxiliary information creating means for performing motion compensation on the first and second auxiliary information and outputting the results as third and fourth auxiliary information, respectively;
second decoding means for decoding, using the third auxiliary information, a data string B different from the data string A among the plurality of data strings; and
reconstruction means for determining, using a probability distribution or a correlation value of the signal, whether the fourth auxiliary information is plausible and, when it is determined to be plausible, adding the fourth auxiliary information to the image decoded by the second decoding means and outputting the result.

An image decoding apparatus for decoding a plurality of separately encoded image data strings, comprising:
first decoding means for splitting a data string A among the plurality of data strings into two paths, decoding and outputting the first path, and decoding, from the data string on the second path, the low-frequency component as first auxiliary information and the high-frequency component as second auxiliary information;
auxiliary information creating means for performing motion compensation on the first and second auxiliary information and outputting the results as third and fourth auxiliary information, respectively;
high-frequency component estimation means for compensating for the deficiency in the high-frequency component of the fourth auxiliary information output from the auxiliary information creating means and outputting the result as fifth auxiliary information;
second decoding means for decoding, using the third auxiliary information, a data string B different from the data string A among the plurality of data strings; and
reconstruction means for determining, using a probability distribution or a correlation value of the signal, whether the fifth auxiliary information is plausible and, when it is determined to be plausible, adding the fifth auxiliary information to the image decoded by the second decoding means and outputting the result.

The image decoding apparatus according to claim 1 or 2, wherein the data string A is data encoded according to the international standard JPEG2000.

An image decoding program that causes a computer to function as the image decoding apparatus according to any one of claims 1 to 3.

A computer-readable recording medium storing a program that causes a computer to function as the image decoding apparatus according to any one of claims 1 to 3.