
JP2016021097A - Image processing device, image processing method, and program

Info

Publication number: JP2016021097A
Application number: JP2014143691A
Authority: JP
Inventors: Tatsuya Kobayashi (小林 達也); Haruhisa Kato (加藤 晴久)
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2014-07-11
Filing date: 2014-07-11
Publication date: 2016-02-04
Anticipated expiration: 2034-07-11
Also published as: JP6305856B2; MY178928A

Abstract

PROBLEM TO BE SOLVED: To improve the robustness of object recognition and reduce the processing load in AR technology. SOLUTION: An image processing device 1 that superimposes virtual information on a preview image comprises an image acquisition unit 10, an image recognition unit 20, an object relation estimation unit 30, and a virtual information display unit 70. The image acquisition unit 10 acquires a preview image. The image recognition unit 20 recognizes objects in the preview image. The object relation estimation unit 30 estimates the relative posture between the objects recognized by the image recognition unit 20, classifies the objects into groups, and recognizes the objects in a group other than its main object on the basis of the result of recognizing that main object by the image recognition unit 20. The virtual information display unit 70 superimposes virtual information on the preview image on the basis of the recognition results of the image recognition unit 20 and the recognition results of the object relation estimation unit 30. SELECTED DRAWING: Figure 1

Description

The present invention relates to an image processing device, an image processing method, and a program.

In recent years, AR (augmented reality) technology, in which a computer processes images (video) of real space and superimposes virtual information on them, has attracted attention. AR technology makes it possible to support a user's actions and to present information to the user intuitively. For example, applying AR technology to signs and advertisements around the user makes it possible to present detailed information, video, 3D content, and the like that cannot be conveyed in a limited space, and to change the presented information as appropriate according to the place, the time, and the attributes of the viewer.

Mobile terminals are expected to be a major platform for AR technology. Examples of such mobile terminals include smartphones and HMDs (head mounted displays) that are equipped with an imaging device (camera) and a display and that have sufficient processing performance for image processing.

In AR technology, the relative posture (position and orientation) between the imaging device and the real space must be estimated in real time in order to superimpose virtual information at the correct position.

As techniques for this posture estimation, methods that use a reference marker as the recognition target have been proposed (see, for example, Non-Patent Documents 1 and 2). Non-Patent Document 1 uses an AR marker as the reference marker, and Non-Patent Document 2 uses an arbitrary image. With the methods of Non-Patent Documents 1 and 2, however, the reference marker must be registered in advance in the device that performs the posture estimation.

To address this, a posture estimation method has been proposed that models the real space in a stage preceding the superimposition of virtual information and treats the entire reconstructed (modeled) space as the reference marker (see, for example, Non-Patent Document 3). Because this method creates the reference marker as needed, the reference marker no longer has to be registered in advance in the device that performs the posture estimation.

The method using an AR marker, the method using an arbitrary image, and the method of creating a reference marker as needed each involve trade-offs between convenience and processing load. An appropriate method must therefore be selected according to the situation.

In addition, ways of speeding up recognition algorithms (making them more efficient) are being studied so that even terminals with low processing performance can support the methods described above. For example, Patent Document 1 proposes a method that combines initial posture estimation processing with posture tracking processing, in which the posture tracking processing tracks feature points across continuously input preview images. With this method, even a terminal with low processing performance can estimate the posture in real time.

In AR technology, there are two methods of placing virtual information. The first method registers the position of the virtual information relative to the reference marker, thereby placing the virtual information at a fixed position in the AR space. The second method registers the position of the virtual information relative to an object other than the reference marker, thereby placing the virtual information in the AR space.

In the second method, the display position of the virtual information changes with the movement of the camera and the movement of the object serving as the reference. This second method is used, for example, when virtual information such as a 3D model is displayed on a trading card.

When virtual information such as a 3D model is displayed on a trading card, the posture between the object (the trading card) and the real space is not fixed, so a fixedly placed reference marker is required in order to present an AR space (for example, the playing field of a card game). Recognizing the object also requires the same processing as recognizing the reference marker. Furthermore, when virtual information is displayed on a plurality of objects, each object must be recognized (its posture estimated), which demands high processing capability of the terminal.

Patent Document 1: JP 2013-508844 A

Non-Patent Document 1: H. Kato and M. Billinghurst, “Marker tracking and HMD calibration for a video-based augmented reality conferencing system,” in Proc. of IEEE and ACM International Workshop on Augmented Reality, 1999.
Non-Patent Document 2: D. Wagner, G. Reitmayr, A. Mulloni, T. Drummond, and D. Schmalstieg, “Real-time detection and tracking for augmented reality on mobile phones,” IEEE Trans. on Visualization and Computer Graphics, 2010.
Non-Patent Document 3: G. Klein and D. Murray, “Parallel tracking and mapping for small AR workspaces,” in Proc. of International Symposium on Mixed and Augmented Reality, 2007.
Non-Patent Document 4: S. Benhimane and E. Malis, “Homography-based 2D visual tracking and servoing,” International Journal of Robotics Research, 2007.

With the methods of Patent Document 1 and Non-Patent Documents 1 and 2, when virtual information is registered to objects, each terminal must recognize the objects independently. Consequently, as the number of objects increases, the processing load on each terminal grows and real-time processing on each terminal becomes difficult to achieve. As a result, the amount of virtual information that each terminal can display is limited, and usability may deteriorate.

Moreover, the posture of an object is estimated independently for each preview image. For this reason, occlusion, light reflection (blown-out highlights), and the like can cause object recognition to fail temporarily, interrupting the display of the virtual information.

Furthermore, the processing that estimates the initial posture can be less robust to the shooting angle and shooting distance than the posture tracking processing. Depending on the shooting position, therefore, once the display of the virtual information has been interrupted, it may not be possible to resume it.

The present invention has been made in view of the above problems, and an object thereof is to reduce the processing load and to improve the robustness of object recognition in AR technology.

To solve the above problems, the present invention proposes the following.
(1) The present invention proposes an image processing device (for example, corresponding to the image processing device 1 in FIG. 1) that superimposes virtual information on a preview image, the device comprising: image acquisition means (for example, corresponding to the image acquisition unit 10 in FIG. 1) for acquiring the preview image; image recognition means (for example, corresponding to the image recognition unit 20 in FIG. 1) for recognizing objects in the preview image acquired by the image acquisition means; object relation estimation means (for example, corresponding to the object relation estimation unit 30 in FIG. 1) for estimating the relationship between the objects recognized by the image recognition means (for example, corresponding to the relative posture between objects described later), classifying the objects based on the estimation result, and recognizing the objects in a group other than its main object, which is one of the objects classified into that group, based on the result of recognizing the main object by the image recognition means; and virtual information display means (for example, corresponding to the virtual information display unit 70 in FIG. 1) for superimposing virtual information on the preview image acquired by the image acquisition means, based on the recognition result of the image recognition means and the recognition result of the object relation estimation means.

According to this invention, the image processing device that superimposes virtual information on a preview image is provided with image acquisition means, image recognition means, object relation estimation means, and virtual information display means. The image acquisition means acquires the preview image. The image recognition means recognizes objects in the preview image acquired by the image acquisition means. The object relation estimation means estimates the relationship between the objects recognized by the image recognition means, classifies the objects based on the estimation result, and recognizes the objects in a group other than its main object, which is one of the objects classified into that group, based on the result of recognizing the main object by the image recognition means. The virtual information display means superimposes virtual information on the preview image acquired by the image acquisition means, based on the recognition result of the image recognition means and the recognition result of the object relation estimation means. The object relation estimation means can therefore recognize an object based on the recognition result of another object that is closely related to it. Consequently, having the object relation estimation means recognize objects makes it possible to reduce the number of objects that the image recognition means has to recognize, and to recognize objects that the image recognition means could not recognize. In AR technology, this reduces the processing load and improves the robustness of object recognition.
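
As an illustrative sketch that is not taken from the patent text, recognizing a non-main object from the main object's recognition result can be viewed as composing two rigid transforms: the posture of the main object relative to the camera, and a previously estimated relative posture between the main object and the other group member. The 4x4 homogeneous matrices and the names `cam_T_main`, `main_T_member`, and `infer_member_pose` are assumptions made for this example.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(x, y, z):
    """Homogeneous transform with identity rotation and translation (x, y, z)."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


def infer_member_pose(cam_T_main, main_T_member):
    """Posture of a group member in the camera frame, derived without
    running image recognition on that member (hypothetical helper)."""
    return matmul(cam_T_main, main_T_member)


# Main object recognized 5 units in front of the camera.
cam_T_main = translation(0.0, 0.0, 5.0)
# Stored relative posture: the member sits 2 units along x from the main object.
main_T_member = translation(2.0, 0.0, 0.0)

cam_T_member = infer_member_pose(cam_T_main, main_T_member)
# Translation part of the member's posture in the camera frame.
print([row[3] for row in cam_T_member[:3]])
```

Only the main object needs a full recognition pass in this sketch; the member's posture follows from one matrix product.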

(2) For the image processing device of (1), the present invention proposes an image processing device in which the object relation estimation means uses, as the main object, an object that the image recognition means was able to recognize, and uses, as the objects other than the main object in the same group, objects that are classified into the same group as the main object and that the image recognition means was unable to recognize.

According to this invention, in the image processing device of (1), the object relation estimation means uses, as the main object, an object that the image recognition means was able to recognize, and uses, as the objects other than the main object in the same group, objects that are classified into the same group as the main object and that the image recognition means was unable to recognize. Objects that the image recognition means could not recognize can therefore be recognized based on the recognition result of the main object.

(3) For the image processing device of (1) or (2), the present invention proposes an image processing device in which the object relation estimation means selects one object from each group as the main object, and which comprises recognition processing control means (for example, corresponding to the recognition processing control unit 40 in FIG. 11) for suspending recognition by the image recognition means of the objects other than the main object that are classified into the same group as the main object.

According to this invention, in the image processing device of (1) or (2), the object relation estimation means selects one object from each group as the main object. The image processing device of (1) or (2) is also provided with recognition processing control means that suspends recognition by the image recognition means of the objects other than the main object that are classified into the same group as the main object. Even while their recognition by the image recognition means is suspended, these objects can still be recognized by the object relation estimation means, so the number of objects that the image recognition means has to recognize can be reduced.
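
The selection of one main object per group can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `plan_recognition` is hypothetical, and taking the first member of each group as the main object is a choice made only for this example, since the text above does not specify a selection rule.

```python
def plan_recognition(groups):
    """Split objects into those recognized directly and those whose
    recognition by the image recognizer is suspended.

    groups: list of non-empty lists of object ids. As an assumption of this
    sketch, the first id in each list is taken as the group's main object.
    """
    active, suspended = [], []
    for members in groups:
        main, *others = members
        active.append(main)        # one main object per group stays active
        suspended.extend(others)   # the rest are derived from the main object
    return active, suspended


active, suspended = plan_recognition(
    [["card_a", "card_b", "card_c"], ["marker"]]
)
print(active)
print(suspended)
```

With three cards in one group and a lone marker in another, the image recognizer handles two objects per frame instead of four.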

(4) For the image processing device of (3), the present invention proposes an image processing device in which, when recognition of the main object by the image recognition means fails, the recognition processing control means resumes recognition by the image recognition means of the objects other than the main object that are classified into the same group as the main object.

According to this invention, in the image processing device of (3), when recognition of the main object by the image recognition means fails, the recognition processing control means resumes recognition by the image recognition means of the objects other than the main object that are classified into the same group as the main object. When recognition by the object relation estimation means becomes impossible, recognition by the image recognition means can thus be resumed, which further improves the robustness of object recognition.
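
The suspend and resume behavior can be sketched as a per-frame decision. The data layout (`groups`, `mains`, `main_recognized`) and the function name `objects_to_suspend` are assumptions introduced for this example only.

```python
def objects_to_suspend(groups, mains, main_recognized):
    """Return the set of objects whose direct recognition stays suspended.

    groups:          dict group id -> list of object ids
    mains:           dict group id -> id of that group's main object
    main_recognized: dict object id -> True if the image recognizer
                     succeeded on that main object in the current frame
    """
    suspended = set()
    for group_id, members in groups.items():
        main = mains[group_id]
        if main_recognized.get(main, False):
            # Main object found: the other members can be derived from it.
            suspended.update(m for m in members if m != main)
        # Otherwise nothing in this group is suspended, so direct
        # recognition of every member is resumed.
    return suspended


groups = {"g1": ["a", "b"], "g2": ["c", "d"]}
mains = {"g1": "a", "g2": "c"}
suspended = objects_to_suspend(groups, mains, {"a": True, "c": False})
print(sorted(suspended))
```

Here the main object of `g2` was lost, so `d` returns to direct recognition while `b` remains suspended.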

(5) For the image processing device of (3) or (4), the present invention proposes an image processing device in which the recognition processing control means checks the recognition result of the object relation estimation means for an object whose recognition by the image recognition means is suspended against the preview image acquired by the image acquisition means and, if the check fails, resumes recognition of that object by the image recognition means.

According to this invention, in the image processing device of (3) or (4), the recognition processing control means checks the recognition result of the object relation estimation means for an object whose recognition by the image recognition means is suspended against the preview image acquired by the image acquisition means and, if the check fails, resumes recognition of that object by the image recognition means. It is therefore possible to determine whether the recognition result of the object relation estimation means is correct.

(6) For the image processing device of any of (3) to (5), the present invention proposes an image processing device in which, when resuming recognition by the image recognition means, the recognition processing control means causes the image recognition means to track the posture using, as the initial value, the recognition result of the object relation estimation means for the preview image previously acquired by the image acquisition means.

If a failure of object recognition caused by occlusion, light reflection, or the like is only temporary, the posture tracking processing by the posture tracking unit 23 is expected to succeed once the cause of the failure has been resolved. In general, posture tracking by the image recognition means starting from an accurate initial posture value is superior, in terms of processing load, accuracy of posture estimation, and robustness of recognition, to estimation of the initial posture value by the initial posture estimation unit 22. According to this invention, therefore, in the image processing device of any of (3) to (5), when resuming recognition by the image recognition means, the recognition processing control means causes the image recognition means to track the posture using, as the initial value, the recognition result of the object relation estimation means for the preview image previously acquired by the image acquisition means. This reduces the processing load and improves both the accuracy of posture estimation and the robustness of recognition.

(7) For the image processing device of any of (3) to (6), the present invention proposes an image processing device in which the recognition processing control means creates a projection image by projecting an object whose recognition by the image recognition means is suspended onto the preview image acquired by the image acquisition means, based on the recognition result of the object relation estimation means for that object, and determines that the check has failed if the similarity between the projection image and the preview image acquired by the image acquisition means is less than a threshold.

According to this invention, in the image processing device of any of (3) to (6), the recognition processing control means creates a projection image by projecting an object whose recognition by the image recognition means is suspended onto the preview image acquired by the image acquisition means, based on the recognition result of the object relation estimation means for that object, and determines that the check has failed if the similarity between the projection image and the preview image is less than a threshold. It is therefore possible to determine whether the recognition result of the object relation estimation means is correct.
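
A minimal sketch of the threshold check follows. The passage above does not name a similarity measure, so zero-mean normalized cross-correlation (ZNCC) is swapped in here as an assumption; the threshold value 0.7, the flat pixel lists, and the function names are likewise illustrative.

```python
import math


def zncc(a, b):
    """Zero-mean normalized cross-correlation of two equally long pixel lists.

    Returns a value in [-1, 1]; 0.0 is returned when either input is flat.
    """
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


def check_failed(projected, observed, threshold=0.7):
    """True when the projected object does not match the preview pixels."""
    return zncc(projected, observed) < threshold


projected = [10, 200, 10, 200]          # object rendered at the inferred posture
observed_ok = [12, 190, 15, 205]        # preview pixels at the same location
observed_occluded = [100, 100, 100, 100]

print(check_failed(projected, observed_ok))
print(check_failed(projected, observed_occluded))
```

A failed check would be the trigger for handing the object back to the image recognizer.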

(8) For the image processing device of (7), the present invention proposes an image processing device in which the recognition processing control means estimates, by iterative calculation, the posture that maximizes the similarity, and thereby corrects the recognition result of the object relation estimation means.

According to this invention, in the image processing device of (7), the recognition processing control means estimates, by iterative calculation, the posture that maximizes the similarity, and thereby corrects the recognition result of the object relation estimation means. In AR technology, this further reduces the processing load and further improves the robustness of object recognition.

(9) For the image processing device of any of (3) to (8), the present invention proposes an image processing device in which the recognition processing control means creates a projection image by projecting an object whose recognition by the image recognition means is suspended onto the preview image acquired by the image acquisition means, based on the recognition result of the object relation estimation means for that object, estimates the matching location by template matching between the projection image and the preview image acquired by the image acquisition means, and determines that the check has failed if the response value at the matching location is less than a threshold.

According to this invention, in the image processing device of any of (3) to (8), the recognition processing control means creates a projection image by projecting an object whose recognition by the image recognition means is suspended onto the preview image acquired by the image acquisition means, based on the recognition result of the object relation estimation means for that object, estimates the matching location by template matching between the projection image and the preview image, and determines that the check has failed if the response value at the matching location is less than a threshold. It is therefore possible to determine whether the recognition result of the object relation estimation means is correct.
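
The template matching check can be sketched with a brute-force search. As assumptions for this example, the response value is a zero-mean normalized cross-correlation score, images are small nested lists of gray levels, and the names `match_template` and `zncc` are hypothetical; a production system would more likely use an optimized library routine.

```python
import math


def zncc(a, b):
    """Zero-mean normalized cross-correlation of two equally long pixel lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return 0.0 if denom == 0.0 else sum(x * y for x, y in zip(da, db)) / denom


def match_template(image, template):
    """Slide the template over the image; return ((row, col), best response)."""
    th, tw = len(template), len(template[0])
    flat_t = [v for row in template for v in row]
    best_pos, best_score = (0, 0), -2.0
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            window = [image[r + dr][c + dc]
                      for dr in range(th) for dc in range(tw)]
            score = zncc(window, flat_t)
            if score > best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score


preview = [[10, 10, 10, 10],
           [10, 10, 0, 255],
           [10, 10, 255, 0],
           [10, 10, 10, 10]]
projection = [[0, 255],
              [255, 0]]

position, response = match_template(preview, projection)
check_failed = response < 0.8  # threshold chosen arbitrarily for the sketch
print(position, check_failed)
```

Compared with a fixed-location similarity test, the search tolerates a small positional error in the inferred posture and reports where the best match was found.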

(10) For the image processing device of any of (1) to (9), the present invention proposes an image processing device that comprises cooperative recognition processing means (for example, corresponding to the cooperative recognition processing unit 60 in FIG. 16) for converting the recognition result of an object recognized by a first image processing device, which is different from the image processing device, into a recognition result referenced to the image processing device itself, and in which the virtual information display means superimposes virtual information on the preview image acquired by the image acquisition means, based on the recognition result of the image recognition means, the recognition result of the object relation estimation means, and the recognition result of the cooperative recognition processing means.

According to this invention, the image processing device of any of (1) to (9) is provided with cooperative recognition processing means that converts the recognition result of an object recognized by a first image processing device, which is different from the image processing device, into a recognition result referenced to the image processing device itself. The virtual information display means superimposes virtual information on the preview image acquired by the image acquisition means, based on the recognition result of the image recognition means, the recognition result of the object relation estimation means, and the recognition result of the cooperative recognition processing means. Recognition results from other image processing devices can therefore also be used when superimposing virtual information on the preview image, which, in AR technology, further reduces the processing load and further improves the robustness of object recognition.
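
One way such a conversion could work is sketched below. It assumes, beyond what this passage states, that both devices have recognized a common object (called `shared` here), so that the other device's result can be re-expressed in the own camera frame. All names and the 4x4 homogeneous matrix representation are hypothetical.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(T):
    """Invert a rigid transform [R | t] as [R^T | -R^T t]."""
    R_t = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    new_t = [-sum(R_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [R_t[i] + [new_t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def translation(x, y, z):
    """Homogeneous transform with identity rotation and translation (x, y, z)."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


def convert_to_own_frame(own_T_shared, other_T_shared, other_T_object):
    """Re-express an object posture reported by another device in the own
    camera frame, using an object that both devices have recognized."""
    own_T_other = matmul(own_T_shared, rigid_inverse(other_T_shared))
    return matmul(own_T_other, other_T_object)


own_T_shared = translation(1.0, 0.0, 4.0)     # shared object seen by this device
other_T_shared = translation(-1.0, 0.0, 4.0)  # same object seen by the other device
other_T_object = translation(0.0, 0.0, 6.0)   # object only the other device recognized

own_T_object = convert_to_own_frame(own_T_shared, other_T_shared, other_T_object)
print([row[3] for row in own_T_object[:3]])
```

The converted posture can then be treated like any locally obtained recognition result when virtual information is superimposed.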

(11) For the image processing device of any of (1) to (10), the present invention proposes an image processing device in which the object relation estimation means obtains, as the relationship between the objects recognized by the image recognition means, a relative posture indicating the relative positional relationship between those objects.

According to this invention, in the image processing device of any of (1) to (10), the object relation estimation means obtains, as the relationship between the objects recognized by the image recognition means, a relative posture indicating the relative positional relationship between those objects. The relative posture between objects can therefore be used to find closely related objects, such as objects that are moving in the same way.
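
Given the postures of two objects relative to the camera, their relative posture can be computed as sketched below. The 4x4 homogeneous matrix representation and the names `relative_pose`, `cam_T_a`, and `cam_T_b` are assumptions made for this example.

```python
def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(T):
    """Invert a rigid transform [R | t] as [R^T | -R^T t]."""
    R_t = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    new_t = [-sum(R_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [R_t[i] + [new_t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def relative_pose(cam_T_a, cam_T_b):
    """Posture of object B expressed in the frame of object A."""
    return matmul(rigid_inverse(cam_T_a), cam_T_b)


# Both objects are rotated 90 degrees about the camera's z axis.
cam_T_a = [[0.0, -1.0, 0.0, 0.0],
           [1.0, 0.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 5.0],
           [0.0, 0.0, 0.0, 1.0]]
cam_T_b = [[0.0, -1.0, 0.0, 0.0],
           [1.0, 0.0, 0.0, 2.0],
           [0.0, 0.0, 1.0, 5.0],
           [0.0, 0.0, 0.0, 1.0]]

a_T_b = relative_pose(cam_T_a, cam_T_b)
# Translation of B as seen from A; the camera pose cancels out.
print([row[3] for row in a_T_b[:3]])
```

Because the camera posture cancels out, the relative posture stays constant under camera motion as long as the two objects do not move relative to each other, which is what makes it usable as a grouping cue.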

(12) For the image processing device of any of (1) to (11), the present invention proposes an image processing device in which the object relation estimation means obtains the relative posture between the objects in a preview image each time a preview image is acquired by the image acquisition means, and classifies into the same group those objects for which the amount of change in relative posture between preview images remains less than a threshold over a predetermined number of consecutive preview images.

According to this invention, in any one of the image processing devices of (1) to (11), each time a preview image is acquired by the image acquisition means, the object relationship estimation means obtains the relative posture between the objects in the preview image, and classifies into the same group those objects for which the amount of change in relative posture between preview images stays below a threshold continuously over a predetermined number of preview images. Objects can therefore be classified with the relationship between objects across a plurality of consecutive preview images taken into account.

(13) The present invention proposes an image processing device according to (12), wherein the object relationship estimation means obtains, as the amount of change, the difference between the relative posture obtained in the latest preview image acquired by the image acquisition means and the average of the relative postures obtained in the preview images preceding the latest preview image.

According to this invention, in the image processing device of (12), the object relationship estimation means obtains, as the amount of change, the difference between the relative posture obtained in the latest preview image acquired by the image acquisition means and the average of the relative postures obtained in the preview images preceding the latest preview image. Objects can therefore be classified more appropriately, with greater account taken of the relationship between them.
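The grouping rule of (12) and (13) can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the relative posture is flattened into a tuple of parameters (for example, translation components), uses the Euclidean distance as the "difference", and restarts the history when the threshold is exceeded. The class name `PairStabilityTracker`, its parameters, and those three choices are all hypothetical.

```python
from math import dist
from statistics import fmean


class PairStabilityTracker:
    """Tracks one object pair and reports when it qualifies for a shared group."""

    def __init__(self, threshold: float, required_frames: int) -> None:
        self.threshold = threshold              # assumed threshold on the change amount
        self.required_frames = required_frames  # the "predetermined number" of preview images
        self.history: list[tuple[float, ...]] = []  # relative postures from earlier frames
        self.stable_run = 0                     # consecutive frames below the threshold

    def update(self, relative_posture: tuple[float, ...]) -> bool:
        """Feed the relative posture measured in the latest preview image.

        Returns True once the change amount has stayed below the threshold
        for the required number of consecutive preview images.
        """
        if self.history:
            # Average of the relative postures obtained before the latest frame.
            mean = tuple(fmean(axis) for axis in zip(*self.history))
            change = dist(relative_posture, mean)
            if change < self.threshold:
                self.stable_run += 1
            else:
                # Assumption: a large change restarts both the run and the history.
                self.stable_run = 0
                self.history.clear()
        self.history.append(relative_posture)
        return self.stable_run >= self.required_frames
```

One tracker would be kept per object pair; pairs for which `update` returns True would then be placed in the same group.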

(14) The present invention proposes an image processing device according to any one of (1) to (13), wherein the image recognition means adds, to the recognition result for each object, information serving as an index of the recognition accuracy of that recognition result, and the object relationship estimation means determines that the relative posture between objects whose recognition accuracy index added by the image recognition means is equal to or greater than a threshold is stable.

According to this invention, in any one of the image processing devices of (1) to (13), the image recognition means adds, to the recognition result for each object, information serving as an index of the recognition accuracy of the recognition result, and the object relationship estimation means determines that the relative posture between objects whose recognition accuracy index is equal to or greater than a threshold is stable. Objects can therefore be classified with the recognition accuracy of their recognition results taken into account.

(15) The present invention proposes an image processing device according to any one of (1) to (14), wherein the image recognition means adds, to the recognition result for each object, information serving as an index of the recognition accuracy of that recognition result, and the object relationship estimation means adopts the object with the highest recognition accuracy index added by the image recognition means as the main object.

According to this invention, in any one of the image processing devices of (1) to (14), the image recognition means adds, to the recognition result for each object, information serving as an index of the recognition accuracy of the recognition result, and the object relationship estimation means adopts the object with the highest recognition accuracy index as the main object. Because the object relationship estimation means can then recognize objects using their relationship with the most accurately recognized object, the robustness of object recognition can be further improved.
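The use of the accuracy index in (14) and (15) can be sketched as below. The `Recognition` record, its field names, and the reading that a pair is stable only when both objects reach the threshold are assumptions made for illustration; the text does not specify a data layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Recognition:
    object_id: str
    accuracy: float  # hypothetical accuracy index; larger means more reliable


def is_pair_stable(a: Recognition, b: Recognition, threshold: float) -> bool:
    """Treat a pair's relative posture as stable when both indices reach the threshold."""
    return a.accuracy >= threshold and b.accuracy >= threshold


def pick_main_object(group: list[Recognition]) -> Optional[Recognition]:
    """Return the group member with the highest accuracy index, or None if the group is empty."""
    return max(group, key=lambda r: r.accuracy, default=None)
```

For a group holding M1, M2, and M3 with indices 0.4, 0.9, and 0.7, `pick_main_object` returns the record for M2.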

(16) The present invention proposes an image processing device according to (14) or (15), wherein the image recognition means uses, as the index of recognition accuracy, at least one of the shooting distance to the object and the shooting angle with respect to the object.

According to this invention, in the image processing device of (14) or (15), the image recognition means uses, as the index of recognition accuracy, at least one of the shooting distance to the object and the shooting angle with respect to the object. The index of recognition accuracy can therefore be set from the shooting distance to the object or the shooting angle with respect to the object.

(17) The present invention proposes an image processing device according to any one of (14) to (16), wherein the image recognition means uses, as the index of recognition accuracy, at least one of the number of local feature matches and the local feature matching score.

According to this invention, in any one of the image processing devices of (14) to (16), the image recognition means uses, as the index of recognition accuracy, at least one of the number of local feature matches and the local feature matching score. The index of recognition accuracy can therefore be set from the number of local feature matches or the local feature matching score.

(18) The present invention proposes an image processing device according to any one of (14) to (17), wherein the image recognition means uses, as the index of recognition accuracy, at least one of an SSD (Sum of Squared Differences) response value and an NCC (Normalized Cross-Correlation) response value.

According to this invention, in any one of the image processing devices of (14) to (17), the image recognition means uses, as the index of recognition accuracy, at least one of the SSD response value and the NCC response value. The index of recognition accuracy can therefore be set from the SSD response value or the NCC response value.
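For reference, the two response values named in (18) can be computed as in the sketch below. It assumes that image patches are given as flat sequences of pixel intensities of equal length, and it uses the plain (not zero-mean) form of NCC; the text does not say which variant is intended.

```python
from math import sqrt
from typing import Sequence


def ssd(patch: Sequence[float], template: Sequence[float]) -> float:
    """Sum of squared differences: 0 for identical patches, larger for less similar ones."""
    if len(patch) != len(template):
        raise ValueError("patches must have the same number of pixels")
    return sum((p - t) ** 2 for p, t in zip(patch, template))


def ncc(patch: Sequence[float], template: Sequence[float]) -> float:
    """Normalized cross-correlation: 1 when the patches agree up to a positive scale factor."""
    if len(patch) != len(template):
        raise ValueError("patches must have the same number of pixels")
    numerator = sum(p * t for p, t in zip(patch, template))
    denominator = sqrt(sum(p * p for p in patch) * sum(t * t for t in template))
    # An all-zero patch has no defined correlation; report 0 in that case.
    return numerator / denominator if denominator > 0 else 0.0
```

Note that SSD measures dissimilarity, so an accuracy index derived from it would presumably treat smaller values as better, whereas a larger NCC value indicates a better match.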

(19) The present invention proposes an image processing device according to any one of (1) to (18), wherein the image recognition means adds, to the recognition result for each object, information serving as an index of the processing load required to recognize that object, and the object relationship estimation means adopts as the main object an object whose processing load index added by the image recognition means is below a threshold.

According to this invention, in any one of the image processing devices of (1) to (18), the image recognition means adds, to the recognition result for each object, information serving as an index of the processing load required to recognize the object, and the object relationship estimation means adopts as the main object an object whose processing load index is below a threshold. Because the object relationship estimation means can then recognize objects using their relationship with an object that is cheap to recognize, the processing load can be further reduced.

(20) The present invention proposes an image processing device according to (19), wherein the image recognition means uses the time required for recognition as the index of the processing load.

According to this invention, in the image processing device of (19), the image recognition means uses the time required for recognition as the index of the processing load. The index of the processing load can therefore be set from the time required for recognition.
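A minimal sketch of (19) and (20) follows: the recognition time is measured with a monotonic clock and objects whose load index is below a threshold are kept as main-object candidates. The function names and the dictionary layout are hypothetical, and the recognizer itself is passed in as an opaque callable.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def timed_recognition(recognize: Callable[[], T]) -> tuple[T, float]:
    """Run a recognition callable and return its result with the elapsed time in seconds."""
    start = time.perf_counter()
    result = recognize()
    return result, time.perf_counter() - start


def main_object_candidates(load_by_object: dict[str, float], threshold: float) -> list[str]:
    """Keep the objects whose processing-load index is below the threshold."""
    return [obj for obj, load in load_by_object.items() if load < threshold]
```

With measured times of 4 ms, 20 ms, and 8 ms for M1, M2, and M3 and a 10 ms threshold, M1 and M3 remain as candidates.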

(21) The present invention proposes an image processing device according to (19), wherein the image recognition means sets, as the index of the processing load, a value that depends on the type of object.

According to this invention, in the image processing device of (19), the image recognition means uses, as the index of the processing load, a value that depends on the type of object. The index of the processing load can therefore be set from a value that depends on the type of object.

(22) The present invention proposes an image processing method for an image processing device (for example, corresponding to the image processing device 1 in FIG. 1) that comprises image acquisition means (for example, corresponding to the image acquisition unit 10 in FIG. 1), image recognition means (for example, corresponding to the image recognition unit 20 in FIG. 1), object relationship estimation means (for example, corresponding to the object relationship estimation unit 30 in FIG. 1), and virtual information display means (for example, corresponding to the virtual information display unit 70 in FIG. 1), and that superimposes virtual information on a preview image, the method comprising: a first step in which the image acquisition means acquires the preview image; a second step in which the image recognition means recognizes objects in the preview image acquired in the first step; a third step in which the object relationship estimation means estimates the relationship between the objects recognized in the second step (for example, corresponding to the relative posture between objects described later), classifies the objects based on the estimation result, and, based on the recognition result obtained in the second step for a main object that is one of the objects classified into the same group, recognizes the objects in that group other than the main object; and a fourth step in which the virtual information display means superimposes virtual information on the preview image acquired in the first step, based on the recognition result of the second step and the recognition result of the third step.

According to this invention, effects similar to those described above can be obtained.

(23) The present invention proposes a program for causing a computer to execute an image processing method for an image processing device (for example, corresponding to the image processing device 1 in FIG. 1) that comprises image acquisition means (for example, corresponding to the image acquisition unit 10 in FIG. 1), image recognition means (for example, corresponding to the image recognition unit 20 in FIG. 1), object relationship estimation means (for example, corresponding to the object relationship estimation unit 30 in FIG. 1), and virtual information display means (for example, corresponding to the virtual information display unit 70 in FIG. 1), and that superimposes virtual information on a preview image, the program causing the computer to execute: a first step in which the image acquisition means acquires the preview image; a second step in which the image recognition means recognizes objects in the preview image acquired in the first step; a third step in which the object relationship estimation means estimates the relationship between the objects recognized in the second step (for example, corresponding to the relative posture between objects described later), classifies the objects based on the estimation result, and, based on the recognition result obtained in the second step for a main object that is one of the objects classified into the same group, recognizes the objects in that group other than the main object; and a fourth step in which the virtual information display means superimposes virtual information on the preview image acquired in the first step, based on the recognition result of the second step and the recognition result of the third step.

According to this invention, effects similar to those described above can be obtained by executing the program on a computer.

According to the present invention, AR technology can reduce the processing load and improve the robustness of object recognition.

The drawings are, in order: a block diagram of the image processing device according to the first embodiment of the present invention; four schematic diagrams showing usage examples of the image processing device according to the first embodiment; five flowcharts of the image processing device according to the first embodiment; a block diagram of the image processing device according to the second embodiment; four flowcharts of the image processing device according to the second embodiment; and a block diagram of the image processing device according to the third embodiment.

Embodiments of the present invention are described below with reference to the drawings. The constituent elements in the following embodiments can be replaced with existing constituent elements as appropriate, and many variations are possible, including combinations with other existing constituent elements. The description of the following embodiments therefore does not limit the content of the invention set out in the claims.

<First Embodiment>
[Outline of Image Processing Device 1]
FIG. 1 is a block diagram of the image processing device 1 according to the first embodiment of the present invention. The image processing device 1 supports AR technology. An outline of the image processing device 1 is given below with reference to FIGS. 2, 3, 4, and 5.

FIG. 2 is a schematic diagram showing an example in which the terminal 100, on which the image processing device 1 is mounted, captures images from a first viewpoint. In FIG. 2, three objects M1, M2, and M3 are arranged in a straight line on a table AA. Using its built-in camera, the terminal 100 photographs the top of the table AA from the object M1 side.

FIG. 3 shows the display screen 110a of the terminal 100 in FIG. 2. On the display screen 110a, the objects M1 to M3 are displayed in the order M1, M2, M3 from bottom to top (as oriented in FIG. 3). Virtual information C1 linked to the object M1 is superimposed to the right of the object M1 (as oriented in FIG. 3). Likewise, virtual information C2 linked to the object M2 is superimposed to the right of the object M2, and virtual information C3 linked to the object M3 is superimposed to the right of the object M3. The owner of the terminal 100 can therefore perceive the virtual information C1 to C3 present in the AR space through the display screen 110a.

FIG. 4 is a schematic diagram showing an example in which the terminal 100, on which the image processing device 1 is mounted, captures images from a second viewpoint. Using its built-in camera, the terminal 100 photographs the top of the table AA from the object M3 side.

FIG. 5 shows the display screen 110b of the terminal 100 in FIG. 4. On the display screen 110b, the objects M1 to M3 are displayed in the order M1, M2, M3 from top to bottom (as oriented in FIG. 5). Virtual information C1 linked to the object M1 is superimposed to the left of the object M1 (as oriented in FIG. 5). Likewise, virtual information C2 linked to the object M2 is superimposed to the left of the object M2, and virtual information C3 linked to the object M3 is superimposed to the left of the object M3. The owner of the terminal 100 can therefore perceive the virtual information C1 to C3 present in the AR space through the display screen 110b as well.

On the display screen 110b, each item of virtual information C1 to C3 is displayed rotated by 180 degrees relative to how it appears on the display screen 110a. This is because the terminal 100, when showing the display screen 110b, photographs each of the objects M1 to M3 from the direction directly opposite (180 degrees) to that used when showing the display screen 110a. Consequently, when the owner of the terminal 100 rotates the viewpoint on the virtual information C1 to C3 by 180 degrees from the viewpoint in FIG. 2, the owner perceives through the display screen 110b that the virtual information C1 to C3 has also rotated by 180 degrees in the AR space, following the rotation of the viewpoint.

FIGS. 2 to 5 show the case where the terminal is moved, but when an object is moved, the virtual information likewise moves on the display screen of the terminal 100, following the movement of the object. When the terminal or an object is moved, the virtual information moves while keeping its positional relationship relative to the object. None of the virtual information C1 to C3 exists in real space; each item is stored in the terminal 100 in association with the corresponding object M1 to M3.

The accuracy with which an object is recognized in the image recognition process decreases as the distance between the object and the camera increases. In addition, occlusion, reflected light, and similar effects can cause an object to disappear temporarily from the preview image, so that recognition of that object temporarily fails. For these reasons, a situation can arise in which, for example, the object M1 can be recognized from the first viewpoint in FIG. 2 but not from the second viewpoint in FIG. 4.

First, consider the case where the AR space described above with reference to FIGS. 2 to 5 is realized with the technique of Patent Document 1 mentioned above. In this case, when the situation described above occurs, the object cannot be recognized even if posture tracking is performed for the object M1, so recognition of the object M1 must be restarted from the initial posture estimation process. However, the initial posture estimation process has low robustness to the distance and angle between the object and the camera. Recognition of the object M1 may therefore fail to resume even when it is restarted from the initial posture estimation process.

Next, consider the case where the AR space described above with reference to FIGS. 2 to 5 is realized with the image processing device 1 according to this embodiment. In this case, so that it can recognize the object M1, the image processing device 1 estimates in advance the relative posture indicating the relative positional relationship between the object M1 and another object. If recognition of the object M1 fails, the device uses the previously estimated relative posture to convert the recognition result of the other object into a recognition result for the object M1. Thus, even when the terminal 100 cannot recognize the object M1 directly, it can recognize the object M1 by converting the recognition result of another object. This improves the robustness of object recognition and allows the virtual information C1 to be displayed on the display screen 110 of the terminal 100.
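The conversion described above can be sketched with 4x4 rigid-transform matrices. The sketch assumes that each posture matrix maps object coordinates to camera coordinates and that the relative posture is stored as the inverse of one posture multiplied by the other; these conventions and all function names are assumptions for illustration, not definitions taken from the text.

```python
Matrix = list[list[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def rigid_inverse(m: Matrix) -> Matrix:
    """Invert a rigid transform [R t; 0 1] as [R^T, -R^T t; 0 1]."""
    rt = [[m[j][i] for j in range(3)] for i in range(3)]  # transpose of the rotation block
    t = [m[i][3] for i in range(3)]
    new_t = [-sum(rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [rt[i] + [new_t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def relative_posture(w_a: Matrix, w_b: Matrix) -> Matrix:
    """Posture of object B expressed in the coordinate frame of object A."""
    return matmul(rigid_inverse(w_a), w_b)


def recover_posture(w_a: Matrix, rel_ab: Matrix) -> Matrix:
    """Recover B's camera-relative posture from A's posture and the stored relative posture."""
    return matmul(w_a, rel_ab)
```

While both objects are recognized, `relative_posture(w_m2, w_m1)` would be stored; when M1 is later lost, `recover_posture` applied to the current posture of M2 yields an estimate for M1, provided the two objects have not moved relative to each other.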

[Configuration of Image Processing Device 1]
The image processing device 1 outlined above is described in detail below. Returning to FIG. 1, the image processing device 1 can be mounted on a stationary computer such as a desktop PC, or on a portable information terminal such as a laptop PC, mobile phone, portable game console, or HMD. The image processing device 1 comprises an image acquisition unit 10, an image recognition unit 20, an object relationship estimation unit 30, and a virtual information display unit 70.

[Configuration and Operation of Image Acquisition Unit 10]
The image acquisition unit 10 continuously acquires images captured by an imaging device such as a web camera or camera module. In this embodiment, the image acquisition unit 10 acquires images at a frame rate of 60 fps. The imaging device that continuously captures the images may be provided either inside or outside the image processing device 1.

[Configuration and Operation of Image Recognition Unit 20]
The image recognition unit 20 takes as input the images acquired by the image acquisition unit 10 (hereinafter, preview images). It identifies the objects in the input preview image, estimates the posture of each identified object, and thereby recognizes each identified object. The image recognition unit 20 comprises an object identification unit 21, an initial posture estimation unit 22, and a posture tracking unit 23.

The object identification unit 21 takes as input the preview image acquired by the image acquisition unit 10 and performs identification processing on the objects in that preview image. In the identification processing, local features are detected in the preview image and compared with the local features registered in advance for each object in a feature database (dictionary) in order to identify the objects.
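The dictionary lookup described above can be sketched as a brute-force nearest-neighbor comparison. This is only an illustration under stated assumptions: descriptors are plain float tuples, a match is any nearest neighbor within `max_distance`, and only the single best-matching object is returned. The function names, both thresholds, and the matching rule are hypothetical; the text does not specify how the comparison is carried out.

```python
from math import dist
from typing import Optional

Descriptor = tuple[float, ...]


def count_matches(query: list[Descriptor], reference: list[Descriptor], max_distance: float) -> int:
    """Count query descriptors whose nearest reference descriptor lies within max_distance."""
    if not reference:
        return 0
    return sum(1 for q in query if min(dist(q, r) for r in reference) <= max_distance)


def identify_object(
    query: list[Descriptor],
    dictionary: dict[str, list[Descriptor]],
    max_distance: float,
    min_matches: int,
) -> Optional[str]:
    """Return the registered object with the most matches, or None if too few descriptors match."""
    best_id, best_count = None, 0
    for object_id, reference in dictionary.items():
        n = count_matches(query, reference, max_distance)
        if n > best_count:
            best_id, best_count = object_id, n
    return best_id if best_count >= min_matches else None
```

The match count computed here is also the kind of quantity that could serve as a recognition accuracy index based on the number of local feature matches.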

The object identification processing may instead be performed by, for example, an external server. In that case, the object identification unit 21 sends the preview image to the external server and receives the result of the identification processing from it. Because the identification processing is then outsourced, this arrangement is well suited to handling large-scale objects or a large number of objects.

Conversely, when there are only a few objects, the object identification unit 21 can be omitted from the image recognition unit 20.

The initial posture estimation unit 22 takes as input the preview image acquired by the image acquisition unit 10. For each object in the input preview image that has been identified by the object identification unit 21, it estimates the posture and uses the estimate as the initial value of the posture. The initial posture estimation unit 22 performs this estimation both when the posture tracking unit 23, described later, starts tracking the posture of an object and when the posture tracking unit 23 has stopped tracking the posture of an object.

In this embodiment, the posture of an object is represented by a posture matrix (4 rows by 4 columns) with six degrees of freedom. The posture matrix carries information indicating the relative positional relationship between the object and the imaging device that captures the preview images acquired by the image acquisition unit 10. It belongs to the three-dimensional special Euclidean group SE(3) and is expressed by a three-dimensional rotation matrix and a three-dimensional translation vector, each with three degrees of freedom. When the posture matrix is used, the relationship between the pixel coordinates of the object in the preview image and the coordinates on the object registered in advance in the initial posture estimation unit 22 can be expressed by the following Equation (1).
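The image of Equation (1) is not reproduced in this text. Under the usual pinhole camera model, and using only the symbols that the description defines (A, R, t, X, Y, Z, u, v), a form consistent with those definitions would be the following. The scale factor s and the explicit block layout of the posture matrix W are assumptions introduced here; they are not named in the text.

```latex
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
  = A \, \begin{bmatrix} R & t \end{bmatrix}
    \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix},
\qquad
W = \begin{bmatrix} R & t \\ 0^{\top} & 1 \end{bmatrix} \in SE(3)
```

Here A is a 3x3 matrix, R is a 3x3 rotation matrix, t is a 3x1 translation vector, and the 3x4 block [R t] is the top three rows of the 4x4 posture matrix W.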

数式（１）において、Ａは、撮像装置の内部パラメータを示す。撮像装置の内部パラメータは、予めカメラキャリブレーションによって求めておくことが好ましい。ただし、撮像装置の内部パラメータは、実際の値とずれていたとしても、最終的に推定した姿勢行列と打ち消し合うため、仮想情報を重畳する位置には影響しない。このため、撮像装置の内部パラメータには、一般的なカメラの内部パラメータを代用することが可能である。 In Equation (1), A indicates an internal parameter of the imaging device. It is preferable that the internal parameters of the imaging apparatus are obtained in advance by camera calibration. However, even if the internal parameters of the imaging apparatus deviate from the actual values, they cancel each other out with the estimated posture matrix, so that the position where the virtual information is superimposed is not affected. For this reason, a general camera internal parameter can be substituted for the internal parameter of the imaging apparatus.

数式（１）において、Ｒは、三次元空間内の回転を表すパラメータを示す。Ｒにおける各パラメータは、オイラー角といった表現により三パラメータで表現することが可能である。 In Equation (1), R denotes the parameters representing rotation in three-dimensional space. The parameters in R can be expressed with three parameters by using a representation such as Euler angles.

数式（１）において、ｔは、三次元空間内の平行移動を表すパラメータを示す。また、Ｘ、Ｙ、Ｚのそれぞれは、初期姿勢推定部２２に予め登録されているオブジェクト上のＸ座標、Ｙ座標、Ｚ座標のそれぞれを示す。また、ｕ、ｖは、プレビュー画像中のｕ座標およびｖ座標を示す。 In Equation (1), t denotes the parameters representing translation in three-dimensional space. X, Y, and Z denote the X, Y, and Z coordinates on the object registered in advance in the initial posture estimation unit 22, and u and v denote the u coordinate and the v coordinate in the preview image.

なお、本実施形態では、姿勢行列の推定を、画像内の自然特徴を用いて行うものとする。自然特徴とは、画像間の点対応の取得やマッチングを行うために、画像の局所領域から算出される特徴のことであり、画像内のエッジやコーナーなどの、対応付けの容易な局所領域から抽出される。自然特徴の代表例としては、ＳＩＦＴ（Scale Invariant Feature Transform）やＳＵＲＦ（Speed Up Robust Features）などの、高精度な対応付けが可能な局所特徴量があり、これらを用いて姿勢行列を算出する手法は一般に知られている。 In the present embodiment, the posture matrix is estimated using natural features in the image. A natural feature is a feature calculated from a local region of an image in order to obtain point correspondences or perform matching between images, and it is extracted from local regions that are easy to associate, such as edges and corners in the image. Typical examples of natural features are local feature descriptors that allow highly accurate association, such as SIFT (Scale Invariant Feature Transform) and SURF (Speeded Up Robust Features), and methods for calculating a posture matrix using them are generally known.

オブジェクトの姿勢は、オブジェクトや撮像装置が動くことによって、画像取得部１０により連続的に取得されるプレビュー画像中において刻々と変化する。このため、初期姿勢推定部２２には、上述のオブジェクト識別部２１と比べて処理速度が求められる。したがって、画像取得部１０は、画像処理装置１の内部に設けられる必要があり、非特許文献２に開示されているように処理負荷の小さいアルゴリズムを用いることが望ましい。 The posture of an object changes from moment to moment in the preview images continuously acquired by the image acquisition unit 10 as the object or the imaging device moves. For this reason, the initial posture estimation unit 22 is required to run faster than the object identification unit 21 described above. Therefore, the image acquisition unit 10 needs to be provided inside the image processing apparatus 1, and it is desirable to use an algorithm with a small processing load, such as the one disclosed in Non-Patent Document 2.

姿勢追跡部２３は、画像取得部１０により取得されたプレビュー画像と、初期姿勢推定部２２により推定されたオブジェクトの姿勢の初期値と、を入力とする。この姿勢追跡部２３は、入力されたプレビュー画像およびオブジェクトの姿勢の初期値に基づいて、オブジェクトの姿勢の追跡処理を行ってオブジェクトの姿勢を推定し、オブジェクトを認識する。 The posture tracking unit 23 receives as inputs the preview image acquired by the image acquisition unit 10 and the initial value of the object posture estimated by the initial posture estimation unit 22. Based on the input preview image and the initial value of the object posture, the posture tracking unit 23 performs tracking processing on the object posture, estimates the object posture, and thereby recognizes the object.

姿勢追跡部２３は、オブジェクトの姿勢の追跡に成功した場合、すなわちオブジェクトの認識に成功した場合には、認識に成功したオブジェクトの識別子（ＩＤ）と、認識に成功したオブジェクトの姿勢の推定値と、を認識結果として出力する。また、この認識結果を、画像取得部１０により取得された次フレームのプレビュー画像において追跡処理を行う際の初期値として用いる。このため、オブジェクトの姿勢の追跡に成功している間は、このオブジェクトに対して初期姿勢推定部２２による処理を行う必要がない。 When the posture tracking unit 23 succeeds in tracking the posture of an object, that is, when it succeeds in recognizing the object, it outputs the identifier (ID) of the successfully recognized object and the estimated value of that object's posture as the recognition result. This recognition result is also used as the initial value when the tracking process is performed on the preview image of the next frame acquired by the image acquisition unit 10. For this reason, while tracking of the object's posture continues to succeed, the initial posture estimation unit 22 does not need to process that object.

また、オブジェクトの姿勢の追跡に成功している間は、このオブジェクトに対する追跡処理を、画像取得部１０によりプレビュー画像が取得されるたびに行う必要がある。このため、姿勢追跡部２３には、上述の初期姿勢推定部２２と比べて処理速度が求められる。したがって、姿勢追跡部２３は、画像処理装置１の内部に設けられる必要があるとともに、オブジェクトの姿勢の追跡処理を最低でもリアルタイムで行うことができる必要があり、非特許文献２に開示されているように処理負荷の小さい姿勢追跡アルゴリズムを用いることが望ましい。 In addition, while tracking of an object's posture continues to succeed, the tracking process for that object must be performed every time a preview image is acquired by the image acquisition unit 10. For this reason, the posture tracking unit 23 is required to run faster than the initial posture estimation unit 22 described above. Therefore, the posture tracking unit 23 needs to be provided inside the image processing apparatus 1 and must be able to perform the posture tracking process at least in real time, and it is desirable to use a posture tracking algorithm with a small processing load, such as the one disclosed in Non-Patent Document 2.

非特許文献２に開示されている手法では、姿勢の初期値から線形予測によって姿勢の予測値を推定し、オブジェクト内の特徴点の移動量を、予測値からの探索によって推定する。探索は、特徴点の周囲から、局所領域の相関値が最も高い箇所を求めることで行われる。各特徴点の移動量から、上述の数式（１）を満たすように、姿勢の推定値を予測値から更新する。姿勢の追跡処理が反復してリアルタイムで行われる場合、フレーム間の特徴点の移動量は小さいため、特徴点の探索幅を限定することで、処理負荷の軽減を実現できる。 In the method disclosed in Non-Patent Document 2, a predicted value of posture is estimated by linear prediction from an initial value of posture, and a movement amount of a feature point in an object is estimated by searching from the predicted value. The search is performed by obtaining a point having the highest correlation value of the local region from around the feature point. The estimated value of the posture is updated from the predicted value so as to satisfy the above formula (1) from the movement amount of each feature point. When posture tracking processing is repeatedly performed in real time, the amount of movement of feature points between frames is small. Therefore, the processing load can be reduced by limiting the search range of feature points.

また、カメラやオブジェクトが急速に移動するなどによって、特徴点の移動量が探索幅を超えた場合には、上述の手法では姿勢の追跡に失敗する。この場合、上述の手法は、追跡に失敗したことを特徴点の探索状況によって検知することが可能である。例えば、探索に成功した（相関値が閾値以上の箇所が求められた）特徴点の数が閾値以下となった場合には、姿勢の推定値の更新を中断して、追跡に失敗したことを示す信号を出力することが可能である。追跡に失敗したことを示す信号が出力された場合、姿勢追跡部２３による追跡処理を中断し、次フレーム以降において初期姿勢推定部２２による姿勢の初期値を推定する処理を実行すればよい。 When the movement amount of a feature point exceeds the search range, for example because the camera or the object moves rapidly, the above-described method fails to track the posture. In this case, the method can detect the tracking failure from the state of the feature point search. For example, when the number of feature points for which the search succeeded (that is, for which a location with a correlation value at or above a threshold was found) falls to or below a threshold, the method can suspend updating of the estimated posture value and output a signal indicating that tracking has failed. When such a signal is output, the tracking process by the posture tracking unit 23 may be interrupted, and the process of estimating the initial value of the posture by the initial posture estimation unit 22 may be executed from the next frame onward.
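The failure check described above can be sketched as follows. This is a minimal illustration, not the method of Non-Patent Document 2: the function name and the two threshold parameters are hypothetical, and the correlation values are assumed to have already been computed by the feature point search.

```python
def tracking_failed(correlations, corr_threshold, min_matches):
    """Return True when too few feature points were found again.

    correlations: best local-region correlation value per feature point
                  (assumed to be computed elsewhere by the search step).
    corr_threshold, min_matches: hypothetical tuning values.
    """
    # A feature point counts as found when its best correlation
    # reaches the correlation threshold.
    matched = sum(1 for c in correlations if c >= corr_threshold)
    # Tracking is treated as failed when the number of found points
    # is at or below the count threshold.
    return matched <= min_matches
```

When the function returns True, the caller would stop updating the posture estimate and hand the object back to initial posture estimation from the next frame.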

なお、オブジェクトの姿勢の追跡処理をリアルタイムで行うために、追跡可能なオブジェクト数の上限を予め設定しておき、上限を超えたオブジェクトについては追跡処理を行わないようにしてもよい。これによれば、追跡するオブジェクト数が上限に達している場合には、オブジェクト識別部２１による識別処理と、初期姿勢推定部２２による姿勢の初期値を推定する処理と、を休止することになる。なお、上述の上限は、画像処理装置１の処理能力に応じて設定されることが好ましい。 In order to perform the posture tracking process in real time, an upper limit on the number of objects that can be tracked may be set in advance, and the tracking process may be skipped for objects beyond that limit. In that case, when the number of tracked objects has reached the upper limit, the identification process by the object identification unit 21 and the process of estimating the initial value of the posture by the initial posture estimation unit 22 are suspended. The upper limit is preferably set according to the processing capability of the image processing apparatus 1.

以上の画像認識部２０は、上述のオブジェクトの姿勢の推定を、オブジェクトごとに行う。オブジェクトごとの姿勢の推定処理は、互いに独立であるため並列に実施してもよいし、順番に実施してもよい。 The above image recognition unit 20 performs the above-described estimation of the posture of the object for each object. Since the posture estimation processing for each object is independent of each other, it may be performed in parallel or sequentially.

また、ＡＲ空間内に仮想情報を固定配置して重畳させる場合には、画像認識部２０は、オブジェクトの認識に加えて、基準マーカの認識も行う。オブジェクトを認識する場合と同様の処理で基準マーカを認識できる場合には、画像認識部２０は、オブジェクトと基準マーカとを区別することなく認識を行う。一方、基準マーカが、非特許文献１の手法で認識可能なＡＲマーカである場合や、非特許文献３の手法で認識可能な復元された空間である場合には、基準マーカをオブジェクトと区別して、基準マーカのみ、対応する認識手法で認識を行う。ＡＲ空間内に固定配置して重畳させる仮想情報がない場合や、そもそも基準マーカが存在しない場合には、画像認識部２０は、オブジェクトの認識のみ行う。 When virtual information is fixedly arranged and superimposed in the AR space, the image recognition unit 20 recognizes a reference marker in addition to recognizing an object. When the reference marker can be recognized by the same processing as that for recognizing the object, the image recognition unit 20 performs recognition without distinguishing between the object and the reference marker. On the other hand, when the reference marker is an AR marker that can be recognized by the method of Non-Patent Document 1 or when it is a restored space that can be recognized by the method of Non-Patent Document 3, the reference marker is distinguished from an object. Only the reference marker is recognized by the corresponding recognition method. When there is no virtual information that is fixedly arranged and superimposed in the AR space, or when there is no reference marker in the first place, the image recognition unit 20 performs only object recognition.

いずれにせよ、画像認識部２０が行うことは、オブジェクト（存在する場合には基準マーカも）の姿勢の推定である。なお、基準マーカの有無、基準マーカの種類、および姿勢の推定に用いる認識手法は、上述の手法に限定されるものではない。 In any case, what the image recognition unit 20 does is estimate the posture of each object (and of the reference marker, if one exists). The presence or absence of a reference marker, the type of reference marker, and the recognition method used for posture estimation are not limited to those described above.

［オブジェクト関係推定部３０の構成および動作］ [Configuration and Operation of Object Relationship Estimation Unit 30]
オブジェクト関係推定部３０は、画像認識部２０による認識結果を入力とする。このオブジェクト関係推定部３０は、画像認識部２０により認識されたオブジェクト間の関係性を推定し、推定結果に基づいてオブジェクトを分類する。また、認識に失敗したオブジェクトについては、このオブジェクトと同一のグループに分類したオブジェクトのうちの１つである主要オブジェクトの画像認識部２０による認識結果に基づいて、認識する。このオブジェクト関係推定部３０は、オブジェクト関係推定処理部３１および姿勢変換処理部３２を備える。 The object relationship estimation unit 30 receives the recognition result from the image recognition unit 20 as an input. The object relationship estimation unit 30 estimates the relationship between the objects recognized by the image recognition unit 20 and classifies the objects based on the estimation result. An object that the image recognition unit 20 failed to recognize is recognized based on the recognition result obtained by the image recognition unit 20 for the main object, which is one of the objects classified into the same group as that object. The object relationship estimation unit 30 includes an object relationship estimation processing unit 31 and a posture conversion processing unit 32.

オブジェクト関係推定処理部３１は、画像認識部２０による認識結果を入力とする。このオブジェクト関係推定処理部３１は、画像認識部２０により認識されたオブジェクト間の関係性を推定して分類する。具体的には、画像認識部２０により認識されたオブジェクト間の相対姿勢を算出し、相対姿勢の変動に応じて各オブジェクトを分類する。なお、本実施形態では、相対姿勢も上述の姿勢行列で表現するものとする。このオブジェクト関係推定処理部３１について、以下に詳述する。 The object relationship estimation processing unit 31 receives the recognition result from the image recognition unit 20 as an input. The object relationship estimation processing unit 31 estimates and classifies the relationship between objects recognized by the image recognition unit 20. Specifically, a relative posture between the objects recognized by the image recognition unit 20 is calculated, and each object is classified according to a change in the relative posture. In the present embodiment, the relative posture is also expressed by the above-described posture matrix. The object relationship estimation processing unit 31 will be described in detail below.

オブジェクト関係推定処理部３１は、画像認識部２０により認識されたオブジェクトの数が２つ以上である場合に動作する。ここで、例えば、第１の視点におけるオブジェクトＡの姿勢行列のことを姿勢行列Ｗ_Ａ１とし、第１の視点におけるオブジェクトＢの姿勢行列のことを姿勢行列Ｗ_Ｂ１とする。すると、オブジェクトＡとオブジェクトＢとの間の相対姿勢Ｗ_ＡＢは、以下の数式（２）により求めることができる。 The object relationship estimation processing unit 31 operates when the image recognition unit 20 has recognized two or more objects. Here, for example, let the posture matrix of object A at the first viewpoint be W_A1 and the posture matrix of object B at the first viewpoint be W_B1. Then the relative posture W_AB between object A and object B can be obtained by the following Equation (2).

ここで、仮に、カメラが移動して第１の視点から第２の視点に移動している間、オブジェクトＡ、Ｂの双方が固定配置されていて移動しない場合には、以下の数式（３）に示すように相対姿勢Ｗ_ＡＢは変動しない。また、オブジェクトＡ、Ｂの双方が同一の剛体オブジェクトに固定配置されている場合にも、剛体オブジェクトが動いても、相対姿勢Ｗ_ＡＢは変動しない。 Here, if both objects A and B are fixed in place and do not move while the camera moves from the first viewpoint to the second viewpoint, the relative posture W_AB does not change, as shown in the following Equation (3). Likewise, when both objects A and B are fixed to the same rigid object, the relative posture W_AB does not change even if that rigid object moves.
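Equations (2) and (3) are not reproduced in this text. The following sketch illustrates the invariance they describe under one assumed convention: each posture matrix maps object coordinates to camera coordinates, and the relative posture is taken as W_AB = W_A1^-1 W_B1. That convention, the helper names, and the numeric values are assumptions for illustration; the patent's own formula may order the factors differently.

```python
import math

def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def rigid_inverse(w):
    """Invert a 4x4 rigid transform [R|t] as [R^T | -R^T t]."""
    rt = [[w[j][i] for j in range(3)] for i in range(3)]
    t = [-sum(rt[i][k] * w[k][3] for k in range(3)) for i in range(3)]
    return [rt[i] + [t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def pose(yaw, tx, ty, tz):
    """Build a posture matrix: rotation about the z axis plus a translation."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0, tx], [s, c, 0.0, ty],
            [0.0, 0.0, 1.0, tz], [0.0, 0.0, 0.0, 1.0]]

def relative_pose(w_a, w_b):
    """Assumed convention: W_AB = W_A^-1 * W_B."""
    return matmul(rigid_inverse(w_a), w_b)

# Postures of two static objects seen from the first viewpoint.
w_a1, w_b1 = pose(0.1, 0.0, 0.0, 2.0), pose(0.5, 1.0, 0.0, 3.0)
# Moving the camera applies the same transform to both postures.
camera_move = pose(-0.3, 0.2, -0.1, 0.4)
w_a2, w_b2 = matmul(camera_move, w_a1), matmul(camera_move, w_b1)

rel1 = relative_pose(w_a1, w_b1)
rel2 = relative_pose(w_a2, w_b2)
# Largest element-wise difference between the two relative postures.
max_diff = max(abs(rel1[i][j] - rel2[i][j])
               for i in range(4) for j in range(4))
```

Under this convention the camera motion cancels out, so `max_diff` is zero up to floating-point rounding, which matches the statement that the relative posture of two fixed objects does not change between viewpoints.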

そこで、オブジェクト関係推定処理部３１は、まず、画像取得部１０により取得されたプレビュー画像がフレームごとに、画像認識部２０により認識されたオブジェクト間の相対姿勢を推定する。次に、オブジェクト間ごとに、相対姿勢のフレーム間での変化量を相対姿勢変化量として求め、相対姿勢変化量が閾値β未満であるオブジェクト間について、相対姿勢が１フレーム安定したと判定する。次に、αフレーム以上に亘って連続して相対姿勢の安定しているオブジェクト間について、これらオブジェクトを同一のグループに分類する。 Therefore, for each frame of the preview image acquired by the image acquisition unit 10, the object relationship estimation processing unit 31 first estimates the relative posture between the objects recognized by the image recognition unit 20. Next, for each pair of objects, it obtains the change in relative posture between frames as the relative posture change amount, and for each pair whose relative posture change amount is less than the threshold β, it determines that the relative posture has been stable for one frame. Then, for each pair of objects whose relative posture has been stable continuously for α or more frames, it classifies those objects into the same group.

なお、上述の相対姿勢変化量は、以下のようにして求められる。前フレームで相対姿勢が安定していないと判定した場合には、前フレームにおいて推定した相対姿勢から、現フレームにおいて推定した相対姿勢までの変化量を、上述の相対姿勢変化量として求める。一方、ｎフレーム（ただし、ｎは、ｎ＞１を満たす任意の整数）に亘って連続して相対姿勢が安定していると判定している場合には、ｎフレーム前から前フレームまでの間に推定した相対姿勢の平均（平均相対姿勢）を求め、この平均相対姿勢と、現フレームにおいて推定した相対姿勢と、の差分を、上述の相対姿勢変化量として求める。 The above-described relative posture change amount is obtained as follows. When it is determined that the relative posture is not stable in the previous frame, a change amount from the relative posture estimated in the previous frame to the relative posture estimated in the current frame is obtained as the above-described relative posture change amount. On the other hand, when it is determined that the relative posture is continuously stable over n frames (where n is an arbitrary integer satisfying n> 1), the interval between n frames before and the previous frame is determined. The average of the estimated relative postures (average relative posture) is obtained, and the difference between the average relative posture and the relative posture estimated in the current frame is obtained as the above-described relative posture change amount.

また、上述の相対姿勢変化量は、並進と回転とに分けて、それぞれ独立した閾値と比較してもよい。 The relative posture change amount described above may be divided into translation and rotation, and may be compared with independent threshold values.
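The text does not specify how the change amount is measured. As one possible sketch, the translation change can be taken as the Euclidean distance between the two translation vectors and the rotation change as the angle of the rotation that separates the two rotation matrices. Both metrics, the function names, and the thresholds `beta_t` and `beta_r` are assumptions made here for illustration.

```python
import math

def pose_change(w_prev, w_curr):
    """Return (translation_change, rotation_change_in_radians)
    between two 4x4 posture matrices given as nested lists."""
    # Euclidean distance between the translation columns.
    dt = math.sqrt(sum((w_curr[i][3] - w_prev[i][3]) ** 2 for i in range(3)))
    # trace(R_prev^T * R_curr) equals the sum of element-wise products.
    trace = sum(w_prev[i][j] * w_curr[i][j]
                for i in range(3) for j in range(3))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return dt, math.acos(cos_angle)

def is_stable(w_prev, w_curr, beta_t, beta_r):
    """Compare translation and rotation changes with separate thresholds."""
    dt, dr = pose_change(w_prev, w_curr)
    return dt < beta_t and dr < beta_r
```

Here `w_prev` stands for either the relative posture of the previous frame or the average relative posture, depending on which case described above applies; computing that average is outside this sketch.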

また、オブジェクト関係推定処理部３１が動作を開始した段階、すなわちグループ分けの初期状態では、画像認識部２０により認識された全てのオブジェクトは、それぞれ別々のグループに属している状態（オブジェクト数＝グループ数）となる。また、同一のグループには、３つ以上のオブジェクトを分類することが可能であり、例えばオブジェクトＡとオブジェクトＢとの間の相対姿勢Ｗ_ＡＢと、オブジェクトＡとオブジェクトＣとの間の相対姿勢Ｗ_ＡＣと、がそれぞれαフレーム以上に亘って連続して安定している場合には、オブジェクトＡ、Ｂ、Ｃの３つを同一のグループに分類することができる。 At the stage when the object relationship estimation processing unit 31 starts operating, that is, in the initial state of grouping, every object recognized by the image recognition unit 20 belongs to its own separate group (number of objects = number of groups). Three or more objects can be classified into the same group. For example, when the relative posture W_AB between object A and object B and the relative posture W_AC between object A and object C have each been stable continuously for α or more frames, the three objects A, B, and C can be classified into the same group.

また、同一のグループに分類されていたオブジェクト間の相対姿勢について、安定していないと判定した場合には、これらオブジェクトのグループ化を解除して、これらオブジェクトを別々のグループに分ける。オブジェクトのグループ化を解除する場合としては、例えば、これらオブジェクトのうち少なくともいずれかが動いた場合が想定される。 Further, when it is determined that the relative posture between the objects classified into the same group is not stable, the grouping of these objects is canceled and these objects are divided into different groups. As a case where the grouping of objects is canceled, for example, a case where at least one of these objects moves is assumed.

また、オクルージョンや光の反射などによって、このオブジェクトの認識に一時的に失敗してしまうことがある。このような場合には、オブジェクト間の相対姿勢を推定することができないが、同一のグループに分類されているオブジェクト間の相対姿勢は、変化していないことが想定されるため、グループ化の解除は行わない。 Recognition of an object may also fail temporarily because of occlusion, light reflection, or the like. In such a case the relative posture between objects cannot be estimated, but because the relative posture between objects classified into the same group is assumed not to have changed, the grouping is not canceled.

ただし、オブジェクト間の相対姿勢を推定できないフレームが連続して発生し続けると、相対姿勢が変化していないこと（相対姿勢の不変性）を保証できなくなってしまう。また、オブジェクト間の相対姿勢を推定できない原因がオクルージョンや光の反射などの一時的なものである場合には、この原因が解消して、オブジェクト間の相対姿勢の推定を再開できると考えられる。そこで、θフレーム以上に亘って連続してオブジェクト間の相対姿勢を推定できない場合には、グループ化を解除する。 However, if frames in which the relative posture between objects cannot be estimated continue to occur, it cannot be guaranteed that the relative posture has not changed (relative posture invariance). In addition, when the reason why the relative posture between objects cannot be estimated is a temporary cause such as occlusion or light reflection, it is considered that this cause can be resolved and the estimation of the relative posture between objects can be resumed. Therefore, when the relative posture between the objects cannot be estimated continuously over the θ frame or more, the grouping is canceled.

ここで、オブジェクト間の相対姿勢を推定できないのは、認識できなかったオブジェクトとの間の相対姿勢である。すなわち、認識できていないオブジェクトについては、このオブジェクトと他のオブジェクトとの間の相対姿勢を推定できない。このため、θフレーム以上に亘って連続して認識できていないオブジェクトについては、他のオブジェクトとの間の相対姿勢をθフレーム以上に亘って連続して推定できないことになり、グループ化が解除されることになる。 Here, the relative postures that cannot be estimated are those involving an object that could not be recognized. That is, for an unrecognized object, the relative posture between that object and any other object cannot be estimated. Consequently, for an object that has not been recognized for θ or more consecutive frames, its relative posture with respect to other objects cannot be estimated for θ or more consecutive frames, and its grouping is canceled.
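The grouping rules described above (group after α consecutive stable frames, ungroup when the relative posture becomes unstable, and ungroup after θ consecutive frames without an estimate) can be sketched as a small per-pair state machine. The class and attribute names are hypothetical, and details the text leaves open, such as resetting the stability count when a pair is ungrouped after θ missing frames, are assumptions.

```python
class PairState:
    """Tracks grouping state for one pair of objects."""

    def __init__(self, alpha, beta, theta):
        self.alpha, self.beta, self.theta = alpha, beta, theta
        self.stable_frames = 0    # consecutive frames with change < beta
        self.missing_frames = 0   # consecutive frames without an estimate
        self.grouped = False

    def update(self, change):
        """change: relative posture change amount for this frame,
        or None when the relative posture could not be estimated."""
        if change is None:
            # Temporary failures keep the grouping until theta is reached.
            self.missing_frames += 1
            if self.missing_frames >= self.theta:
                self.grouped = False
                self.stable_frames = 0
            return self.grouped
        self.missing_frames = 0
        if change < self.beta:
            self.stable_frames += 1
            if self.stable_frames >= self.alpha:
                self.grouped = True
        else:
            # Relative posture moved: cancel the grouping.
            self.stable_frames = 0
            self.grouped = False
        return self.grouped
```

For example, with α = 3 and θ = 2, three stable frames group the pair, a single missed frame leaves it grouped, and a second consecutive missed frame cancels the grouping.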

姿勢変換処理部３２は、画像認識部２０による認識結果と、オブジェクト関係推定処理部３１による分類結果および相対姿勢の推定結果と、を入力とする。この姿勢変換処理部３２は、画像認識部２０による認識に失敗したオブジェクトについて、同一のグループに分類されている他のオブジェクトについての画像認識部２０による認識結果を、この他のオブジェクトとの間の相対姿勢を用いて認識する。姿勢変換処理部３２について、以下に詳述する。 The posture conversion processing unit 32 receives as inputs the recognition result from the image recognition unit 20 and the classification result and relative posture estimation result from the object relationship estimation processing unit 31. For an object that the image recognition unit 20 failed to recognize, the posture conversion processing unit 32 recognizes that object by applying the relative posture between it and another object classified into the same group to the recognition result that the image recognition unit 20 obtained for that other object. The posture conversion processing unit 32 is described in detail below.

姿勢変換処理部３２は、同一のグループに分類されているオブジェクトが１組以上存在している場合、すなわち２つ以上のオブジェクトが同一のグループに分類されている場合に、動作する。ここで、例えば、オブジェクトＥの認識には画像認識部２０が失敗しているとともに、オブジェクトＥと同一のグループに分類されているオブジェクトＦの認識には画像認識部２０が成功しており、オブジェクトＥとオブジェクトＦとの間の相対姿勢Ｗ_ＥＦが推定されているものとする。また、画像認識部２０による成功した認識により、第１の視点におけるオブジェクトＦの姿勢行列Ｗ_Ｆ１が得られているものとする。すると、オブジェクトＦを上述の主要オブジェクトとして、第１の視点におけるオブジェクトＥの姿勢行列Ｗ_Ｅ１を、以下の数式（４）により求めることができる。 The posture conversion processing unit 32 operates when there is at least one set of objects classified into the same group, that is, when two or more objects are classified into the same group. Here, for example, suppose that the image recognition unit 20 has failed to recognize object E but has succeeded in recognizing object F, which is classified into the same group as object E, and that the relative posture W_EF between object E and object F has been estimated. Suppose also that this successful recognition by the image recognition unit 20 has yielded the posture matrix W_F1 of object F at the first viewpoint. Then, with object F as the main object described above, the posture matrix W_E1 of object E at the first viewpoint can be obtained by the following Equation (4).
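Equation (4) is not reproduced in this text. As a hedged illustration only: if the relative posture is defined as W_EF = W_E1^{-1} W_F1 (an assumed convention in which each posture matrix maps object coordinates to camera coordinates), then the posture of object E follows by solving that definition for W_E1. The patent's own formula may use a different ordering of the factors.

```latex
W_{EF} = W_{E1}^{-1} W_{F1}
\quad\Longrightarrow\quad
W_{E1} = W_{F1}\, W_{EF}^{-1}
```

The step uses only the fact that posture matrices in SE(3) are invertible: multiplying the definition on the left by W_{E1} and on the right by W_{EF}^{-1} isolates W_{E1}.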

［仮想情報表示部７０の構成および動作］ [Configuration and Operation of Virtual Information Display Unit 70]
仮想情報表示部７０は、画像取得部１０により取得されたプレビュー画像と、画像認識部２０による認識結果と、オブジェクト関係推定部３０による認識結果と、を入力とする。この仮想情報表示部７０は、プレビュー画像に、画像認識部２０およびオブジェクト関係推定部３０による認識結果に基づいて仮想情報を重畳させる。なお、仮想情報を重畳させる際に、仮想情報表示部７０は、撮像装置の内部パラメータ行列（画角といった情報を含む）と、重畳させる仮想情報が紐付けられているオブジェクトの姿勢行列と、を用いて、３Ｄレンダリングによって対応する位置にこの仮想情報を重畳させる。また、仮想情報を重畳させる際に、仮想情報表示部７０は、統合認識結果に基づいて仮想情報の位置や向きを補正する。 The virtual information display unit 70 receives as inputs the preview image acquired by the image acquisition unit 10, the recognition result from the image recognition unit 20, and the recognition result from the object relationship estimation unit 30. The virtual information display unit 70 superimposes virtual information on the preview image based on the recognition results from the image recognition unit 20 and the object relationship estimation unit 30. When superimposing virtual information, the virtual information display unit 70 uses the internal parameter matrix of the imaging device (which includes information such as the angle of view) and the posture matrix of the object with which the virtual information is associated, and superimposes the virtual information at the corresponding position by 3D rendering. In doing so, the virtual information display unit 70 also corrects the position and orientation of the virtual information based on the integrated recognition result.

なお、仮想情報表示部７０は、有線ケーブルや無線ネットワークを介して自端末と接続された外部モニタや、自端末に搭載されているディスプレイ（網膜投影型を含む）や、プロジェクタなどの、映像をユーザに掲示するための表示装置を制御するものである。この表示装置が、例えば、光学シースルー型のＨＭＤや、プロジェクタを用いて視界に直接付加情報を重畳するものである場合には、プレビュー画像は表示させず、仮想情報のみを表示させることとしてもよい。 The virtual information display unit 70 controls a display device that presents video to the user, such as an external monitor connected to the terminal via a wired cable or a wireless network, a display mounted on the terminal (including a retinal projection type), or a projector. When the display device superimposes additional information directly on the field of view, as with an optical see-through HMD or a projector, only the virtual information may be displayed without displaying the preview image.

［画像処理装置１の動作］ [Operation of Image Processing Apparatus 1]
以上の構成を備える画像処理装置１の動作について、図６から１０を用いて以下に説明する。 The operation of the image processing apparatus 1 having the above configuration will be described below with reference to FIGS. 6 to 10.

図６は、画像処理装置１のフローチャートである。 FIG. 6 is a flowchart of the image processing apparatus 1.

ステップＳ１００において、画像処理装置１は、画像取得部１０によりプレビュー画像を取得し、ステップＳ１０１に処理を移す。 In step S100, the image processing apparatus 1 acquires a preview image by the image acquisition unit 10, and proceeds to step S101.

ステップＳ１０１において、画像処理装置１は、画像認識部２０およびオブジェクト関係推定処理部３１により第１の画像認識処理を行って、ステップＳ１００で取得したプレビュー画像内の各オブジェクトを認識し、ステップＳ１０２に処理を移す。なお、第１の画像認識処理の詳細については、図７を用いて後述する。 In step S101, the image processing apparatus 1 performs a first image recognition process using the image recognition unit 20 and the object relationship estimation processing unit 31, recognizes each object in the preview image acquired in step S100, and proceeds to step S102. Move processing. Details of the first image recognition process will be described later with reference to FIG.

ステップＳ１０２において、画像処理装置１は、オブジェクト関係推定処理部３１により、ステップＳ１０１において現フレームで認識した全てのオブジェクトの中から２つを選択し、ステップＳ１０３に処理を移す。 In step S102, the image processing apparatus 1 causes the object relationship estimation processing unit 31 to select two objects from all the objects recognized in the current frame in step S101, and moves the process to step S103.

ステップＳ１０３において、画像処理装置１は、オブジェクト関係推定処理部３１により、ステップＳ１０２またはステップＳ１０７で選択した２つのオブジェクトに対する相対姿勢連続推定回数カウンタのカウンタ値がゼロであるか否かを判別する。ゼロであると判別した場合には、ステップＳ１０４に処理を移し、ゼロではないと判別した場合には、ステップＳ１０５に処理を移す。なお、相対姿勢連続推定回数カウンタは、ステップＳ１０２またはステップＳ１０７で選択した２つのオブジェクトごとに設けられ、２つのオブジェクトごとに、何フレームに亘って連続して相対姿勢を求めたかを計数するためのものである。 In step S103, the image processing apparatus 1 uses the object relationship estimation processing unit 31 to determine whether the value of the relative posture continuous estimation count counter for the two objects selected in step S102 or step S107 is zero. If the value is determined to be zero, the process proceeds to step S104; if it is determined not to be zero, the process proceeds to step S105. The relative posture continuous estimation count counter is provided for each pair of objects selected in step S102 or step S107 and counts, for each pair, the number of consecutive frames over which the relative posture has been obtained.

ステップＳ１０４において、画像処理装置１は、オブジェクト関係推定処理部３１により、相対姿勢算出処理を行って、ステップＳ１０６に処理を移す。なお、相対姿勢算出処理の詳細については、図８を用いて後述する。 In step S104, the image processing apparatus 1 performs the relative posture calculation process using the object relationship estimation processing unit 31 and proceeds to step S106. The details of the relative posture calculation process will be described later with reference to FIG. 8.

ステップＳ１０５において、画像処理装置１は、オブジェクト関係推定処理部３１により、分類処理を行って、ステップＳ１０６に処理を移す。なお、分類処理の詳細については、図９を用いて後述する。 In step S105, the image processing apparatus 1 performs a classification process by the object relationship estimation processing unit 31, and moves the process to step S106. Details of the classification process will be described later with reference to FIG.

ステップＳ１０６において、画像処理装置１は、オブジェクト関係推定処理部３１により、ステップＳ１０１において現フレームで認識した全てのオブジェクトについて、２つで１つの組として、全ての組み合わせを現フレームで選択したか否かを判別する。選択したと判別した場合には、ステップＳ１０８に処理を移し、選択していないと判別した場合には、ステップＳ１０７に処理を移す。 In step S106, the image processing apparatus 1 uses the object relationship estimation processing unit 31 to determine whether all the combinations recognized in the current frame in step S101 are selected as one set of two in the current frame. Is determined. If it is determined that it has been selected, the process proceeds to step S108. If it is determined that it has not been selected, the process proceeds to step S107.

ステップＳ１０７において、画像処理装置１は、オブジェクト関係推定処理部３１により、ステップＳ１０１において現フレームで認識した全てのオブジェクトのうち、選択していない組み合わせを構成する２つのオブジェクトを選択し、ステップＳ１０３に処理を戻す。 In step S107, the image processing apparatus 1 uses the object relationship estimation processing unit 31 to select two objects constituting a combination that has not been selected among all objects recognized in the current frame in step S101, and then proceeds to step S103. Return processing.

ステップＳ１０８において、画像処理装置１は、姿勢変換処理部３２および仮想情報表示部７０により、第１の重畳表示処理を行って、図６に示した処理を終了する。なお、第１の重畳表示処理の詳細については、図１０を用いて後述する。 In step S108, the image processing apparatus 1 performs the first superimposed display process using the posture conversion processing unit 32 and the virtual information display unit 70, and ends the process shown in FIG. 6. Details of the first superimposed display process will be described later with reference to FIG. 10.
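The pairwise loop of steps S102 to S107 can be sketched as follows. The function and parameter names are hypothetical: `estimate_counts` stands in for the relative posture continuous estimation count counters, and the two callbacks stand in for the relative posture calculation process (S104) and the classification process (S105), whose details lie outside this sketch.

```python
from itertools import combinations

def process_pairs(recognized_ids, estimate_counts, calc_relative_pose, classify):
    """Visit every pair of objects recognized in the current frame.

    recognized_ids: ids of objects recognized in step S101.
    estimate_counts: {(id_a, id_b): consecutive estimation count}.
    calc_relative_pose, classify: callbacks taking one pair.
    """
    # combinations() enumerates each unordered pair exactly once,
    # which covers the selection in S102/S107 and the check in S106.
    for pair in combinations(sorted(recognized_ids), 2):
        if estimate_counts.get(pair, 0) == 0:   # S103: counter is zero
            calc_relative_pose(pair)            # S104
        else:
            classify(pair)                      # S105
```

Sorting the ids keeps the pair keys in a fixed order, so the same pair always maps to the same counter entry from frame to frame.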

図７は、画像処理装置１が行う上述の第１の画像認識処理のフローチャートである。 FIG. 7 is a flowchart of the first image recognition process described above performed by the image processing apparatus 1.

ステップＳ１１０において、画像処理装置１は、姿勢追跡部２３により、ステップＳ１００で取得したプレビュー画像中に、前フレームで認識したオブジェクトが含まれているか否かを判別する。含まれていると判別した場合には、ステップＳ１１１に処理を移し、含まれていないと判別した場合には、ステップＳ１１６に処理を移す。 In step S110, the image processing apparatus 1 uses the posture tracking unit 23 to determine whether or not the preview image acquired in step S100 includes the object recognized in the previous frame. If it is determined that it is included, the process proceeds to step S111. If it is determined that it is not included, the process proceeds to step S116.

ステップＳ１１１において、画像処理装置１は、姿勢追跡部２３により、ステップＳ１１０において前フレームで認識したと判別した各オブジェクトについて、前フレームでの姿勢を初期値として姿勢の追跡処理を行って認識し、ステップＳ１１２に処理を移す。 In step S111, the image processing apparatus 1 recognizes each object determined by the posture tracking unit 23 as having been recognized in the previous frame in step S110 by performing posture tracking processing using the posture in the previous frame as an initial value, The process moves to step S112.

ステップＳ１１２において、画像処理装置１は、オブジェクト関係推定処理部３１により、ステップＳ１１１において現フレームで姿勢を追跡できた各オブジェクトについて、対応する姿勢追跡連続失敗回数カウンタをリセットし、ステップＳ１１３に処理を移す。これによれば、画像取得部１０により取得されたプレビュー画像に含まれている全てのオブジェクトのうち、現フレームで認識できたオブジェクトに対する姿勢追跡連続失敗回数カウンタが、リセットされることになる。なお、姿勢追跡連続失敗回数カウンタは、オブジェクトごとに設けられ、オブジェクトごとに、何フレームに亘って連続して姿勢の追跡に失敗したかを計数するためのものである。 In step S112, the image processing apparatus 1 causes the object relationship estimation processing unit 31 to reset the corresponding posture tracking consecutive failure number counter for each object whose posture has been tracked in the current frame in step S111, and performs the process in step S113. Move. According to this, the posture tracking consecutive failure number counter for the object recognized in the current frame among all the objects included in the preview image acquired by the image acquisition unit 10 is reset. The posture tracking continuous failure count counter is provided for each object, and counts the number of frames for which posture tracking has failed continuously for each object.

ステップＳ１１３において、画像処理装置１は、オブジェクト関係推定処理部３１により、ステップＳ１１１において現フレームで姿勢を追跡できなかった各オブジェクトについて、対応する姿勢追跡連続失敗回数カウンタをインクリメントし、ステップＳ１１４に処理を移す。これによれば、画像取得部１０により取得されたプレビュー画像に含まれている全てのオブジェクトのうち、現フレームで認識できなかったオブジェクトに対する姿勢追跡連続失敗回数カウンタが、１だけ加算されることになる。 In step S113, the image processing apparatus 1 causes the object relationship estimation processing unit 31 to increment the corresponding posture tracking consecutive failure number counter for each object whose posture could not be tracked in the current frame in step S111, and the process proceeds to step S114. Move. According to this, among all the objects included in the preview image acquired by the image acquisition unit 10, the posture tracking continuous failure number counter for the object that could not be recognized in the current frame is incremented by one. Become.

ステップＳ１１４において、画像処理装置１は、オブジェクト関係推定処理部３１により、姿勢追跡連続失敗回数カウンタの値が閾値θ以上であるオブジェクトについて、姿勢追跡連続失敗回数カウンタをリセットするとともに、グループ化を解除して、ステップＳ１１５に処理を移す。これによれば、画像取得部１０により取得されたプレビュー画像に含まれている全てのオブジェクトのうち、θフレーム以上に亘って連続して認識できなかったオブジェクトについて、姿勢追跡連続失敗回数カウンタがリセットされるとともに、グループ化が解除されることになる。 In step S114, the image processing apparatus 1 causes the object relationship estimation processing unit 31 to reset the posture tracking continuous failure count counter and cancel grouping for objects whose posture tracking continuous failure count counter is equal to or greater than the threshold θ. Then, the process proceeds to step S115. According to this, among the all objects included in the preview image acquired by the image acquisition unit 10, the posture tracking consecutive failure counter is reset for objects that could not be recognized continuously over the θ frame or more. At the same time, the grouping is canceled.
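Steps S112 to S114 amount to a per-object counter update, which can be sketched as follows. The function name and data layout are hypothetical; the counters are assumed to be held in a dictionary keyed by object id, and the caller is assumed to cancel the grouping of every id that is returned.

```python
def update_failure_counters(counters, tracked_ok, theta):
    """Update posture tracking consecutive failure counters for one frame.

    counters: {object_id: consecutive tracking failures}, updated in place.
    tracked_ok: set of object ids whose posture was tracked this frame.
    theta: failure threshold in frames.
    Returns the ids whose grouping should be canceled.
    """
    ungroup = []
    for obj_id in counters:
        if obj_id in tracked_ok:
            counters[obj_id] = 0        # S112: reset on success
        else:
            counters[obj_id] += 1       # S113: increment on failure
        if counters[obj_id] >= theta:   # S114: too many failures in a row
            counters[obj_id] = 0
            ungroup.append(obj_id)
    return ungroup
```

With θ = 2, an object that fails tracking in two consecutive frames is returned for ungrouping and its counter starts again from zero.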

ステップＳ１１５において、画像処理装置１は、姿勢追跡部２３により、追跡中のオブジェクトの数が、予め定められた上限値に達したか否かを判別する。達した場合には、図７に示した処理を終了し、達していない場合には、ステップＳ１１６に処理を移す。 In step S115, the image processing apparatus 1 determines whether or not the number of objects being tracked has reached a predetermined upper limit value by the posture tracking unit 23. If it has reached, the process shown in FIG. 7 is terminated. If it has not reached, the process proceeds to step S116.

ステップＳ１１６において、画像処理装置１は、オブジェクト識別部２１により、ステップＳ１００で取得したプレビュー画像内のオブジェクトを識別し、ステップＳ１１７に処理を移す。 In step S116, the image processing apparatus 1 uses the object identification unit 21 to identify the object in the preview image acquired in step S100, and the process proceeds to step S117.

ステップＳ１１７において、画像処理装置１は、初期姿勢推定部２２により、ステップＳ１１０で取得したプレビュー画像に含まれるステップＳ１１６で識別したオブジェクトのうち、前フレームで認識していないオブジェクトのそれぞれと、現フレームで姿勢の追跡に失敗したオブジェクトのそれぞれと、について、姿勢の初期値を求めて認識し、ステップＳ１１８に処理を移す。 In step S117, the image processing apparatus 1 uses the initial posture estimation unit 22 to obtain an initial posture value for, and thereby recognize, each object that was identified in step S116 in the preview image acquired in step S110 and that either was not recognized in the previous frame or failed posture tracking in the current frame, and moves the process to step S118.

ステップＳ１１８において、画像処理装置１は、オブジェクト関係推定処理部３１により、ステップＳ１１７で姿勢の初期値を求めた各オブジェクトに対して姿勢追跡連続失敗回数カウンタを新たに設け、新たに設けた姿勢追跡連続失敗回数カウンタをリセットし、図７に示した処理を終了する。 In step S118, the image processing apparatus 1 uses the object relationship estimation processing unit 31 to newly provide a posture tracking consecutive failure counter for each object whose initial posture value was obtained in step S117, resets the newly provided counters, and ends the process shown in FIG. 7.

図８は、画像処理装置１が行う上述の第１の相対姿勢算出処理のフローチャートである。 FIG. 8 is a flowchart of the above-described first relative posture calculation process performed by the image processing apparatus 1.

ステップＳ１２０において、画像処理装置１は、オブジェクト関係推定処理部３１により、ステップＳ１０２またはステップＳ１０７で選択した２つのオブジェクト間の、現フレームにおける相対姿勢を求め、図８に示した処理を終了する。 In step S120, the image processing apparatus 1 obtains the relative posture in the current frame between the two objects selected in step S102 or step S107 by the object relationship estimation processing unit 31, and ends the process shown in FIG.

図９は、画像処理装置１が行う上述の分類処理のフローチャートである。 FIG. 9 is a flowchart of the above-described classification process performed by the image processing apparatus 1.

ステップＳ１３０において、画像処理装置１は、オブジェクト関係推定処理部３１により、ステップＳ１０２またはステップＳ１０７で選択した２つのオブジェクト間の、現フレームにおける相対姿勢を求め、ステップＳ１３１に処理を移す。 In step S130, the image processing apparatus 1 obtains the relative posture in the current frame between the two objects selected in step S102 or step S107 by the object relationship estimation processing unit 31, and moves the process to step S131.

ステップＳ１３１において、画像処理装置１は、オブジェクト関係推定処理部３１により、ステップＳ１０２またはステップＳ１０７で選択した２つのオブジェクト間の相対姿勢について、相対姿勢連続推定回数カウンタのカウンタ値を「１」にインクリメントしたフレーム（以降、平均相対姿勢推定開始フレームとする）から前フレームまでの各フレームで推定したものの平均値を、平均相対姿勢として求め、ステップＳ１３２に処理を移す。 In step S131, the image processing apparatus 1 uses the object relationship estimation processing unit 31 to obtain, as the average relative posture between the two objects selected in step S102 or step S107, the average of the relative postures estimated in each frame from the frame in which the relative posture continuous estimation number counter was incremented to "1" (hereinafter, the average relative posture estimation start frame) through the previous frame, and moves the process to step S132.

なお、平均相対姿勢は、前フレームにおいて計算した平均相対姿勢（平均相対姿勢推定開始フレームから２フレーム前のフレームまでの平均相対姿勢）と、前フレームにおいて計算した相対姿勢Ｗ_Ｎ−１と、から加重平均として求めることができる。前フレームにおいて計算した平均相対姿勢は、以下の数式（５）により求めることができ、加重平均は、以下の数式（６）により求めることができる。上述の相対姿勢Ｗ_Ｎ−１や数式（５）、（６）において、Ｎは、平均相対姿勢を推定する際の相対姿勢連続推定回数カウンタのカウンタ値を表すものとする。 The average relative posture can be obtained as a weighted average of the average relative posture calculated in the previous frame (the average relative posture from the average relative posture estimation start frame through the frame two frames before) and the relative posture W_N−1 calculated in the previous frame. The average relative posture calculated in the previous frame can be obtained by equation (5) below, and the weighted average by equation (6) below. In the relative posture W_N−1 and in equations (5) and (6), N denotes the value of the relative posture continuous estimation number counter at the time the average relative posture is estimated.
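
Equations (5) and (6) are not reproduced in this text, so the following is only a sketch of the general idea of an incrementally updated mean, under stated assumptions: the relative posture is treated as a flat list of numeric parameters averaged element-wise, and the function name `update_average` is hypothetical. How the source actually parameterizes and averages postures is not shown here. In the terms of the paragraph above, `prev_count` would correspond to the number of frames already folded into the previous average and `new_sample` to W_N−1.

```python
from typing import List


def update_average(prev_avg: List[float],
                   prev_count: int,
                   new_sample: List[float]) -> List[float]:
    """Fold one new sample into a mean that already covers prev_count samples.

    The result equals the plain mean of all prev_count + 1 samples, but only
    the previous mean and the sample count have to be kept in memory.
    """
    total = prev_count + 1
    return [(prev_count * avg + sample) / total
            for avg, sample in zip(prev_avg, new_sample)]
```

Because each update touches only the stored mean and one new sample, the cost per frame stays constant regardless of how many frames have been averaged.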

ステップＳ１３２において、画像処理装置１は、オブジェクト関係推定処理部３１により、ステップＳ１０２またはステップＳ１０７で選択した２つのオブジェクト間について、平均相対姿勢と、現フレームで推定した相対姿勢と、の差分の絶対値を、相対姿勢変化量として求め、ステップＳ１３３に処理を移す。 In step S132, the image processing apparatus 1 uses the object relationship estimation processing unit 31 to calculate the absolute difference between the average relative posture and the relative posture estimated in the current frame between the two objects selected in step S102 or step S107. The value is obtained as a relative posture change amount, and the process proceeds to step S133.

ステップＳ１３３において、画像処理装置１は、オブジェクト関係推定処理部３１により、ステップＳ１３２により求めた相対姿勢変化量が閾値β未満であるか否かを判別する。相対姿勢変化量が閾値β未満であると判別した場合には、ステップＳ１３４に処理を移し、相対姿勢変化量が閾値β以上である場合には、ステップＳ１３７に処理を移す。 In step S133, the image processing apparatus 1 uses the object relationship estimation processing unit 31 to determine whether or not the relative posture change amount obtained in step S132 is less than the threshold value β. If it is determined that the relative posture change amount is less than the threshold value β, the process proceeds to step S134. If the relative posture change amount is equal to or greater than the threshold value β, the process proceeds to step S137.

ステップＳ１３４において、画像処理装置１は、オブジェクト関係推定処理部３１により、ステップＳ１３３で相対姿勢変化量が閾値β未満であると判別した２つのオブジェクトに対する相対姿勢連続推定回数カウンタをインクリメントし、ステップＳ１３５に処理を移す。 In step S134, the image processing apparatus 1 uses the object relationship estimation processing unit 31 to increment the relative posture continuous estimation number counter for the two objects whose relative posture change amount was determined in step S133 to be less than the threshold β, and moves the process to step S135.

ステップＳ１３５において、画像処理装置１は、オブジェクト関係推定処理部３１により、ステップＳ１３３で相対姿勢変化量が閾値β未満であると判別した２つのオブジェクトに対する相対姿勢連続推定回数カウンタのカウンタ値が閾値α以上であるか否かを判別する。カウンタ値が閾値α以上であると判別した場合には、ステップＳ１３６に処理を移し、カウンタ値が閾値α未満であると判別した場合には、図９に示した処理を終了する。 In step S135, the image processing apparatus 1 determines that the counter value of the relative orientation continuous estimation number counter for the two objects determined by the object relationship estimation processing unit 31 in step S133 that the relative orientation change amount is less than the threshold value β is the threshold value α. It is determined whether or not this is the case. If it is determined that the counter value is greater than or equal to the threshold value α, the process proceeds to step S136. If it is determined that the counter value is less than the threshold value α, the process illustrated in FIG. 9 ends.

ステップＳ１３６において、画像処理装置１は、オブジェクト関係推定処理部３１により、ステップＳ１３５でカウンタ値が閾値α以上であると判別した２つのオブジェクトを、同一のグループに分類し、図９に示した処理を終了する。 In step S136, the image processing apparatus 1 uses the object relationship estimation processing unit 31 to classify the two objects whose counter value was determined in step S135 to be equal to or greater than the threshold α into the same group, and ends the process shown in FIG. 9.

ステップＳ１３７において、画像処理装置１は、オブジェクト関係推定処理部３１により、ステップＳ１３３で相対姿勢変化量が閾値β以上であると判別した２つのオブジェクトに対する相対姿勢連続推定回数カウンタをリセットし、ステップＳ１３８に処理を移す。 In step S137, the image processing apparatus 1 uses the object relationship estimation processing unit 31 to reset the relative posture continuous estimation number counter for the two objects whose relative posture change amount was determined in step S133 to be equal to or greater than the threshold β, and moves the process to step S138.

ステップＳ１３８において、画像処理装置１は、オブジェクト関係推定処理部３１により、ステップＳ１３３で相対姿勢変化量が閾値β以上であると判別した２つのオブジェクトは、同一のグループに分類されているか否かを判別する。同一のグループに分類されていると判別した場合には、ステップＳ１３９に処理を移し、同一のグループに分類されていないと判別した場合には、図９に示した処理を終了する。 In step S138, the image processing apparatus 1 uses the object relationship estimation processing unit 31 to determine whether or not the two objects whose relative posture change amount was determined in step S133 to be equal to or greater than the threshold β are classified into the same group. If they are determined to be classified into the same group, the process proceeds to step S139. If they are determined not to be classified into the same group, the process shown in FIG. 9 ends.

ステップＳ１３９において、画像処理装置１は、ステップＳ１３３で相対姿勢変化量が閾値β以上であると判別した２つのオブジェクトについて、同一のグループに分類されているのを解除し、図９に示した処理を終了する。 In step S139, the image processing apparatus 1 cancels the classification into the same group of the two objects whose relative posture change amount was determined in step S133 to be equal to or greater than the threshold β, and ends the process shown in FIG. 9.
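
The branch structure of steps S133 to S139 can be summarized in a short sketch. The names `PairState`, `classify_pair`, `stable_count`, and `grouped` are hypothetical, and the relative posture change amount is assumed to be already reduced to a single non-negative number (how the source computes that number from two postures is not shown here).

```python
from dataclasses import dataclass


@dataclass
class PairState:
    """Hypothetical state kept for one pair of objects."""
    stable_count: int = 1  # relative posture continuous estimation counter
    grouped: bool = False  # True while the pair is in the same group


def classify_pair(state: PairState, change: float,
                  alpha: int, beta: float) -> PairState:
    """Update one pair for the current frame (steps S133 to S139)."""
    if change < beta:                    # S133: relative posture is stable
        state.stable_count += 1          # S134
        if state.stable_count >= alpha:  # S135
            state.grouped = True         # S136: classify into the same group
    else:
        state.stable_count = 0           # S137: reset the counter
        if state.grouped:                # S138
            state.grouped = False        # S139: cancel the grouping
    return state
```

With `alpha=3`, a pair whose counter starts at 1 is grouped after two consecutive stable frames, and a single frame with a change of β or more both resets the counter and dissolves the grouping.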

図１０は、画像処理装置１が行う上述の第１の重畳表示処理のフローチャートである。 FIG. 10 is a flowchart of the above-described first superimposed display process performed by the image processing apparatus 1.

ステップＳ１４０において、画像処理装置１は、姿勢変換処理部３２により、前フレームで認識したオブジェクトの中に、現フレームで認識していないオブジェクト（消失オブジェクト）が存在しているか否かを判別する。存在していると判別した場合には、ステップＳ１４１に処理を移し、存在していないと判別した場合には、ステップＳ１４３に処理を移す。 In step S140, the image processing apparatus 1 uses the posture conversion processing unit 32 to determine whether or not the objects recognized in the previous frame include an object that is not recognized in the current frame (a lost object). If such an object is determined to exist, the process proceeds to step S141. If not, the process proceeds to step S143.

ステップＳ１４１において、画像処理装置１は、姿勢変換処理部３２により、消失オブジェクトと前フレームで同一のグループに分類されていたオブジェクトが存在しているか否かを判別する。存在していると判別した場合には、ステップＳ１４２に処理を移し、存在していないと判別した場合には、ステップＳ１４３に処理を移す。 In step S141, the image processing apparatus 1 uses the posture conversion processing unit 32 to determine whether or not there is an object that was classified into the same group as the lost object in the previous frame. If such an object is determined to exist, the process proceeds to step S142. If not, the process proceeds to step S143.

ステップＳ１４２において、画像処理装置１は、姿勢変換処理部３２により、ステップＳ１４１において消失オブジェクトと前フレームで同一のグループに分類されていたと判別したオブジェクト（上述の主要オブジェクト）の現フレームにおける姿勢を、消失オブジェクトと主要オブジェクトとの間の前フレームで推定した相対姿勢を用いて、消失オブジェクトの現フレームにおける姿勢に変換し、ステップＳ１４３に処理を移す。 In step S142, the image processing apparatus 1 uses the posture conversion processing unit 32 to convert the current-frame posture of the object determined in step S141 to have been classified into the same group as the lost object in the previous frame (the main object described above) into the current-frame posture of the lost object, using the relative posture between the lost object and the main object estimated in the previous frame, and moves the process to step S143.
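
The conversion in step S142 can be illustrated with rigid transforms. The sketch below rests on assumptions that are not taken from this text: each posture is a 4×4 homogeneous matrix mapping object coordinates to camera coordinates, and the relative posture is defined as inverse(main) · other. All function names are hypothetical, and the source may use a different convention.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(w: Matrix) -> Matrix:
    """Invert a rigid transform [R|t] as [R^T | -R^T t]."""
    r_t = [[w[j][i] for j in range(3)] for i in range(3)]
    t = [w[i][3] for i in range(3)]
    inv_t = [-sum(r_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [r_t[i] + [inv_t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def relative_posture(w_main: Matrix, w_other: Matrix) -> Matrix:
    """Posture of the other object expressed in the main object's frame."""
    return matmul(rigid_inverse(w_main), w_other)


def convert_posture(w_main_now: Matrix, w_rel: Matrix) -> Matrix:
    """Estimate the lost object's current posture from the main object's."""
    return matmul(w_main_now, w_rel)


def translation(x: float, y: float, z: float) -> Matrix:
    """Helper for examples: a pure translation with no rotation."""
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]
```

As a usage example, if the main object was at `translation(1, 0, 0)` and the lost object at `translation(3, 0, 0)` in the previous frame, the stored relative posture is a shift of 2 along x, so a main object now tracked at `translation(1, 5, 0)` places the lost object at `translation(3, 5, 0)`.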

ステップＳ１４３において、画像処理装置１は、仮想情報表示部７０により、ステップＳ１１１やステップＳ１１７における認識結果と、ステップＳ１４２により推定した姿勢と、を用いて、ステップＳ１００で取得したプレビュー画像に仮想情報を重畳させ、図１０に示した処理を終了する。 In step S143, the image processing apparatus 1 uses the virtual information display unit 70 to superimpose virtual information on the preview image acquired in step S100, using the recognition results of step S111 and step S117 and the posture estimated in step S142, and ends the process shown in FIG. 10.

以上の画像処理装置１によれば、以下の効果を奏することができる。 According to the image processing apparatus 1 described above, the following effects can be obtained.

画像処理装置１は、画像認識部２０により認識できなかったオブジェクトが存在する場合に、オブジェクト関係推定部３０により、このオブジェクトと同一のグループに分類されているとともに画像認識部２０により認識できたオブジェクトを主要オブジェクトとして適用して、画像認識部２０では認識できなかったオブジェクトを、主要オブジェクトの認識結果に基づいて認識する。このため、画像認識部２０では認識できなかったオブジェクトを、このオブジェクトと関係性の高いオブジェクトの認識結果に基づいて認識することができる。したがって、画像認識部２０では認識できなかったオブジェクトを認識することができるので、ＡＲ技術において、オブジェクトの認識の頑健性を向上させることができる。 When there is an object that the image recognition unit 20 could not recognize, the image processing apparatus 1 uses the object relationship estimation unit 30 to take, as the main object, an object that is classified into the same group as that object and that the image recognition unit 20 was able to recognize, and recognizes the unrecognized object based on the recognition result of the main object. An object that the image recognition unit 20 could not recognize can therefore be recognized based on the recognition result of an object closely related to it. Because such objects can still be recognized, the robustness of object recognition in AR technology can be improved.

また、画像処理装置１は、オブジェクト関係推定部３０により、画像認識部２０により認識されたオブジェクト間の関係性として、オブジェクト間の相対姿勢を求める。このため、オブジェクト間の相対姿勢を用いて、同様の動きをしているオブジェクト同士といった、関係性の高いオブジェクト同士を検索することができる。 In the image processing apparatus 1, the object relationship estimation unit 30 obtains the relative posture between the objects as the relationship between the objects recognized by the image recognition unit 20. For this reason, it is possible to search for highly related objects such as objects that are moving in the same manner using the relative posture between the objects.

また、画像処理装置１は、オブジェクト関係推定部３０により、画像取得部１０によりプレビュー画像が取得されるたびに、プレビュー画像内のオブジェクト間の相対姿勢を求め、αフレームに亘って連続して、相対姿勢のプレビュー画像間での変化量が閾値未満であるオブジェクトを、同一のグループに分類する。このため、複数の連続するプレビュー画像におけるオブジェクト同士の関係性を考慮して、オブジェクトを分類することができる。 In addition, the image processing apparatus 1 obtains the relative posture between objects in the preview image every time the preview image is acquired by the image acquisition unit 10 by the object relationship estimation unit 30, and continuously over the α frame, Objects whose relative posture changes between preview images are less than the threshold are classified into the same group. For this reason, it is possible to classify objects in consideration of the relationship between objects in a plurality of continuous preview images.

また、画像処理装置１は、オブジェクト関係推定部３０により、画像取得部１０により取得された最新のプレビュー画像において求めた相対姿勢と、最新のプレビュー画像よりも前のプレビュー画像において求めた相対姿勢の平均と、の差分を変化量として求める。このため、オブジェクト同士の関係性をより考慮して、オブジェクトをより適切に分類することができる。 In addition, the image processing apparatus 1 uses the object relationship estimation unit 30 to calculate the relative posture obtained in the latest preview image acquired by the image acquisition unit 10 and the relative posture obtained in the preview image before the latest preview image. The difference from the average is obtained as the amount of change. For this reason, it is possible to classify the objects more appropriately in consideration of the relationship between the objects.

また、画像処理装置１は、平均相対姿勢を、前フレームにおいて計算した平均相対姿勢と、前フレームにおいて計算した相対姿勢Ｗ_Ｎ−１と、から加重平均として求める。このため、前フレームにおいて計算した平均相対姿勢を用いて、現フレームの平均相対姿勢を更新する形になる。したがって、フレームが切り替わるたびに、平均相対姿勢推定開始フレームから前フレームまでの各フレームで推定した相対姿勢の平均値を初めから計算し直して平均相対姿勢を求める場合と比べて、処理負荷を軽減することができるとともに、各フレームにおいて計算した平均姿勢を記憶しておく必要がなくなるため、必要とするメモリ容量を小さくすることができる。 Further, the image processing apparatus 1 obtains the average relative posture as a weighted average of the average relative posture calculated in the previous frame and the relative posture W_N−1 calculated in the previous frame. The average relative posture of the current frame is thus obtained by updating the average relative posture calculated in the previous frame. Compared with recalculating from scratch, every time the frame changes, the average of the relative postures estimated in each frame from the average relative posture estimation start frame through the previous frame, this reduces the processing load. It also reduces the required memory capacity, because the average posture calculated in each frame no longer needs to be stored.

＜第２実施形態＞ <Second Embodiment>
［画像処理装置１Ａの概要］ [Outline of Image Processing Apparatus 1A]
図１１は、本発明の第２実施形態に係る画像処理装置１Ａのブロック図である。画像処理装置１Ａは、図１に示した本発明の第１実施形態に係る画像処理装置１とは、認識処理制御部４０を備える点で異なる。なお、画像処理装置１Ａにおいて、画像処理装置１と同一の構成要件については、同一符号を付し、その説明を省略する。 FIG. 11 is a block diagram of an image processing apparatus 1A according to the second embodiment of the present invention. The image processing apparatus 1A differs from the image processing apparatus 1 according to the first embodiment of the present invention shown in FIG. 1 in that it includes a recognition processing control unit 40. In the image processing apparatus 1A, components identical to those of the image processing apparatus 1 are denoted by the same reference numerals, and their description is omitted.

ここで、まず、図２から５を用いて上述したＡＲ空間を、上述の特許文献１の技術で実現する場合について、以下に説明する。この場合、端末１００は、オブジェクトＭ１からＭ３をそれぞれリアルタイムで認識し続ける必要があり、処理負荷が高くなってしまう。 Here, first, a case where the AR space described above with reference to FIGS. 2 to 5 is realized by the technique of Patent Document 1 described above will be described below. In this case, the terminal 100 needs to continue to recognize each of the objects M1 to M3 in real time, which increases the processing load.

次に、図２から５を用いて上述したＡＲ空間を、本実施形態に係る画像処理装置１Ａで実現する場合について、以下に説明する。この場合、端末１００は、例えばオブジェクトＭ１からＭ３を同一のグループに分類していれば、これらオブジェクトＭ１からＭ３のうち、１つだけ追跡処理により認識すれば、他の２つについては相対姿勢を用いて認識することができる。これによれば、端末１００が姿勢追跡部２３による追跡処理を行わなくてはならないオブジェクトの数が減少するので、端末１００の処理負荷を軽減することができる。 Next, the case where the AR space described above with reference to FIGS. 2 to 5 is realized by the image processing apparatus 1A according to the present embodiment is described below. In this case, if the terminal 100 has classified, for example, the objects M1 to M3 into the same group, it only needs to recognize one of the objects M1 to M3 by the tracking process, and it can recognize the other two using the relative posture. This reduces the number of objects for which the terminal 100 must perform the tracking process by the posture tracking unit 23, so the processing load on the terminal 100 can be reduced.

［画像処理装置１Ａの構成］ [Configuration of Image Processing Apparatus 1A]
以上の画像処理装置１Ａについて、以下に詳述する。 The image processing apparatus 1A described above is detailed below.

［認識処理制御部４０の構成および動作］ [Configuration and Operation of Recognition Processing Control Unit 40]
図１１に戻って、画像処理装置１Ａに設けられた認識処理制御部４０は、オブジェクト関係推定処理部３１によるオブジェクトの分類結果と、画像認識部２０による認識結果と、を入力とする。この認識処理制御部４０は、認識処理制御処理部４１および姿勢照合処理部４２を備える。 Returning to FIG. 11, the recognition processing control unit 40 provided in the image processing apparatus 1A receives as inputs the object classification result from the object relationship estimation processing unit 31 and the recognition result from the image recognition unit 20. The recognition processing control unit 40 includes a recognition processing control processing unit 41 and a posture matching processing unit 42.

認識処理制御処理部４１は、オブジェクト関係推定処理部３１によるオブジェクトの分類結果と、画像認識部２０による認識結果と、を入力とする。この認識処理制御処理部４１は、同一のグループに分類されているオブジェクトの中に、認識に成功したオブジェクトが２つ以上存在している場合に、これら認識に成功したオブジェクトのうち、１つを主要オブジェクトとして登録し、残りを認識休止オブジェクトとして登録する。また、認識休止オブジェクトについて、姿勢追跡部２３による姿勢の追跡処理を休止させる。 The recognition processing control processing unit 41 receives the classification result of the object by the object relationship estimation processing unit 31 and the recognition result by the image recognition unit 20 as inputs. When there are two or more objects that have been successfully recognized among the objects classified into the same group, the recognition process control processing unit 41 selects one of these objects that have been successfully recognized. Register as a main object and register the rest as recognition pause objects. In addition, the posture tracking process by the posture tracking unit 23 is paused for the recognition pause object.
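
The registration rule described above can be sketched as follows. The names `assign_roles`, `groups`, and `recognized` are hypothetical, and which of the recognized objects becomes the main object is not specified in this passage; the sketch simply assumes the first one in list order.

```python
from typing import Dict, List, Set, Tuple


def assign_roles(groups: Dict[int, List[str]],
                 recognized: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (main objects, recognition pause objects).

    For every group with two or more successfully recognized members, one
    member is registered as the main object and the rest are registered as
    recognition pause objects, whose posture tracking is then skipped.
    """
    main: Set[str] = set()
    paused: Set[str] = set()
    for members in groups.values():
        ok = [m for m in members if m in recognized]
        if len(ok) >= 2:
            main.add(ok[0])        # assumption: the first recognized member
            paused.update(ok[1:])  # tracking is paused for the others
    return main, paused
```

For instance, with groups `{1: ["M1", "M2", "M3"], 2: ["M4", "M5"]}` and only M1, M3, and M4 recognized, M1 becomes a main object, M3 a recognition pause object, and group 2 is left unchanged because only one of its members was recognized.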

ここで、オクルージョンや光の反射などによってオブジェクトの認識に失敗してしまうのが、一時的なものであれば、オブジェクトの認識の失敗の解消時には、姿勢追跡部２３による姿勢の追跡処理に成功することが想定される。一般的に、初期姿勢推定部２２による姿勢の初期値の推定よりも、正確な姿勢の初期値を用いた姿勢追跡部２３による姿勢の追跡処理の方が、処理負荷や、姿勢推定の精度や、認識の頑健性に優れる。そこで、画像認識部２０による認識に失敗したオブジェクトについては、本来であれば、初期姿勢推定部２２による姿勢の初期値の推定からやり直すが、画像認識部２０による認識に失敗したオブジェクトが主要オブジェクトと同一のグループに分類されている場合には、認識処理制御処理部４１は、画像認識部２０による認識に失敗したオブジェクトについての姿勢を姿勢変換処理部３２により求め、その結果を姿勢の初期値として姿勢追跡部２３による追跡処理を行わせる。 Here, if a failure to recognize an object caused by occlusion, light reflection, or the like is temporary, the posture tracking process by the posture tracking unit 23 is expected to succeed once the cause of the failure is resolved. In general, the posture tracking process by the posture tracking unit 23 started from an accurate initial posture value is superior to estimation of the initial posture value by the initial posture estimation unit 22 in terms of processing load, posture estimation accuracy, and recognition robustness. An object that the image recognition unit 20 has failed to recognize would normally be processed again starting from estimation of the initial posture value by the initial posture estimation unit 22. However, when such an object is classified into the same group as a main object, the recognition processing control processing unit 41 has the posture conversion processing unit 32 obtain the posture of that object, and has the posture tracking unit 23 perform the tracking process using the result as the initial posture value.

姿勢照合処理部４２は、画像取得部１０により取得されたプレビュー画像と、姿勢変換処理部３２による変換により得られた認識休止オブジェクトの認識結果と、を入力とする。ここで、認識休止オブジェクトについては、姿勢追跡部２３による姿勢の追跡処理を休止するので、主要オブジェクトとの間の相対姿勢を推定することができない。このため、認識休止オブジェクトが動いた場合に、相対姿勢を適切に更新することができず、その結果、姿勢変換処理部３２による変換により得られた認識休止オブジェクトの認識結果が適切ではなく、認識休止オブジェクトに対して正しい位置に仮想情報を重畳させることができなくなってしまう。そこで、姿勢照合処理部４２は、認識休止オブジェクトについて、姿勢変換処理部３２の変換により得られた認識結果が正しいかどうか、すなわち相対姿勢が変動しているかどうかの照合を行う。 The posture matching processing unit 42 receives as inputs the preview image acquired by the image acquisition unit 10 and the recognition result of the recognition pause object obtained by the conversion in the posture conversion processing unit 32. Because the posture tracking process by the posture tracking unit 23 is paused for a recognition pause object, the relative posture between it and the main object cannot be estimated. Consequently, if the recognition pause object moves, the relative posture cannot be updated appropriately. The recognition result obtained for the recognition pause object by the conversion in the posture conversion processing unit 32 then becomes inappropriate, and virtual information can no longer be superimposed at the correct position for the recognition pause object. The posture matching processing unit 42 therefore checks, for each recognition pause object, whether the recognition result obtained by the conversion in the posture conversion processing unit 32 is correct, that is, whether the relative posture has changed.

具体的には、姿勢照合処理部４２は、ＳＳＤ（Sum of Squared Difference）やＮＣＣ（Normalized Cross Correlation）などの画像の類似度を評価する手法を用いて、高速に姿勢を照合する。この際、画像認識部２０が用いるオブジェクトの参照モデル（平面オブジェクトの場合には画像、三次元オブジェクトの場合には３Ｄモデル）を、姿勢変換処理部３２による変換により得られた認識結果で投影してテンプレート画像を生成し、プレビュー画像とマッチングすることで、類似度を評価する。姿勢変換処理部３２による変換により得られた認識結果が正しい場合には、ＳＳＤやＮＣＣの応答値（類似度）が高くなることが想定されるため、姿勢照合処理部４２は、類似度が閾値γを下回る場合に、このオブジェクトを認識休止オブジェクトから除外する。 Specifically, the posture matching processing unit 42 verifies the posture at high speed using an image similarity measure such as SSD (Sum of Squared Differences) or NCC (Normalized Cross-Correlation). To do so, it projects the reference model of the object used by the image recognition unit 20 (an image for a planar object, a 3D model for a three-dimensional object) using the recognition result obtained by the conversion in the posture conversion processing unit 32 to generate a template image, and evaluates the similarity by matching the template image against the preview image. When the recognition result obtained by the conversion in the posture conversion processing unit 32 is correct, the SSD or NCC response value (similarity) is expected to be high. The posture matching processing unit 42 therefore excludes an object from the recognition pause objects when its similarity falls below a threshold γ.

また、相対姿勢や主要オブジェクトの姿勢の誤差が大きい場合には、姿勢変換処理部３２による変換により得られた認識結果には、誤差が含まれることがあり得る。そこで、姿勢照合処理部４２は、ＳＳＤやＮＣＣなどの画像の類似度の評価の際に、テンプレート画像を上下左右に一定範囲内でスライドさせて、最も類似度の高い箇所をテンプレートマッチングによって推定し、最も類似度の高い箇所における応答値を類似度としてもよい。 In addition, when the error in the relative posture or in the posture of the main object is large, the recognition result obtained by the conversion in the posture conversion processing unit 32 may itself contain an error. When evaluating image similarity with SSD, NCC, or the like, the posture matching processing unit 42 may therefore slide the template image up, down, left, and right within a fixed range, estimate the location with the highest similarity by template matching, and use the response value at that location as the similarity.
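
The similarity check with a small sliding search can be sketched as below. This is an illustration under assumptions: images are plain 2D lists of grayscale values, the zero-mean form of NCC is used (chosen here because a higher value means a better match), and the names `ncc`, `best_match`, and `keep_paused` are hypothetical. Template generation by projecting the reference model is outside the scope of the sketch.

```python
import math
from typing import List, Tuple

Image = List[List[float]]


def ncc(a: List[float], b: List[float]) -> float:
    """Zero-mean normalized cross-correlation; 0.0 if either patch is flat."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


def best_match(image: Image, template: Image,
               x0: int, y0: int, radius: int) -> Tuple[float, int, int]:
    """Slide the template within +/-radius of (x0, y0); return (score, x, y)."""
    th, tw = len(template), len(template[0])
    flat_t = [v for row in template for v in row]
    best = (-2.0, x0, y0)  # lower than any possible NCC score
    for y in range(max(0, y0 - radius), min(len(image) - th, y0 + radius) + 1):
        for x in range(max(0, x0 - radius),
                       min(len(image[0]) - tw, x0 + radius) + 1):
            patch = [image[y + j][x + i] for j in range(th) for i in range(tw)]
            score = ncc(patch, flat_t)
            if score > best[0]:
                best = (score, x, y)
    return best


def keep_paused(score: float, gamma: float) -> bool:
    """An object stays a recognition pause object unless score < gamma."""
    return score >= gamma
```

The search radius trades robustness against cost: a larger radius tolerates a larger error in the converted posture but evaluates more candidate positions per frame.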

また、姿勢変換処理部３２による変換により得られた認識結果に含まれる誤差を補正するために、姿勢照合処理部４２は、ＳＳＤやＮＣＣなどの画像の類似度の評価の際に、姿勢におけるＳＳＤやＮＣＣのヤコビアンを計算し、反復計算により、類似度を最大化する姿勢を推定してもよい。例えば、非特許文献４には、平面オブジェクトに対して、ＳＳＤを最小化する姿勢を効率的な反復計算により推定する手法（ＥＳＭ：Efficient Second Order Minimization）が開示されており、この際の初期値に、姿勢変換処理部３２により得られた姿勢を用いることで、少ない反復回数で姿勢を最小化して、ＡＲ技術において、処理負荷をさらに軽減したり、オブジェクトの認識の頑健性をさらに向上させたりすることができる。 Further, to correct errors contained in the recognition result obtained by the conversion in the posture conversion processing unit 32, the posture matching processing unit 42 may, when evaluating image similarity with SSD, NCC, or the like, compute the Jacobian of the SSD or NCC with respect to the posture and estimate the posture that maximizes the similarity by iterative calculation. For example, Non-Patent Document 4 discloses a method (ESM: Efficient Second Order Minimization) that estimates, for a planar object, the posture that minimizes the SSD by efficient iterative calculation. Using the posture obtained by the posture conversion processing unit 32 as the initial value for this method allows the minimization to be completed in a small number of iterations, which can further reduce the processing load and further improve the robustness of object recognition in AR technology.

［画像処理装置１Ａの動作］ [Operation of Image Processing Apparatus 1A]
以上の構成を備える画像処理装置１Ａの動作について、図１２から１５を用いて以下に説明する。 The operation of the image processing apparatus 1A having the above configuration is described below with reference to FIGS. 12 to 15.

図１２は、画像処理装置１Ａのフローチャートである。 FIG. 12 is a flowchart of the image processing apparatus 1A.

ステップＳ２００において、画像処理装置１Ａは、画像取得部１０によりプレビュー画像を取得し、ステップＳ２０１に処理を移す。 In step S200, the image processing apparatus 1A acquires a preview image by the image acquisition unit 10, and moves the process to step S201.

ステップＳ２０１において、画像処理装置１Ａは、画像認識部２０とオブジェクト関係推定処理部３１と認識処理制御部４０とにより第２の画像認識処理を行って、ステップＳ２００で取得したプレビュー画像内の各オブジェクトを認識し、ステップＳ２０２に処理を移す。なお、第２の画像認識処理の詳細については、図１３を用いて後述する。 In step S201, the image processing apparatus 1A performs a second image recognition process using the image recognition unit 20, the object relationship estimation processing unit 31, and the recognition process control unit 40, and each object in the preview image acquired in step S200. Is recognized, and the process proceeds to step S202. Details of the second image recognition process will be described later with reference to FIG.

ステップＳ２０２において、画像処理装置１Ａは、オブジェクト関係推定処理部３１により、ステップＳ２０１において現フレームで認識した全てのオブジェクトのうち、前フレームで主要オブジェクトに登録したオブジェクトを１つ選択するとともに、前フレームで他のオブジェクトと同一のグループに分類されていないオブジェクトを１つ選択して、ステップＳ２０３に処理を移す。 In step S202, the image processing apparatus 1A uses the object relationship estimation processing unit 31 to select, from among all the objects recognized in the current frame in step S201, one object that was registered as a main object in the previous frame and one object that was not classified into the same group as any other object in the previous frame, and moves the process to step S203.

ステップＳ２０３において、画像処理装置１Ａは、オブジェクト関係推定処理部３１により、ステップＳ２０２またはステップＳ２０７で選択した２つのオブジェクトに対する相対姿勢連続推定回数カウンタのカウンタ値がゼロであるか否かを判別する。ゼロであると判別した場合には、ステップＳ２０４に処理を移し、ゼロではないと判別した場合には、ステップＳ２０５に処理を移す。 In step S203, the image processing apparatus 1A causes the object relationship estimation processing unit 31 to determine whether or not the counter value of the relative posture continuous estimation number counter for the two objects selected in step S202 or step S207 is zero. If it is determined that it is zero, the process proceeds to step S204, and if it is determined that it is not zero, the process proceeds to step S205.

ステップＳ２０４において、画像処理装置１Ａは、オブジェクト関係推定処理部３１により、図８に示した相対姿勢算出処理を行って、ステップＳ２０６に処理を移す。 In step S204, the image processing apparatus 1A uses the object relationship estimation processing unit 31 to perform the relative posture calculation process shown in FIG. 8, and moves the process to step S206.

ステップＳ２０５において、画像処理装置１Ａは、オブジェクト関係推定処理部３１により、図９に示した分類処理を行って、ステップＳ２０６に処理を移す。 In step S205, the image processing apparatus 1A uses the object relationship estimation processing unit 31 to perform the classification process shown in FIG. 9, and moves the process to step S206.

ステップＳ２０６において、画像処理装置１Ａは、オブジェクト関係推定処理部３１により、ステップＳ２０１において現フレームで認識した全てのオブジェクトのうち、前フレームで主要オブジェクトに登録したオブジェクトと、前フレームで他のオブジェクトと同一のグループに分類されていないオブジェクトと、について、全ての組み合わせを現フレームで選択したか否かを判別する。選択したと判別した場合には、ステップＳ２０８に処理を移し、選択していないと判別した場合には、ステップＳ２０７に処理を移す。 In step S206, the image processing apparatus 1A uses the object relationship estimation processing unit 31 to determine whether or not every combination has been selected in the current frame from among those objects recognized in the current frame in step S201 that were either registered as main objects in the previous frame or not classified into the same group as any other object in the previous frame. If every combination is determined to have been selected, the process proceeds to step S208. If not, the process proceeds to step S207.

ステップＳ２０７において、画像処理装置１Ａは、オブジェクト関係推定処理部３１により、ステップＳ２０１において現フレームで認識した全てのオブジェクトのうち、前フレームで主要オブジェクトに登録したオブジェクトと、前フレームで他のオブジェクトと同一のグループに分類されていないオブジェクトと、について、選択していない組み合わせを構成する２つのオブジェクトを選択し、ステップＳ２０３に処理を戻す。 In step S207, the image processing apparatus 1A uses the object relationship estimation processing unit 31 to select two objects forming a combination that has not yet been selected, from among those objects recognized in the current frame in step S201 that were either registered as main objects in the previous frame or not classified into the same group as any other object in the previous frame, and returns the process to step S203.
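
The loop over steps S202 to S207 visits pairs of candidate objects. One possible reading, sketched below, is that the candidates are the recognized objects that are either main objects or ungrouped, and that every unordered pair among them is visited once; the text could also be read as pairing only a main object with an ungrouped object, so this interpretation is an assumption. The names `candidate_pairs`, `main_objects`, and `grouped` are hypothetical.

```python
from itertools import combinations
from typing import List, Set, Tuple


def candidate_pairs(recognized: Set[str],
                    main_objects: Set[str],
                    grouped: Set[str]) -> List[Tuple[str, str]]:
    """Enumerate the object pairs to examine in the current frame.

    Objects that belong to a group without being its main object (the
    recognition pause objects) are skipped, which is what reduces the
    number of relative posture estimations per frame.
    """
    candidates = sorted(o for o in recognized
                        if o in main_objects or o not in grouped)
    return list(combinations(candidates, 2))
```

For example, if M1 (main) and M2 form a group and M3 and M4 are ungrouped, the pairs examined are (M1, M3), (M1, M4), and (M3, M4); M2 is not paired with anything.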

ステップＳ２０８において、画像処理装置１Ａは、姿勢変換処理部３２および仮想情報表示部７０により、第２の重畳表示処理を行って、図１２に示した処理を終了する。なお、第２の重畳表示処理の詳細については、図１５を用いて後述する。 In step S208, the image processing apparatus 1A performs the second superimposed display process using the posture conversion processing unit 32 and the virtual information display unit 70, and ends the process shown in FIG. 12. Details of the second superimposed display process will be described later with reference to FIG. 15.

図１３および図１４は、画像処理装置１Ａが行う上述の第２の画像認識処理のフローチャートである。 FIGS. 13 and 14 are flowcharts of the above-described second image recognition process performed by the image processing apparatus 1A.

ステップＳ２１０において、画像処理装置１Ａは、姿勢追跡部２３により、ステップＳ２００で取得したプレビュー画像中に、前フレームで認識したオブジェクトが含まれているか否かを判別する。含まれていると判別した場合には、ステップＳ２１１に処理を移し、含まれていないと判別した場合には、ステップＳ２２５に処理を移す。 In step S210, the image processing apparatus 1A determines whether or not the preview image acquired in step S200 includes the object recognized in the previous frame by the posture tracking unit 23. If it is determined that it is included, the process proceeds to step S211. If it is determined that it is not included, the process proceeds to step S225.

ステップＳ２１１において、画像処理装置１Ａは、認識処理制御処理部４１により、ステップＳ２００で取得したプレビュー画像中に、同一のグループに分類されており前フレームにおいて認識に成功したオブジェクトが２つ以上存在しているか否かを判別する。存在していると判別した場合には、ステップＳ２１２に処理を移し、存在していないと判別した場合には、ステップＳ２２０に処理を移す。 In step S211, the image processing apparatus 1A uses the recognition processing control processing unit 41 to determine whether or not the preview image acquired in step S200 contains two or more objects that are classified into the same group and were successfully recognized in the previous frame. If such objects are determined to exist, the process proceeds to step S212. If not, the process proceeds to step S220.

ステップＳ２１２において、画像処理装置１Ａは、認識処理制御処理部４１により、ステップＳ２１１で認識に成功したと判別した２つ以上のオブジェクトのうち、１つを主要オブジェクトとして登録し、残りを認識休止オブジェクトとして登録し、ステップＳ２１３に処理を移す。 In step S212, the image processing apparatus 1A registers one of the two or more objects determined to have been successfully recognized in step S211 by the recognition processing control processing unit 41 as a main object, and the rest as a recognition pause object. And the process proceeds to step S213.

In step S213, the image processing apparatus 1A uses the posture tracking unit 23 to recognize each object registered as a main object in step S212 by tracking its posture, with the posture in the previous frame as the initial value, and the process proceeds to step S214.

In step S214, the image processing apparatus 1A uses the recognition processing control processing unit 41 to determine whether any main object failed to be tracked in step S213. If so, the process proceeds to step S215; if not, the process proceeds to step S220.

In step S215, the image processing apparatus 1A uses the recognition processing control processing unit 41 to remove from the recognition-paused objects every object classified into the same group as a main object that failed to be tracked in step S213, and the process proceeds to step S216.

In step S216, the image processing apparatus 1A uses the posture matching processing unit 42 to determine whether any main object was successfully tracked in step S213. If so, the process proceeds to step S217; if not, the process proceeds to step S220.

In step S217, the image processing apparatus 1A uses the posture conversion processing unit 32 to convert the posture of each main object successfully tracked in step S213 into the postures of the recognition-paused objects classified into the same group, uses the posture matching processing unit 42 to collate the converted postures, and the process proceeds to step S218.

In step S218, the image processing apparatus 1A uses the posture matching processing unit 42 to determine whether the collation in step S217 failed for any object, that is, whether there is any object whose similarity obtained by the collation is below the threshold γ. If so, the process proceeds to step S219; if not, the process proceeds to step S220.

In step S219, the image processing apparatus 1A uses the posture matching processing unit 42 to remove the objects for which collation failed in step S218 from the recognition-paused objects, and the process proceeds to step S220.

In step S220, the image processing apparatus 1A uses the posture tracking unit 23 to recognize (i) each object that was recognized in the previous frame but has no other object classified into its group and (ii) each object removed from the recognition-paused objects in step S219, by tracking its posture with the posture in the previous frame as the initial value, and the process proceeds to step S221.

In steps S221 to S227, the image processing apparatus 1A performs the same processes as those performed by the image processing apparatus 1 in steps S112 to S118 of FIG. 7, respectively.
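The branch order of steps S211 to S220 can be sketched as follows. This is only an illustrative sketch: the function name, the dictionary representation of groups, and the `track_ok` and `similarity` callables (standing in for the posture tracking unit 23 and the collation of step S217) are assumptions and do not appear in the specification. Adding the objects released in step S215 to the directly tracked set is also an assumption, since step S220 does not mention them explicitly.

```python
def plan_second_recognition(groups, recognized_prev, track_ok, similarity, gamma):
    """Return (main, paused, direct) for one frame.

    groups: dict mapping a group id to a list of object ids (hypothetical layout).
    recognized_prev: set of object ids recognized in the previous frame.
    track_ok(obj) -> bool: True if posture tracking of obj succeeds (S213).
    similarity(main_obj, obj) -> float: collation score for a paused object (S217).
    """
    main, paused, direct = {}, set(), set()

    # S211/S212: in each group with two or more recognized objects,
    # register one main object and pause the rest.
    for gid, members in groups.items():
        seen = [o for o in members if o in recognized_prev]
        if len(seen) >= 2:
            main[gid] = seen[0]
            paused.update(seen[1:])
        else:
            direct.update(seen)  # tracked on their own in S220

    # S213: track every main object.
    tracked = {gid: track_ok(obj) for gid, obj in main.items()}

    # S214: the remaining steps run only if some main object failed.
    if any(not ok for ok in tracked.values()):
        # S215: release every paused object in the group of a failed main object.
        for gid, ok in tracked.items():
            if not ok:
                released = {o for o in groups[gid] if o in paused}
                paused -= released
                direct |= released  # assumption: these are tracked directly again
        # S216 to S219: collate paused objects of successfully tracked groups.
        for gid, ok in tracked.items():
            if ok:
                for obj in [o for o in groups[gid] if o in paused]:
                    if similarity(main[gid], obj) < gamma:
                        paused.discard(obj)
                        direct.add(obj)  # tracked directly in S220

    return main, paused, direct
```

Objects left in `paused` are the ones whose posture is derived from the main object rather than recognized by the image recognition unit 20, which is where the reduction in processing load comes from.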

FIG. 15 is a flowchart of the above-described second superimposed display process performed by the image processing apparatus 1A.

In step S230, the image processing apparatus 1A uses the virtual information display unit 70 to superimpose virtual information on the preview image acquired in step S200, using the recognition results from steps S213 and S220 and the posture estimated in step S142, and then ends the process shown in FIG. 15.

In addition to the effects of the image processing apparatus 1 described above, the image processing apparatus 1A provides the following effects.

In the image processing apparatus 1A, the object relationship estimation unit 30 selects one object from each group as the main object, and the recognition processing control unit 40 pauses recognition by the image recognition unit 20 for the other objects classified into the same group as the main object. The object relationship estimation unit 30 then recognizes the objects whose recognition by the image recognition unit 20 is paused, based on the recognition result for the main object. Objects other than the main object in the same group can therefore still be recognized by the object relationship estimation unit 30 while recognition by the image recognition unit 20 is paused, which reduces the number of objects that the image recognition unit 20 has to recognize. The processing load in the AR technology can thus be reduced.

The image processing apparatus 1A also uses the recognition processing control unit 40 to collate, against the preview image acquired by the image acquisition unit 10, the recognition result obtained by the object relationship estimation unit 30 for an object whose recognition by the image recognition unit 20 is paused. If the collation fails, recognition of that object by the image recognition unit 20 is resumed. This makes it possible to determine whether the recognition result from the object relationship estimation unit 30 is correct.

In addition, when the recognition processing control unit 40 resumes recognition by the image recognition unit 20, the image processing apparatus 1A has the image recognition unit 20 track the posture using, as the initial value, the recognition result obtained by the object relationship estimation unit 30 for the previous frame, that is, for the preview image previously acquired by the image acquisition unit 10. This reduces the processing load and improves both the accuracy of posture estimation and the robustness of recognition.

The image processing apparatus 1A also uses the recognition processing control unit 40 to project an object whose recognition by the image recognition unit 20 is paused onto the preview image acquired by the image acquisition unit 10, based on the recognition result obtained for that object by the object relationship estimation unit 30, thereby creating a projected image. If the similarity between the projected image and the preview image acquired by the image acquisition unit 10 is below the threshold γ, the collation is determined to have failed. This makes it possible to determine whether the recognition result from the object relationship estimation unit 30 is correct.
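The threshold test on the similarity can be illustrated with a short sketch. The similarity measure used for this collation is not specified here, so normalized cross-correlation (NCC) is assumed purely for illustration. The function names and the representation of image patches as flat lists of grayscale values are also hypothetical.

```python
import math


def ncc(patch_a, patch_b):
    """Normalized cross-correlation of two equally sized grayscale patches.

    Patches are flat lists of pixel values; the result lies in [-1, 1].
    """
    if len(patch_a) != len(patch_b) or not patch_a:
        raise ValueError("patches must be non-empty and of equal size")
    mean_a = sum(patch_a) / len(patch_a)
    mean_b = sum(patch_b) / len(patch_b)
    dev_a = [x - mean_a for x in patch_a]
    dev_b = [y - mean_b for y in patch_b]
    denom = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    if denom == 0.0:
        return 0.0  # a flat patch carries no structure to correlate
    return sum(x * y for x, y in zip(dev_a, dev_b)) / denom


def collation_succeeds(projected, preview_patch, gamma):
    """Collation fails when the similarity is below the threshold gamma."""
    return ncc(projected, preview_patch) >= gamma
```

In such a sketch, `projected` would hold the pixels of the object rendered at the posture derived by the object relationship estimation unit 30, and `preview_patch` the pixels of the same region in the preview image.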

<Third Embodiment>
[Outline of Image Processing Apparatus 1B]
FIG. 16 is a block diagram of an image processing apparatus 1B according to the third embodiment of the present invention. The image processing apparatus 1B differs from the image processing apparatus 1A according to the second embodiment of the present invention shown in FIG. 11 in that it includes a recognition result sharing processing unit 50 and a cooperative recognition processing unit 60. Components of the image processing apparatus 1B that are the same as those of the image processing apparatus 1A are denoted by the same reference numerals, and their description is omitted.

This embodiment assumes two terminals equipped with the image processing apparatus 1B, an own terminal and another terminal, that share the same AR space. Because the distance from an object to the camera and the position and orientation of the camera relative to the object differ between the two terminals, an object that one terminal can recognize may not be recognizable by the other. The object recognition results are therefore also shared between the own terminal and the other terminal.

[Configuration and Operation of Recognition Result Sharing Processing Unit 50]
The recognition result sharing processing unit 50 receives as input the recognition result from the image recognition unit 20 of the own terminal and the recognition result from the image recognition unit 20 of the other terminal, and transmits the input recognition result of the own terminal to the image recognition unit 20 of the other terminal. In this way, the recognition results of the image recognition units 20 can be shared between the own terminal and the other terminal.

Recognition results are exchanged with the image recognition unit 20 of the other terminal by ad hoc communication, which allows communication with other terminals on the same LAN. Even when no access point is available, nearby terminals can communicate using Wi-Fi Direct or Bluetooth (registered trademark). Software libraries providing the pairing and discovery functions needed for ad hoc communication are publicly available, and this function can readily be implemented with such a library. The function can, however, also be implemented using a general communication protocol over a wireless network or a wired cable.

Because the processing by the recognition result sharing processing unit 50 does not need to be synchronized between the own terminal and the other terminal, the process of transmitting the recognition result of the image recognition unit 20 of the own terminal to the image recognition unit 20 of the other terminal and the process of receiving, at the image recognition unit 20 of the own terminal, the recognition result of the image recognition unit 20 of the other terminal can be executed independently of each other. In addition, because communication for exchanging recognition results generally incurs delay, the transmission and reception processes can be executed independently of the other processes (in a separate thread of the program).
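Running transmission and reception in their own threads might look like the following sketch. The `send` and `receive` callables are hypothetical stand-ins for whatever transport is used (ad hoc communication, Wi-Fi Direct, and so on), and the queue-based hand-off to the rest of the program is an assumption made for illustration.

```python
import queue
import threading


def start_sharing(send, receive, outbox, inbox, stop):
    """Start independent sender and receiver threads.

    send(result): hypothetical transport call that delivers one result.
    receive(timeout): hypothetical transport call returning a result or None.
    outbox / inbox: queue.Queue objects shared with the recognition loop.
    stop: threading.Event used to shut both threads down.
    """
    def sender():
        while not stop.is_set():
            try:
                result = outbox.get(timeout=0.05)
            except queue.Empty:
                continue  # nothing to send yet; check the stop flag again
            send(result)

    def receiver():
        while not stop.is_set():
            result = receive(0.05)
            if result is not None:
                inbox.put(result)

    threads = [
        threading.Thread(target=sender, daemon=True),
        threading.Thread(target=receiver, daemon=True),
    ]
    for thread in threads:
        thread.start()
    return threads
```

With this arrangement the recognition loop only puts results into `outbox` and polls `inbox` without blocking, so communication delay does not stall per-frame processing.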

[Configuration and Operation of Cooperative Recognition Processing Unit 60]
The cooperative recognition processing unit 60 receives as input the recognition result from the image recognition unit 20 of the own terminal and the recognition result from the image recognition unit 20 of the other terminal. The cooperative recognition processing unit 60 converts the recognition result of the other terminal into a recognition result referenced to the own terminal and integrates it with the recognition result of the own terminal. The cooperative recognition processing unit 60 includes a relative posture estimation unit 61 and a posture conversion unit 62.

The relative posture estimation unit 61 receives as input the recognition result from the image recognition unit 20 of the own terminal and the recognition result from the image recognition unit 20 of the other terminal. Based on these two recognition results, the relative posture estimation unit 61 estimates a posture (relative posture) indicating the relative positional relationship between the own terminal and the other terminal. Hereinafter, the own terminal equipped with the image processing apparatus 1B is referred to as the own terminal S, and the other terminal equipped with the image processing apparatus 1B is referred to as the other terminal T.

The relative posture can be estimated when the recognition result of the own terminal S and the recognition result of the other terminal T both include a recognition result for the same object. This common object may be a reference marker.

Hereinafter, this common object is referred to as object a. The posture matrix of object a estimated by the posture tracking unit 23 of the own terminal S is denoted W_Sa, and the posture matrix of object a estimated by the posture tracking unit 23 of the other terminal T is denoted W_Ta. The relative posture W_ST between the own terminal S and the other terminal T can then be obtained by formula (7).
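Formula (7) itself is not reproduced in this text. As a hedged reconstruction, assume that each posture matrix is a 4×4 homogeneous transform from an object's coordinate system to the camera coordinate system of the terminal named by the first subscript, and that W_ST maps the camera coordinates of the other terminal T to those of the own terminal S. Under that assumed convention, which the text does not confirm, the relative posture would take the form:

```latex
W_{ST} = W_{Sa} \, W_{Ta}^{-1}
```

With this form, W_ST W_Ta = W_Sa, so converting the other terminal's result for object a simply returns the own terminal's result for object a.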

When a reference marker is available as the common object, it is preferable to use the reference marker as object a. This is because reference markers are generally designed to be easy to recognize, so the image recognition unit 20 recognizes them more accurately than other objects.

When no reference marker is available as the common object, any object recognized by both the own terminal and the other terminal may be used as object a. A reference marker may be unavailable either because the preview image acquired by the image acquisition unit 10 contains no reference marker in the first place, or because the preview image contains a reference marker that at least one of the own terminal and the other terminal has failed to recognize.

The relative posture estimation described above using formula (7) covers the case of two terminals, the own terminal S and the other terminal T. When three or more terminals are present, the relative posture can also be estimated as follows. Suppose, for example, that the three terminals are the own terminal S, the other terminal T, and the other terminal U, and that the relative posture W_ST between the own terminal S and the other terminal T and the relative posture W_TU between the other terminal T and the other terminal U have both been obtained. The relative posture W_SU between the own terminal S and the other terminal U can then be obtained by formula (8).

Consequently, even when no object is recognized by both the own terminal S and the other terminal U, the relative posture W_SU between the own terminal S and the other terminal U can be obtained by using formula (8) instead of formula (7). In this case, however, the relative posture W_TU between the other terminal T and the other terminal U must be input to the cooperative recognition processing unit 60 from at least one of the other terminal T and the other terminal U.
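Formula (8) is likewise not reproduced in this text. If each relative posture is assumed to be a 4×4 homogeneous transform that maps the camera coordinates of the terminal named by the second subscript to those of the terminal named by the first, an assumption the text does not confirm, chaining the two known relative postures would give:

```latex
W_{SU} = W_{ST} \, W_{TU}
```

The same chaining extends to longer sequences of terminals, as long as each consecutive pair has a known relative posture.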

The posture conversion unit 62 receives as input the recognition result from the image recognition unit 20 of the other terminal and the relative posture W_ST estimated by the relative posture estimation unit 61. Using the relative posture W_ST, the posture conversion unit 62 converts the recognition result of the other terminal into a recognition result referenced to the own terminal.

Suppose that the recognition result of the other terminal T includes a recognition result for an object b that the own terminal S has not recognized, and that the posture matrix of object b estimated by the posture tracking unit 23 of the other terminal T is denoted W_Tb. Formula (9) then converts the posture matrix W_Tb of object b estimated by the posture tracking unit 23 of the other terminal T into the posture matrix W_Sb of object b at the own terminal S, which can be used as the recognition result for object b at the own terminal S.

In this way, the posture conversion unit 62 of the own terminal S can recognize even an object b that has not been recognized by the image recognition unit 20 of the own terminal S, based on the recognition result from the image recognition unit 20 of the other terminal T and the relative posture between the own terminal S and the other terminal T.
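The conversion can be sketched in code under stated assumptions. Formulas (7) and (9) are not reproduced in this text, so the sketch assumes that posture matrices are 4×4 rigid transforms from object coordinates to a terminal's camera coordinates, which gives W_ST = W_Sa · W_Ta⁻¹ and W_Sb = W_ST · W_Tb. The function names and the nested-list matrix representation are hypothetical.

```python
def mat_mul(a, b):
    """Product of two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(m):
    """Inverse of a 4x4 rigid transform [R | t], namely [R^T | -R^T t]."""
    rot_t = [[m[j][i] for j in range(3)] for i in range(3)]
    trans = [m[i][3] for i in range(3)]
    inv_trans = [-sum(rot_t[i][k] * trans[k] for k in range(3)) for i in range(3)]
    rows = [rot_t[i] + [inv_trans[i]] for i in range(3)]
    return rows + [[0, 0, 0, 1]]


def relative_posture(w_sa, w_ta):
    """Assumed form of formula (7): W_ST = W_Sa * inverse(W_Ta)."""
    return mat_mul(w_sa, rigid_inverse(w_ta))


def convert_posture(w_st, w_tb):
    """Assumed form of formula (9): W_Sb = W_ST * W_Tb."""
    return mat_mul(w_st, w_tb)
```

Given the postures of a shared object a at both terminals, `relative_posture` yields W_ST, and `convert_posture` then expresses any object seen only by terminal T in the camera coordinates of terminal S.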

The posture conversion unit 62 also integrates this recognition result for object b at the own terminal S with the recognition result from the image recognition unit 20 of the own terminal S (the recognition result for object a at the own terminal S) to produce an integrated recognition result. The posture conversion unit 62 thereby obtains recognition results at the own terminal S for both object a and object b.

Using the relative posture as described above, every object included in the recognition result of the other terminal can be converted from the other terminal's recognition result into a recognition result referenced to the own terminal. For the object that was used to obtain the relative posture, however, the conversion simply reproduces the own terminal's recognition result for that object. Converting that particular object with the relative posture is therefore pointless.

For an object recognized by both the own terminal and the other terminal, either the own terminal's recognition result or the other terminal's recognition result converted with the relative posture can be used. In this embodiment, the own terminal's recognition result takes priority, and the converted recognition result of the other terminal is used only for objects that the own terminal has not recognized. Objects that the own terminal has not recognized are those for which the own terminal performed recognition but failed, and those for which the own terminal did not perform recognition at all.
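The priority rule can be sketched as a simple merge. The dictionary layout, in which an object id maps to a posture and `None` marks a failed recognition, is a hypothetical representation chosen for illustration, as is the function name.

```python
def integrate_results(own, converted_from_other):
    """Merge recognition results, giving the own terminal priority.

    own: dict of object id -> posture, with None for a failed recognition.
    converted_from_other: dict of object id -> posture, already converted
        into the own terminal's reference frame.
    """
    # Keep every object that the own terminal actually recognized.
    integrated = {obj: posture for obj, posture in own.items() if posture is not None}
    # Fill in only the objects that the own terminal failed on or never tried.
    for obj, posture in converted_from_other.items():
        integrated.setdefault(obj, posture)
    return integrated
```

Both kinds of unrecognized object described above are covered: an entry set to `None` (recognition attempted but failed) and an absent key (recognition never attempted) are each filled from the other terminal's converted result.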

The virtual information display unit 70 superimposes virtual information on the preview image based on the recognition result from the posture conversion unit 62 in addition to the recognition results from the image recognition unit 20 and the object relationship estimation unit 30.

In addition to the effects of the image processing apparatus 1A described above, the image processing apparatus 1B provides the following effects.

In the image processing apparatus 1B, the cooperative recognition processing unit 60 converts the recognition result for an object recognized by another image processing apparatus into a recognition result referenced to the image processing apparatus 1B. The virtual information display unit 70 then superimposes virtual information on the preview image acquired by the image acquisition unit 10, based on the recognition result from the image recognition unit 20, the recognition result from the object relationship estimation unit 30, and the recognition result from the cooperative recognition processing unit 60. Because recognition results from other image processing apparatuses can thus also be used for superimposing virtual information on the preview image, the processing load in the AR technology can be further reduced and the robustness of object recognition can be further improved.

The present invention can be realized by recording the processing of the image processing apparatuses 1, 1A, and 1B as a program on a computer-readable non-transitory recording medium and having the image processing apparatuses 1, 1A, and 1B read and execute the program recorded on that recording medium.

Examples of the recording medium include nonvolatile memories such as EPROM and flash memory, magnetic disks such as hard disks, and CD-ROMs. The program recorded on the recording medium is read and executed by a processor provided in the image processing apparatuses 1, 1A, and 1B.

The program may also be transmitted from the image processing apparatuses 1, 1A, and 1B, which store the program in a storage device or the like, to another computer system via a transmission medium or by a transmission wave in a transmission medium. Here, the "transmission medium" that transmits the program is a medium having a function of transmitting information, such as a network (communication network) like the Internet or a communication line like a telephone line.

The program may implement only part of the functions described above. It may also be a so-called difference file (difference program) that implements those functions in combination with a program already recorded in the image processing apparatuses 1, 1A, and 1B.

Embodiments of the present invention have been described in detail above with reference to the drawings. The specific configuration is not limited to these embodiments, however, and also covers designs and the like that do not depart from the gist of the invention.

For example, in each of the embodiments above, two-dimensional barcodes are shown as the objects in FIGS. 2 to 4, but the objects are not limited to these and may be any figures, characters, physical objects, or the like.

In the second embodiment described above, when recognition of the main object by the image recognition unit 20 fails, the recognition processing control unit 40 may resume recognition by the image recognition unit 20 for the objects other than the main object classified into the same group as the main object. In this way, recognition by the image recognition unit 20 can be resumed when recognition by the object relationship estimation unit 30 is no longer possible, which further improves the robustness of object recognition.

In each of the embodiments above, the image recognition unit 20 may add, to the recognition result for each object, information serving as an index of the recognition accuracy of that result, and the object relationship estimation unit 30 may determine that the relative posture between objects whose recognition accuracy index is at or above a threshold is stable. Furthermore, the object with the highest recognition accuracy may be used as the main object. These measures allow the objects to be classified with the recognition accuracy of their recognition results taken into account. Examples of the recognition accuracy index include the shooting distance or shooting angle relative to the object, the number of matches or the matching score of local features, and, when a template matching method such as SSD (Sum of Squared Difference) or NCC (Normalized Cross Correlation) is used, the SSD or NCC response value itself.
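A minimal sketch of both uses of the accuracy index follows. It assumes a dictionary that maps object ids to an index where larger values mean higher accuracy (as with a matching score or NCC response; an SSD response would need its sign flipped). The function names are hypothetical.

```python
def choose_main_by_accuracy(accuracy):
    """Pick the object with the highest recognition accuracy index."""
    return max(accuracy, key=accuracy.get)


def stable_pair_candidates(accuracy, threshold):
    """List object pairs whose accuracy indices are both at or above the threshold.

    Only the relative postures between such pairs would be treated as stable.
    """
    reliable = sorted(obj for obj, score in accuracy.items() if score >= threshold)
    return [(a, b) for i, a in enumerate(reliable) for b in reliable[i + 1:]]
```

For example, with indices `{"M1": 0.9, "M2": 0.4, "M3": 0.7}` and a threshold of 0.6, only the pair (M1, M3) would qualify, and M1 would be chosen as the main object.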

In each of the embodiments above, the image recognition unit 20 may also add, to the recognition result for each object, information serving as an index of the processing load required to recognize that object, and the object relationship estimation unit 30 may use as the main object an object whose processing load index is below a threshold. The object relationship estimation unit 30 can then recognize objects through their relationship to an object with a low processing load, which further reduces the processing load. Examples of the processing load index include the time required for recognition and a value set according to the type of object.
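Selecting a main object by processing load could be sketched as below. The text only requires the index to be below a threshold, so choosing the lowest-load candidate among the qualifying objects is an assumption, as are the function name and the dictionary of per-object load values.

```python
def choose_main_by_load(load_index, threshold):
    """Return a main-object candidate whose load index is below the threshold.

    load_index: dict of object id -> processing load index (for example,
        recognition time). Returns None when no object qualifies.
    """
    candidates = {obj: load for obj, load in load_index.items() if load < threshold}
    if not candidates:
        return None
    # Assumption: among qualifying objects, prefer the cheapest one to recognize.
    return min(candidates, key=candidates.get)
```

Returning `None` leaves the caller free to fall back to another selection rule, such as the recognition accuracy index.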

DESCRIPTION OF SYMBOLS
1, 1A, 1B; image processing apparatus
10; image acquisition unit
20; image recognition unit
30; object relationship estimation unit
40; recognition processing control unit
50; recognition result sharing processing unit
60; cooperative recognition processing unit
70; virtual information display unit
C1, C2, C3; virtual information
M1, M2, M3; object

Claims

An image processing device for superimposing virtual information on a preview image, comprising:
image acquisition means for acquiring the preview image;
image recognition means for recognizing objects in the preview image acquired by the image acquisition means;
object relationship estimation means for estimating a relationship between the objects recognized by the image recognition means, classifying the objects into groups based on the estimation result, and recognizing the objects other than a main object classified into a group based on the result of recognition, by the image recognition means, of the main object, the main object being one of the objects classified into that group; and
virtual information display means for superimposing virtual information on the preview image acquired by the image acquisition means, based on the recognition result from the image recognition means and the recognition result from the object relationship estimation means.
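One way to read the object relationship estimation means is as pose composition: once the relative posture between a main object and another object in the same group is known, the other object's pose follows from the main object's pose alone. The sketch below assumes poses are 4x4 homogeneous rigid transforms (camera from object) stored as nested lists. The function names and the `pose_z` helper are hypothetical and are not taken from the claims.

```python
import math


def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(t):
    """Invert a rigid transform [R | p]: the inverse is [R^T | -R^T p]."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [t[i][3] for i in range(3)]
    q = [-sum(r_t[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [r_t[0] + [q[0]], r_t[1] + [q[1]], r_t[2] + [q[2]],
            [0.0, 0.0, 0.0, 1.0]]


def relative_pose(cam_from_main, cam_from_other):
    """Pose of the other object expressed in the main object's frame."""
    return matmul(rigid_inverse(cam_from_main), cam_from_other)


def infer_pose(cam_from_main_now, main_from_other):
    """Recover the other object's camera pose from the main object's pose."""
    return matmul(cam_from_main_now, main_from_other)


def pose_z(theta, x, y, z):
    """Helper: rotation by theta about the z axis followed by a translation."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, x], [s, c, 0.0, y],
            [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]
```

Because the relative pose is independent of the camera, it can be estimated once while both objects are recognized directly and then reused in later frames in which only the main object is recognized.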

The image processing device according to claim 1, wherein the object relationship estimation means:
treats an object recognized by the image recognition means as the main object; and
treats an object that is classified into the same group as the main object and that could not be recognized by the image recognition means as an object other than the main object in that group.

The image processing device according to claim 1, wherein the object relationship estimation means selects one object from each group as the main object,
the image processing device further comprising recognition processing control means for pausing recognition by the image recognition means of the objects other than the main object that are classified into the same group as the main object.

The image processing device according to claim 3, wherein, when recognition of the main object by the image recognition means fails, the recognition processing control means resumes recognition by the image recognition means of the objects other than the main object that are classified into the same group as the main object.
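The pause-and-resume behaviour of the recognition processing control means can be sketched as a small state holder. The class name, method names, and data layout are assumptions made for illustration. Re-pausing the other objects once the main object is recognized again is also an assumption, since the claims only state when recognition is paused and when it is resumed.

```python
class RecognitionController:
    """Tracks which objects the image recognizer should currently process."""

    def __init__(self, groups, main_objects):
        # groups: {group_id: [object ids]}; main_objects: {group_id: object id}
        self.groups = groups
        self.main = main_objects
        self.paused = set()
        for group_id, members in groups.items():
            # Only the main object of each group is recognized directly.
            self.paused.update(m for m in members if m != main_objects[group_id])

    def active_objects(self):
        """Objects that still go through direct image recognition."""
        return sorted(o for members in self.groups.values()
                      for o in members if o not in self.paused)

    def report_main_result(self, group_id, success):
        """Update the paused set after attempting to recognize a main object."""
        others = [m for m in self.groups[group_id] if m != self.main[group_id]]
        if success:
            self.paused.update(others)             # keep relying on the main object
        else:
            self.paused.difference_update(others)  # resume direct recognition
```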

The image processing device according to claim 3, wherein the recognition processing control means collates the recognition result from the object relationship estimation means, for an object whose recognition by the image recognition means is paused, against the preview image acquired by the image acquisition means, and, if the collation fails, resumes recognition of that object by the image recognition means.

The image processing device according to claim 3, wherein, when resuming recognition by the image recognition means, the recognition processing control means causes the image recognition means to track the object using, as an initial value, the recognition result obtained by the object relationship estimation means for the previously acquired preview image.

The image processing device, wherein the recognition processing control means:
projects the object onto the preview image acquired by the image acquisition means, based on the recognition result from the object relationship estimation means for an object whose recognition by the image recognition means is paused, to create a projection image; and
determines that the collation has failed if the similarity between the projection image and the preview image acquired by the image acquisition means is less than a threshold.

The image processing device according to claim 7, wherein the recognition processing control means estimates, by iterative calculation, the posture that maximizes the similarity, and corrects the recognition result from the object relationship estimation means accordingly.

The image processing device according to claim 3, wherein the recognition processing control means:
projects the object onto the preview image acquired by the image acquisition means, based on the recognition result from the object relationship estimation means for an object whose recognition by the image recognition means is paused, to create a projection image; and
estimates a matching location by template matching between the projection image and the preview image acquired by the image acquisition means, and determines that the collation has failed if the response value at the matching location is less than a threshold.
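The template-matching collation can be sketched in pure Python as below. The claim does not fix the response measure, so the use of zero-mean normalized cross-correlation is an assumption, as are the function names and the grayscale nested-list image format.

```python
import math


def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # a flat patch carries no structure to correlate
    return sum(x * y for x, y in zip(da, db)) / denom


def match_template(image, template):
    """Slide the template over the image; return (best score, (row, col))."""
    th, tw = len(template), len(template[0])
    flat_t = [v for row in template for v in row]
    best_score, best_pos = -2.0, (0, 0)
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            patch = [image[y + dy][x + dx]
                     for dy in range(th) for dx in range(tw)]
            score = ncc(patch, flat_t)
            if score > best_score:
                best_score, best_pos = score, (y, x)
    return best_score, best_pos


def collation_failed(preview, projection, threshold):
    """Collation fails when the best response value is below the threshold."""
    score, _ = match_template(preview, projection)
    return score < threshold
```

In practice the projection image would be rendered from the pose estimated by the object relationship estimation means, and the search would be restricted to a neighbourhood of the projected position rather than the whole preview image.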

The image processing device according to claim 1, further comprising cooperative recognition processing means for converting a recognition result for an object recognized by a first image processing device, different from the image processing device, into a recognition result referenced to the image processing device,
wherein the virtual information display means superimposes virtual information on the preview image acquired by the image acquisition means, based on the recognition result from the image recognition means, the recognition result from the object relationship estimation means, and the recognition result from the cooperative recognition processing means.

The image processing device according to claim 1, wherein the object relationship estimation means obtains, as the relationship between the objects recognized by the image recognition means, a relative posture indicating the relative positional relationship between the objects.

The image processing device according to any one of the above claims, wherein the object relationship estimation means:
obtains, each time a preview image is acquired by the image acquisition means, the relative posture between objects in that preview image; and
classifies into the same group objects for which the amount of change in the relative posture between preview images remains less than a threshold continuously over a predetermined number of preview images.

The image processing device according to claim 12, wherein the object relationship estimation means calculates, as the amount of change, the difference between the relative posture obtained from the latest preview image acquired by the image acquisition means and the average of the relative postures obtained from the preview images preceding the latest preview image.
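The grouping criterion, in which the change is measured against the running average and must stay below a threshold for a set number of consecutive preview images, can be sketched per object pair as follows. The sketch makes several simplifying assumptions that are not in the claims: the relative posture is reduced to a plain coordinate vector, the difference is measured as Euclidean distance, and the history is discarded whenever the pair becomes unstable.

```python
import math


class PairStability:
    """Decides whether two objects move together across preview images."""

    def __init__(self, change_threshold, required_frames):
        self.change_threshold = change_threshold
        self.required_frames = required_frames
        self.history = []    # relative postures from earlier preview images
        self.stable_run = 0  # consecutive frames with a small change

    def update(self, relative_posture):
        """Feed the latest relative posture; return True once the pair is grouped."""
        if self.history:
            n = len(self.history)
            mean = [sum(p[i] for p in self.history) / n
                    for i in range(len(relative_posture))]
            # Amount of change: latest value versus the average of earlier ones.
            change = math.dist(relative_posture, mean)
            if change < self.change_threshold:
                self.stable_run += 1
            else:
                self.stable_run = 0
                self.history = []  # restart the average after a jump
        self.history.append(tuple(relative_posture))
        return self.stable_run >= self.required_frames
```

Comparing against the average rather than the immediately preceding frame makes the test less sensitive to single-frame jitter in the recognition results.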

The image processing device according to claim 1, wherein the image recognition means adds, to the recognition result for each object, information serving as an index of the recognition accuracy of that result, and
the object relationship estimation means determines that the relative posture between objects whose recognition accuracy index, as added by the image recognition means, is equal to or greater than a threshold is stable.

The image processing device according to claim 1, wherein the image recognition means adds, to the recognition result for each object, information serving as an index of the recognition accuracy of that result, and
the object relationship estimation means selects as the main object the object with the highest recognition accuracy index added by the image recognition means.

The image processing device according to claim 14, wherein the image recognition means uses at least one of the shooting distance to an object and the shooting angle relative to the object as the index of recognition accuracy.

The image processing device according to claim 1, wherein the image recognition means uses at least one of the number of local feature matches and the matching score of the local features as the index of recognition accuracy.

The image processing device according to any one of claims 14 to 17, wherein the image recognition means uses at least one of an SSD (Sum of Squared Differences) response value and an NCC (Normalized Cross-Correlation) response value as the index of recognition accuracy.
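For reference, the two response values named in this claim are commonly defined as follows for a template T and an equally sized image patch I. The symbols I, T, and the pixel index (x, y) are notation introduced here and do not appear in the claims.

```latex
\mathrm{SSD} = \sum_{x,y} \bigl( I(x,y) - T(x,y) \bigr)^{2},
\qquad
\mathrm{NCC} = \frac{\sum_{x,y} I(x,y)\, T(x,y)}
                    {\sqrt{\sum_{x,y} I(x,y)^{2} \, \sum_{x,y} T(x,y)^{2}}}
```

A smaller SSD indicates a better match, whereas an NCC closer to 1 indicates a better match. An SSD-based accuracy index would therefore presumably be negated or inverted before being compared against a lower-bound threshold.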

The image processing device according to any one of claims 1 to 18, wherein the image recognition means adds, to the recognition result for each object, information serving as an index of the processing load required to recognize the object, and
the object relationship estimation means selects as the main object an object whose processing load index, as added by the image recognition means, is less than a threshold.

The image processing device according to claim 19, wherein the image recognition means uses the time required for recognition as the index of the processing load.

The image processing device according to claim 19, wherein the image recognition means sets a value corresponding to the type of object as the index of the processing load.

An image processing method for an image processing device that comprises image acquisition means, image recognition means, object relationship estimation means, and virtual information display means and that superimposes virtual information on a preview image, the method comprising:
a first step in which the image acquisition means acquires the preview image;
a second step in which the image recognition means recognizes objects in the preview image acquired in the first step;
a third step in which the object relationship estimation means estimates a relationship between the objects recognized in the second step, classifies the objects into groups based on the estimation result, and recognizes the objects other than a main object classified into a group based on the result of recognizing the main object in the second step, the main object being one of the objects classified into that group; and
a fourth step in which the virtual information display means superimposes virtual information on the preview image acquired in the first step, based on the recognition result of the second step and the recognition result of the third step.

A program for causing a computer to execute an image processing method for an image processing device that comprises image acquisition means, image recognition means, object relationship estimation means, and virtual information display means and that superimposes virtual information on a preview image, the program causing the computer to execute:
a first step in which the image acquisition means acquires the preview image;
a second step in which the image recognition means recognizes objects in the preview image acquired in the first step;
a third step in which the object relationship estimation means estimates a relationship between the objects recognized in the second step, classifies the objects into groups based on the estimation result, and recognizes the objects other than a main object classified into a group based on the result of recognizing the main object in the second step, the main object being one of the objects classified into that group; and
a fourth step in which the virtual information display means superimposes virtual information on the preview image acquired in the first step, based on the recognition result of the second step and the recognition result of the third step.