JPH07220090A

JPH07220090A - Object recognition method

Info

Publication number: JPH07220090A
Application number: JP6010806A
Authority: JP
Inventors: Masakazu Matsugi; 優和真継; Katsumi Iijima; 克己飯島
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1994-02-02
Filing date: 1994-02-02
Publication date: 1995-08-18
Anticipated expiration: 2015-08-21
Also published as: JP3078166B2

Abstract

PURPOSE: To perform pattern recognition based on a limited number of feature elements and information on their relative arrangement. CONSTITUTION: An image input part S11 records and holds an input image; a local feature element extraction part S12 extracts local feature elements in the image; an extracted feature element arrangement data generating part S13 generates arrangement information of the local feature elements; a local feature element model arrangement data storage part S14 stores combination arrangement information of the local feature elements as stored information; a matching processing part S15 collates the generated arrangement information with the stored information to make a determination; and an adaptive image area extraction part S16 detects and extracts the region of the image in which the determined recognition information exists.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a pattern recognition method for imaging centered on a specific subject and for editing images.

[0002]

[Prior Art] As one conventional method of representing graphic patterns and the like, a technique is known, for example the graphic input method of Japanese Examined Patent Publication No. 5-36830, in which the bent portions of the strokes of a geometric graphic symbol are represented by one of various bend patterns (shape primitives) prepared in advance and the curved portions are approximated by circular arcs. This technique can be applied to the recognition of geometrically simple figures.

[0003] As one object recognition method, the object recognition device of Japanese Examined Patent Publication No. 5-23463 traces the contour of the object to be recognized, divides it into shape primitives such as straight-line segments and circular arcs, registers the attributes of each primitive and of each vertex in memory as a dictionary, and performs recognition by looking up each shape primitive of an unknown object in the dictionary memory.

[0004]

[Problems to Be Solved by the Invention] However, the above conventional examples do not extract the spatial arrangement of the shape primitives for recognition. Consequently, when the same object yields different image patterns depending on the viewpoint position, or when its shape or size changes for some reason, an enormous number of two-dimensional patterns must be stored for that one object and matched against at recognition time, which makes the computational cost very high.

[0005] In addition, when a plurality of objects are present in an image, the image generally has to be segmented appropriately in advance, and the recognition processing was applied only after segmenting it so that each region contained a single object.

[0006] Such region segmentation and recognition of the target pattern are two sides of the same coin, and performing them fully automatically has been very difficult.

[0007] In view of these points, an object of the present invention is to perform pattern recognition based on a limited number of feature elements and information on the relative arrangement of those feature elements.

[0008]

[Means for Solving the Problems] The object recognition method of the present invention records and holds an input image; extracts local feature elements in the image; generates arrangement information of the local feature elements; stores, as stored information, combination arrangement information of the local feature elements of the object to be recognized; collates the generated arrangement information of the local feature elements with the stored information to make a determination; and determines and extracts the region of the image in which the determined recognition information exists.

[0009] The object recognition method of the present invention records and holds an input image; extracts local feature elements in the image; extracts region-based information such as the color, local spatial frequency, and intensity of the region near each local feature element; generates arrangement information of the local feature elements and the region-based information; stores, as stored information, combination arrangement information of the local feature elements of the object to be recognized; and collates the generated arrangement information with the stored information to make a determination.

[0010] The object recognition method of the present invention records and holds an input image; extracts local feature elements in the image; stores, as first stored information, model graphic elements of the local feature elements of the object to be recognized; extracts intermediate graphic elements of the local feature elements from the extracted local feature elements and the first stored information; generates arrangement information of the intermediate graphic elements; stores, as second stored information, combination arrangement information of the model graphic elements of the object to be recognized; and collates the generated arrangement information of the intermediate graphic elements with the second stored information to make a determination.

[0011] The local feature elements extracted are intersection patterns of edge segments in multiple directions, all or part of curves of constant curvature, and edge segments.

[0012] The arrangement information of the local feature elements is expressed as a two-dimensional or three-dimensional array of numerical elements, each obtained by assigning to a local feature element a numerical value discretized by a predetermined method.

[0013] The combination arrangement information of the local feature elements is expressed by the pattern of feature elements obtained by rearranging the extracted local feature elements on a lattice space made up of units of a predetermined size and shape.

[0014] The local feature elements are extracted separately for each of a plurality of scaling parameters of different magnitudes.

[0015]

[Action]

(a) Local feature elements in the input image are extracted to generate arrangement information, which is collated with the prestored combination arrangement information of the local feature elements of the object to be recognized to determine the recognition information; the region of the input image in which the recognition information exists is then determined and extracted. In doing so, intersection patterns of edge segments in multiple directions, all or part of curves of constant curvature, and edge segments are extracted as local feature elements for each of a plurality of scaling parameters of different magnitudes. The arrangement information of the local feature elements is expressed as a two-dimensional array of discretized numerical elements. Furthermore, the combination arrangement information of the local feature elements is expressed by the pattern of feature elements obtained by rearranging the extracted local feature elements on a lattice space made up of units of a predetermined size and shape. This method reduces the memory capacity required for the image data to be recognized and improves the efficiency of the recognition processing.

[0016] (b) The arrangement information of the local feature elements is extended to a three-dimensional array of numerical elements. As a result, when the same object is recognized from an arbitrary viewpoint position (coping with changes in the viewpoint position relative to the image), or when recognition must cope with changes in the illumination conditions at the time of imaging, the types of local feature elements extracted do not change sensitively, and object recognition that is hardly affected by deformation of the object in the image becomes possible.

[0017] (c) Region-based information such as the color, local spatial frequency, and intensity of the region near each local feature element is extracted, and arrangement information of the local feature elements and the region-based information is generated. As a result, robust recognition can be performed without prior region segmentation, even when several objects are present in the image and parts of them overlap or touch so that the original shape of an object is strongly deformed, for example partly missing or hidden.

[0018] This makes it possible to output which object to be recognized is located at which position in the image, and thus to output the information needed to perform the following efficiently and robustly: imaging centered on that position, extraction from the original image of a partial image centered on the target image, imaging centered on a specific target, or image editing such as compositing an image containing the specific target with another image.

[0019] (d) By extracting intermediate graphic elements of the local feature elements and generating arrangement information of the intermediate graphic elements, recognition based on hierarchical feature extraction can be performed. This gives robust recognition that is hardly affected even in images in which several objects overlap one another.

[0020]

[Embodiments] Embodiments of the present invention will be described with reference to the drawings.

[0021] FIG. 1 is a block diagram of the processing units in the first embodiment of the present invention. In FIG. 1, the image input unit S₁₁ records and holds, on a predetermined recording medium, the image data obtained by the imaging means. The local feature element extraction unit S₁₂ extracts, in each region of the image, local feature element patterns drawn from a finite set of local feature elements at a plurality of scales (sizes) set in advance by the scaling parameter σ. Examples are intersection patterns of various edge segments (L-type, T-type, X-type, Y-type intersections, and so on) and line (curve) segments such as curve segments having various (constant) curvatures and orientations. Nothing other than the extracted local feature elements is retained as data. The extracted feature element array data generation unit S₁₃ takes the image data of the local feature elements extracted by S₁₂ and converts each local feature element, in a predetermined data format, onto a preset two-dimensional array structure (cell array), generating array data that preserves the approximate arrangement. The local feature element model array data storage unit S₁₄ stores model array data (more than one set is possible) as the local feature element patterns of the images to be recognized. The model array data stored in S₁₄ is used as a kind of template by the matching processing unit S₁₅. S₁₅ evaluates an error measure, typified by the sum of squared differences between the array data from S₁₃ and the model array data from S₁₄, and judges the model array data whose error is at or below a threshold to be the recognized pattern. Finally, the matching image region extraction unit S₁₆ determines and extracts the region of the original image in which the recognized pattern exists.

[0022] The processing performed in each unit from S₁₂ onward is described below.

[0023] FIG. 2 shows examples of extracted local feature element patterns. Methods for extracting the edge segment intersection patterns that S₁₂ extracts as local feature elements include those of Deriche, R. and Giraudon, G. (1993) (International Journal of Computer Vision, Vol. 10, 101-124), Rohr, K. and Schnoerr, C. (1993) (Image and Vision Computing, Vol. 11, 273-277), and Iso and Shizawa (1993) (IEICE Technical Report, Vol. IE92-125, pp. 33-40), but the method is not particularly limited here. In FIG. 2, the L-type intersections are limited to a finite number of elements with different orientations (eight here): L₁, L₂, ...,

[0024]

[External character 1]. The intersection angle β satisfies 0° < β < 180°, and the L-type intersections are divided into eight types not by intersection angle but by orientation (the direction of the bisector of the intersection angle). T-type, X-type, Y-type, and arrow-type intersections, which are obtained by combining L-type intersections, can also be extracted by the methods presented in Rohr, K. and Schnoerr, C. (1993) and elsewhere.
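The classification of L-type intersections by bisector direction rather than by intersection angle can be sketched as follows. This is a minimal illustration, not the extraction method of the cited papers: the function name `l_junction_label`, the representation of a junction by the directions of its two arms in degrees, and the assignment of label 1 to the bin centered on 0° are all assumptions made for the example.

```python
def l_junction_label(theta1_deg, theta2_deg, n_bins=8):
    """Classify an L-type junction by the direction of its bisector.

    theta1_deg and theta2_deg are the directions (in degrees) in which the
    two edge segments leave the corner. Returns (label, beta), where label
    is in 1..n_bins and beta is the intersection angle in degrees.
    The labeling convention is hypothetical.
    """
    # Counter-clockwise angle from arm 1 to arm 2, in [0, 360).
    diff = (theta2_deg - theta1_deg) % 360.0
    # Intersection angle beta; the text requires 0 < beta < 180.
    beta = diff if diff <= 180.0 else 360.0 - diff
    if not 0.0 < beta < 180.0:
        raise ValueError("arms must be neither parallel nor collinear")
    # The bisector lies beta/2 counter-clockwise from the 'earlier' arm.
    start = theta1_deg if diff <= 180.0 else theta2_deg
    bisector = (start + beta / 2.0) % 360.0
    # Quantize the bisector direction into n_bins equal sectors.
    width = 360.0 / n_bins
    label = int(((bisector + width / 2.0) % 360.0) // width) + 1
    return label, beta
```

With this sketch, a right-angle corner with arms at 0° and 90° and a sharp corner with arms at 30° and 60° receive the same label, because both bisectors point at 45°; only the returned β differs.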

[0025] Methods for extracting the other kind of local feature element, curve elements of constant curvature, are described in Koenderink, J. and Richards, W. (1988) (J. Opt. Soc. Am. A, Vol. 5, pp. 1136-1141), Li, S. Z. (1990) (International Journal of Computer Vision, Vol. 5, pp. 161-194), and elsewhere. In FIG. 2, the curve elements of constant curvature, that is, circular arcs, are classified by direction, defined as the direction of the inward normal vector at the midpoint of the arc, and are limited to a finite number of elements (eight here): C_v1, C_v2, ...,

[0026]

[External character 2].

[0027] Further, the scaling parameter σ used when extracting the above intersection patterns or curvature elements is set to a finite number of discrete values (for example, five values: σ = 2, 4, 8, 16, and 32 pixels), and the local feature elements are extracted for each scaling parameter. This σ represents the degree of smoothing performed when extracting the intersection patterns or curvature elements, for example by convolution with the Gaussian function of Equation 1.

[0028]

[Equation 1]

The processing of S₁₂ extends as far as extracting, from among the preset local feature elements, the one closest to the observed pattern and encoding it.
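The role of σ as a smoothing scale can be illustrated with a one-dimensional sketch. The exact form of Equation 1 is not reproduced in this text, so the code below assumes the standard normalized Gaussian; the names `SCALES`, `gaussian_kernel`, and `smooth`, the truncation of the kernel at 3σ, and the border clamping are all choices made for the example.

```python
import math

# Discrete scaling parameters in pixels, following the example in the text.
SCALES = (2, 4, 8, 16, 32)


def gaussian_kernel(sigma, radius_factor=3):
    """Sampled 1-D Gaussian, truncated at radius_factor * sigma (assumption)
    and normalized so that the weights sum to 1."""
    radius = int(radius_factor * sigma)
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Convolve a 1-D signal with the Gaussian kernel for the given sigma.
    Samples beyond the borders are replaced by the nearest border value."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index to the signal
            acc += w * signal[j]
        out.append(acc)
    return out
```

Applying `smooth` to the same step edge with each value in `SCALES` gives progressively flatter edges, which is why features extracted at a larger σ describe coarser structure.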

[0029] FIG. 3 shows an example of encoding a face image using the local feature elements of FIG. 2; S₁₂ encodes the face image at a given scaling parameter σ.

[0030] Next, S₁₃ expresses the spatial arrangement of the encoded local feature elements by mapping them onto a lattice space made up of cells of a preset size and shape. FIG. 4 shows an example of the lattice space used to represent the array of encoded local feature elements. In FIG. 4, the lattice space is divided into eight directions, N, E, W, S, NE, NW, SW, and SE, and the size of the rectangular cells is set to about the same as the scaling parameter σ. Reconstructing the data in this way as a rough arrangement of local feature elements yields an image representation that is invariant to deformation of the original image. Moreover, by extracting this invariant representation for each scaling parameter σ, the image to be recognized can be stored in advance as a common local feature element pattern model that does not depend on the spatial size of the relative arrangement of the encoded local feature elements.
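The mapping onto a lattice whose cell size tracks σ can be sketched as below. The function `to_cell_array`, the `(x, y, code)` feature tuples, and the rule that a later feature overwrites an earlier one in the same cell are assumptions made for illustration; the text does not specify how collisions within a cell are handled.

```python
def to_cell_array(features, sigma, width, height):
    """Map encoded local features onto a lattice with cells of size sigma.

    features: iterable of (x, y, code) with pixel coordinates inside the
              image and an integer code >= 1 (hypothetical format).
    Returns a list of rows; a value of 0 marks a cell with no feature.
    """
    cols = -(-width // sigma)   # ceiling division
    rows = -(-height // sigma)
    grid = [[0] * cols for _ in range(rows)]
    for x, y, code in features:
        # Only the cell index is kept, so positions are quantized to sigma.
        grid[int(y // sigma)][int(x // sigma)] = code
    return grid
```

Because the cell size scales with σ, the same arrangement of features seen at twice the size and encoded at twice the σ falls into the same cells, which is the sense in which the stored model does not depend on image size.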

[0031] Thus the first embodiment represents an image by combining a small number of preset local feature elements with a limited number of matrix-like spatial arrangements. This improves the efficiency of the object recognition process (that is, reduces the computational cost) and enables object recognition that is hardly affected by changes in the size of the object in the image or by deformation.

[0032] Next, the encoding of the array data needed to recognize the local feature element array mapped onto the lattice space is described. In the first embodiment, S₁₅ performs recognition by matching the model array data against the extracted feature element array data generated from the actual image. To carry this out by numerical computation on a computer, each local feature element pattern must be converted to a number in some way. Let M be the total number of local feature elements, and assign each one a number, for example from 1 to M. The numbering scheme need not be restricted, but local feature elements in the same category (for example, L-type intersections with different orientations) should preferably receive consecutive or nearby numbers. Cells (array entries) that contain no local feature element may be given the value 0 or any value other than the assigned numbers. Once the local feature elements have been numerically encoded, ordinary template matching may be used for recognition. The method differs from conventional template-based methods, however, in that the model array data does not depend on the image size. That is, when local feature elements are encoded from the image at the scaling parameters σ₁, σ₂, ..., σ_n and matched against the model array data, the lattice size of the model array is virtually reduced or enlarged so that it matches the lattice size of the data extracted from the actual image. It is therefore unnecessary to prepare model array data of the local feature elements of the image to be recognized for each different lattice size.
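One simple way to satisfy the numbering guideline (codes 1 to M, with elements of the same category numbered consecutively and 0 reserved for empty cells) is sketched here. The function `build_code_table` and the category names are hypothetical; the text leaves the numbering scheme open.

```python
def build_code_table(categories):
    """Assign consecutive integer codes to local feature elements.

    categories: mapping from category name to the number of orientation
                variants in that category, in the desired order.
    Returns a dict {(category, variant_index): code} with codes 1..M,
    so that 0 remains free to mark empty cells.
    """
    table = {}
    next_code = 1
    for name, count in categories.items():
        for variant in range(1, count + 1):
            table[(name, variant)] = next_code
            next_code += 1
    return table
```

For example, `build_code_table({"L": 8, "arc": 8})` gives the eight L-type orientations the codes 1 to 8 and the eight arc orientations the codes 9 to 16, so M = 16.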

[0033] For example, in face image recognition, size-invariant model mask data is created in advance from local feature elements such as L-type intersections and curve elements for the parts needed for recognition, such as the eyes and mouth. Size-invariant model array data that preserves the relative positions of the eyes and mouth (reduced or enlarged according to the scaling parameter σ used when extracting the local feature elements) is then stored as a mask pattern, and each region of the image after local feature element extraction is scanned to compute the degree of match with the model array data by the least-squares method or the like. That is, let

[0034]

[External character 3] be the value of the cell (corresponding to a local feature element) at position (i, j) on the lattice space normalized by the scaling parameter σ, and let

[0035]

[External character 4] be the value of the cell at position (i, j) on the lattice space at the scaling parameter σ. The recognition computation is then defined, for example, by Equation 2.

[0036]

[Equation 2]

The position (k, p) at which this F(k, p) is at or below a predetermined threshold and at a local minimum (or at or above a threshold and at a local maximum) is found, and the location of the object to be recognized in the original image is thereby output. Here ‖x, y‖ is preferably a non-negative even function of (x − y), such as the absolute value of (x − y) or (x − y)²ⁿ (n = 1, 2, ...). In this case, y is recognized as being x when ‖x, y‖ is at or below the threshold. J denotes the range of array elements that the object to be recognized occupies on the lattice space; typically it may be set as (i = 1, 2, ..., q; j = 1, 2, ..., r).
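The scan described above can be sketched as a sliding-window sum of a non-negative even function of the cell differences. Equation 2 itself is not reproduced in this text, so the code is an assumed reading of it: the names `dissimilarity`, `match_scores`, and `best_match` are hypothetical, indices are 0-based rather than the 1-based range given for J, and the global minimum is taken as a simplification of the local-minimum search.

```python
def dissimilarity(x, y, n=1):
    """Non-negative even function of (x - y): (x - y) ** (2n)."""
    return (x - y) ** (2 * n)


def match_scores(model, data, n=1):
    """Score every offset (k, p) of a q x r model inside an N x M data array.

    Assumed form: F(k, p) is the sum over the model cells (i, j) of
    dissimilarity(model[i][j], data[k + i][p + j]).
    """
    q, r = len(model), len(model[0])
    rows, cols = len(data), len(data[0])
    scores = {}
    for k in range(rows - q + 1):
        for p in range(cols - r + 1):
            scores[(k, p)] = sum(
                dissimilarity(model[i][j], data[k + i][p + j], n)
                for i in range(q) for j in range(r))
    return scores


def best_match(model, data, threshold, n=1):
    """Return the offset with the smallest score, or None if even that
    score exceeds the threshold."""
    scores = match_scores(model, data, n)
    position = min(scores, key=scores.get)
    return position if scores[position] <= threshold else None
```

Because the arrays hold one small integer per cell rather than pixel values, the scan runs over far fewer elements than pixel-level template matching would.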

[0037] Alternatively, as the function F(k, p), the correlation between

[0038]

[External character 5] and

[0039]

[External character 6] may be computed. In this case, the model array data of block size q × r,

[0040]

[External character 7], is scanned over the data extracted from the image,

[0041]

[External character 8], to find the (k, p) at which the correlation value is at or above a threshold and at a local maximum. Furthermore, the processing result may be output to an imaging system or image editing system centered on the object to be recognized so that the system performs the desired functional operation.

[0042] FIG. 5 shows the structure of the three-dimensional lattice space in the second embodiment of the present invention. The second embodiment extracts and models the three-dimensional spatial arrangement of the local feature elements described in the first embodiment. Three-dimensional measurement techniques include extracting corresponding points by image processing from photographed images (for example, stereo photographs captured by two cameras separated by a predetermined distance), irradiating the object with a laser beam and measuring the phase of the reflected light, and projecting a structured pattern (a mesh pattern or the like) and measuring its degree of deformation.

[0043] In FIG. 5, the cells of the lattice space are the rectangles obtained by dividing a spherical surface equally in the longitude and latitude directions, but they may instead be obtained by dividing another solid (for example, a cylinder) into other shape units (for example, triangles). A three-dimensional lattice space of this kind can be applied when recognizing an image of the target object taken from an arbitrary viewpoint position. Even for the same object, the image obtained from one viewpoint generally differs from the image obtained from another. From a single two-dimensional image alone it is difficult to predict how the image pattern changes when the viewpoint position changes, and recording images from all viewpoint positions for use in recognition is nearly impossible. However, the three-dimensional spatial arrangement of a limited number of local feature elements can be mapped onto three-dimensionally discretized representative points (points on the lattice space) and used as the model data for matching. Measuring the degree of match with the actual image in the same domain (the lattice space) then dramatically improves the processing efficiency and reduces the memory needed to recognize images of a solid object from an arbitrary viewpoint position.

[0044] In the second embodiment, the model array data is an n × m array block obtained by assigning a numerical value (or symbol) unique to a local feature element to each cell of a region consisting of a finite number of cells covering the spherical surface (corresponding to the range visible when the object to be recognized is viewed from a given viewpoint). This model array data is scanned over the array data of the N × M array block (N > n, M > m) obtained from the actual image, and the same matching processing as in the first embodiment is performed.
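The spherical lattice with cells obtained by equal division in longitude and latitude can be sketched as a mapping from a three-dimensional feature position to a cell index. The function `sphere_cell`, the choice of the object's center as origin, and the default grid of 8 latitude by 16 longitude divisions are assumptions made for illustration; the text does not give the number of divisions.

```python
import math


def sphere_cell(x, y, z, n_lat=8, n_lon=16):
    """Map a 3-D point, given relative to the sphere's center, to the
    (latitude index, longitude index) of the lattice cell it projects onto.

    Latitude index 0 is at the south pole (z < 0) and n_lat - 1 at the
    north pole; longitude index 0 starts at the +x axis and increases
    toward +y. These conventions are hypothetical.
    """
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("the center of the sphere has no direction")
    lat = math.asin(z / norm)                 # in [-pi/2, pi/2]
    lon = math.atan2(y, x) % (2.0 * math.pi)  # in [0, 2*pi)
    # Equal division in latitude and longitude; clamp the upper edge.
    lat_idx = min(int((lat + math.pi / 2.0) / math.pi * n_lat), n_lat - 1)
    lon_idx = min(int(lon / (2.0 * math.pi) * n_lon), n_lon - 1)
    return lat_idx, lon_idx
```

Only the direction of the point matters, so features at different distances along the same ray from the center land in the same cell. The visible n × m block for a given viewpoint is then a contiguous range of these indices.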

[0045] FIG. 6 is a block diagram of the processing units in the third embodiment of the present invention. In FIG. 6, the image input unit S₆₁, the local feature element extraction unit S_62a, and the matching processing unit S₆₅ perform the same processing as S₁₁, S₁₂, and S₁₅ of FIG. 1, respectively. The region information extraction unit S_62b, like S_62a, works on blocks whose size corresponds to the scaling parameter σ, and for each block extracts region information such as the representative color, mean intensity, and local spatial frequency of the neighborhood containing the local feature elements. The image input from S₆₁ undergoes the predetermined processing in S_62a and S_62b. The array data generation unit S₆₃ generates array data from the local feature elements and region information extracted by S_62a and S_62b. The model array data storage unit S₆₄ stores model array data in which local feature elements and region information have been extracted for each block of the image to be recognized, the image having been divided in advance into rectangular blocks according to the scaling parameter σ.

[0046] Hereinafter, color is taken as an example of area information, and the description is limited to two-dimensional image recognition. As the representative color of each extracted block, the color vector defined below,

[0047]

[External character 9], is used.

[0048]

[Equation 3] Here,

[0049]

[External character 10] represents the output intensity of the R pixel of the sensor at pixel position (i, j) in the image, and

[0050]

[External character 11] likewise represent the output intensities of the G pixel and the B pixel. The symbol

[0051]

[External character 12] denotes summation of pixel values per block, taken over all pixel positions (i, j) within the same block.
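The per-block summation described above can be sketched as follows. This is a hedged illustration only: Equation 3 is not reproduced in this text, so the normalization used in `representative_color` (dividing each summed channel by the total of the three sums) is an assumption, as are the function names and the square block of side `size`, which stands in for a block sized according to the scaling parameter σ.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # (R, G, B) sensor output intensities at one pixel


def block_color_sums(image: List[List[Pixel]], top: int, left: int,
                     size: int) -> Tuple[float, float, float]:
    """Add the R, G and B outputs over every pixel position (i, j) inside one square block."""
    r = g = b = 0.0
    for i in range(top, top + size):
        for j in range(left, left + size):
            pr, pg, pb = image[i][j]
            r += pr
            g += pg
            b += pb
    return r, g, b


def representative_color(image: List[List[Pixel]], top: int, left: int,
                         size: int) -> Tuple[float, float, float]:
    """Normalize the block sums so the three components add up to 1 (assumed normalization)."""
    r, g, b = block_color_sums(image, top, left, size)
    total = r + g + b
    if total == 0:
        return (0.0, 0.0, 0.0)  # all-black block carries no color information
    return (r / total, g / total, b / total)
```

With this particular normalization the third component is determined by the other two, so a two-component vector would carry the same information; whether that is the intent of the two-dimensional representation mentioned later in the text cannot be confirmed from this section.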

[0052] In this way, the array data for the recognition process is generated in S63 on the basis of the local feature elements and the area information, such as the representative color, that S62a and S62b extract for each scaling parameter σ.

[0053] The model array data stored in S64 is, for local feature elements,

[0054]

[External character 13] and, for area information,

[0055]

[External character 14]: numerical data unique to the local feature element or the representative color in each of these arrays. For example, the color of the block at position (k, p) may be

[0056]

[Equation 4] expressed as a two-dimensional vector in this manner, or the original

[0057]

[External character 15] may be used as is. The local feature elements are as described in the first embodiment.

[0058] As a first method of matching the model array data against the array data extracted from the image, that is, of carrying out recognition, matching may first be performed on the basis of area information (color), and then on the basis of local feature elements for the regions (blocks) in which an approximate color correspondence was found. As a second method, the order of matching may be reversed: regions with a similar correspondence are first extracted on the basis of local feature elements, and the correspondence is then narrowed down on the basis of color for each of those regions. As a third method, a position may be found at which the value of the total evaluation function

f = f_F + λ f_A    (1)

obtained by adding the local-feature-element-based matching evaluation function f_F and the area-information-based matching evaluation function f_A with an appropriate weight λ, is equal to or less than a predetermined threshold. In the first and second methods, "performing matching" means, as shown in the first embodiment, finding the (k, p) for which, given the model data

[0059]

[External character 16] and

[0060]

[External character 17] and the data extracted from the actual image

[0061]

[External character 18], an appropriate evaluation function

[0062]

[Equation 5] is equal to or less than a predetermined threshold. Note that ‖x, y‖ is the function presented in the first embodiment.
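The first and third matching methods can be sketched as follows. This is an illustration under assumptions rather than the patented procedure: `f_feature` and `f_area` are hypothetical stand-ins for f_F and f_A (a count of differing feature codes and a summed absolute color difference), since Equation 5 and the function ‖x, y‖ of the first embodiment are not reproduced here. Only the combination f = f_F + λ f_A and the thresholding come from the text.

```python
from typing import Iterator, List, Sequence, Tuple

FeatGrid = List[List[int]]               # per-block local feature element codes
ColorGrid = List[List[Sequence[float]]]  # per-block representative color vectors


def f_feature(model: FeatGrid, image: FeatGrid, k: int, p: int) -> float:
    """f_F stand-in: number of blocks whose feature code differs at offset (k, p)."""
    return float(sum(
        image[k + i][p + j] != code
        for i, row in enumerate(model)
        for j, code in enumerate(row)
    ))


def f_area(model: ColorGrid, image: ColorGrid, k: int, p: int) -> float:
    """f_A stand-in: summed absolute difference between block color vectors."""
    return sum(
        abs(a - b)
        for i, row in enumerate(model)
        for j, vec in enumerate(row)
        for a, b in zip(vec, image[k + i][p + j])
    )


def positions(model: FeatGrid, image: FeatGrid) -> Iterator[Tuple[int, int]]:
    """All offsets (k, p) at which the model fits inside the image array."""
    n, m = len(model), len(model[0])
    for k in range(len(image) - n + 1):
        for p in range(len(image[0]) - m + 1):
            yield k, p


def match_combined(mf: FeatGrid, mc: ColorGrid, imf: FeatGrid, imc: ColorGrid,
                   lam: float, threshold: float) -> List[Tuple[int, int]]:
    """Third method: keep (k, p) where f = f_F + lam * f_A is at or below the threshold."""
    return [
        (k, p) for k, p in positions(mf, imf)
        if f_feature(mf, imf, k, p) + lam * f_area(mc, imc, k, p) <= threshold
    ]


def match_color_first(mf: FeatGrid, mc: ColorGrid, imf: FeatGrid, imc: ColorGrid,
                      color_thr: float, feat_thr: float) -> List[Tuple[int, int]]:
    """First method: screen offsets by color, then confirm survivors by local feature elements."""
    candidates = [(k, p) for k, p in positions(mf, imf) if f_area(mc, imc, k, p) <= color_thr]
    return [(k, p) for k, p in candidates if f_feature(mf, imf, k, p) <= feat_thr]
```

The second method would swap the two stages of `match_color_first`. Screening first with whichever test rejects more offsets keeps the second stage small.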

[0063] By combining local feature element information with area information, recognition can be performed even when a plurality of objects are present in the image and partly overlap one another, without first segmenting the image so that each region contains only one object. FIG. 7 is an explanatory diagram of the three regions that arise when a T-shaped intersection is caused by occlusion. In FIG. 7, suppose that a T-shaped intersection is detected in the image at a size larger than that of the other local feature elements, and that the attributes (for example, color) of the three pieces of area information A71, A72, and A73 adjoining the T-shaped intersection at that size are such that A72 and A73 are nearly equal but differ greatly from A71. This may correspond to a situation in which the object corresponding to A72 and A73 is partly occluded by A71. When the image in a region containing A72 and A73 near the T-shaped intersection is to be recognized, processing is added to S65 for the matching against the model array data: for example, the region that contains A71 and has the same attribute as A71 is excluded from the actual image data; or, when recognition is judged at a threshold level by detecting the minimum of an error, the threshold is raised by a predetermined value; or, when the judgment is made by correlation, the threshold is lowered by a predetermined value. This enables recognition that does not presuppose region segmentation.
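The occlusion test and the exclusion of the occluding region can be sketched as follows. This is a minimal sketch under assumptions: the attribute distance, the tolerances `same_tol` and `diff_tol`, and all function names are hypothetical, and the mismatch count again stands in for the evaluation function of the first embodiment.

```python
from typing import List, Sequence


def color_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Summed absolute difference between two attribute vectors (assumed metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def occlusion_suspected(a71: Sequence[float], a72: Sequence[float], a73: Sequence[float],
                        same_tol: float, diff_tol: float) -> bool:
    """True when A72 and A73 nearly agree while both differ strongly from A71."""
    return (
        color_distance(a72, a73) <= same_tol
        and color_distance(a71, a72) >= diff_tol
        and color_distance(a71, a73) >= diff_tol
    )


def mask_occluder(colors: List[List[Sequence[float]]], occluder: Sequence[float],
                  same_tol: float) -> List[List[bool]]:
    """Flag blocks whose attribute matches the occluder A71 so that matching can skip them."""
    return [[color_distance(c, occluder) <= same_tol for c in row] for row in colors]


def masked_mismatch(model: List[List[int]], image: List[List[int]],
                    k: int, p: int, mask: List[List[bool]]) -> int:
    """Mismatch count at offset (k, p), ignoring image blocks flagged in the mask."""
    return sum(
        1
        for i, row in enumerate(model)
        for j, code in enumerate(row)
        if not mask[k + i][p + j] and image[k + i][p + j] != code
    )
```

The alternative named in the text, raising an error threshold or lowering a correlation threshold by a predetermined value, would leave the error computation unchanged and adjust only the acceptance test.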

[0064] FIG. 8 is a block diagram of the processing unit in the fourth embodiment of the present invention. In FIG. 8, the image input unit S81, the local feature element extraction unit S82, the array data generation unit S83, and the matching processing unit S85 perform the same processes as S11, S12, S13, and S15 of FIG. 1. The intermediate graphic element extraction unit S87 extracts intermediate graphic elements, that is, groupings that form part of the image of an object and are meaningful as a graphic concept. The model graphic element storage unit S88 stores the model graphic elements of the intermediate graphic elements in advance. The model graphic element array data storage unit S84 stores in advance the model graphic element array data to be matched against the array data of S83.

[0065] In the fourth embodiment, after extraction in S82, regions corresponding to, for example, the eyes, nose, mouth, eyebrows, and ears in a face image are extracted in S87 as intermediate graphic elements. The extracted intermediate graphic elements are hierarchically middle-level local feature elements that make up a more complex, higher-level image pattern such as an entire face. The local feature elements extracted in the first to third embodiments can be regarded as lower-level local feature elements, which express an intermediate graphic element through their spatial arrangement on the lattice space.

[0066] After the model graphic elements such as eyes and mouth, stored in advance in S88, are extracted in S87 on the basis of the spatial arrangement of the local feature elements extracted at the lower level in S82, middle-level array data is generated in S83 using numerical data or symbols unique to each intermediate graphic element.

[0067] FIG. 9 is a diagram showing an example of encoding a face image using some of the intermediate graphic elements. In FIG. 9, the intermediate graphic elements extracted from the image in S87 are matched in S85 against the model graphic elements to be recognized retrieved from S84, which enables robust recognition that is little affected even when the image was captured with a plurality of objects overlapping one another. That is, in face image recognition, intermediate graphic elements such as the eyes, nose, and mouth are extracted as preprocessing, and their relative positions are encoded on the lattice space as shown in FIG. 9 (here the eyes are encoded as 9, the nose as 5, and the mouth as 1). Even if one of these elements of the face is missing from the image for the reasons described above, the image can be recognized as a face as long as the spatial arrangement of the other intermediate graphic elements is consistent with the structure of a face image.
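The consistency check on encoded intermediate graphic elements can be sketched as follows. The codes 9, 5, and 1 come from the text; everything else is an assumption made for illustration: the 3 × 3 layout of `FACE_MODEL` (FIG. 9 is not reproduced here), the use of 0 for an empty cell, the names `compare` and `is_face`, and the rule that a face needs at least `min_matched` agreeing elements and no conflicting ones.

```python
from typing import List, Tuple

EYE, NOSE, MOUTH = 9, 5, 1  # codes given in the text
EMPTY = 0                   # assumed code for a cell with no intermediate graphic element

# Hypothetical lattice layout of a frontal face.
FACE_MODEL: List[List[int]] = [
    [EYE, EMPTY, EYE],
    [EMPTY, NOSE, EMPTY],
    [EMPTY, MOUTH, EMPTY],
]


def compare(model: List[List[int]], observed: List[List[int]]) -> Tuple[int, int, int]:
    """Return (matched, missing, conflicts) between a model and an observed lattice."""
    matched = missing = conflicts = 0
    for model_row, observed_row in zip(model, observed):
        for m, o in zip(model_row, observed_row):
            if m == o:
                if m != EMPTY:
                    matched += 1   # the expected element is present
            elif o == EMPTY:
                missing += 1       # the expected element is absent, e.g. occluded
            else:
                conflicts += 1     # an element appears where the model has none or another one
    return matched, missing, conflicts


def is_face(observed: List[List[int]], min_matched: int = 3) -> bool:
    """Accept when nothing contradicts the model and enough elements agree."""
    matched, _missing, conflicts = compare(FACE_MODEL, observed)
    return conflicts == 0 and matched >= min_matched
```

A lattice with one eye missing, `[[9, 0, 0], [0, 5, 0], [0, 1, 0]]`, is accepted, while a lattice with the nose and mouth swapped is rejected because two cells contradict the model.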

[0068] Detecting such consistency in the fourth embodiment is equivalent to detecting a position at which the matching with the model array data on the lattice space at the intermediate graphic element level reaches a local maximum (or minimum) that is at or above (or at or below) a predetermined threshold.
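Detecting a thresholded local maximum in a map of matching scores can be sketched as follows. The score map, the 8-neighbour definition of a local maximum, and the name `local_maxima` are assumptions made for illustration; the text specifies only that the position must be an extremum satisfying a threshold.

```python
from typing import List, Tuple


def local_maxima(scores: List[List[float]], threshold: float) -> List[Tuple[int, int]]:
    """Positions whose score is at or above the threshold and not exceeded by any 8-neighbour."""
    rows, cols = len(scores), len(scores[0])
    peaks = []
    for k in range(rows):
        for p in range(cols):
            s = scores[k][p]
            if s < threshold:
                continue
            neighbours = [
                scores[i][j]
                for i in range(max(0, k - 1), min(rows, k + 2))
                for j in range(max(0, p - 1), min(cols, p + 2))
                if (i, j) != (k, p)
            ]
            if all(s >= v for v in neighbours):
                peaks.append((k, p))
    return peaks
```

When the matching measure is an error rather than a correlation, the same routine can be applied to the negated scores with a negated threshold, which finds local minima at or below the original threshold.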

[0069]

[Effects of the Invention] As described above, the present invention has the following effects.

[0070] Local feature elements in the input image are extracted to generate arrangement information, which is collated with the prestored combination arrangement information of the local feature elements of the object to be recognized in order to determine recognition information, and the region of the input image in which the recognition information exists is determined and extracted. As local feature elements, intersection patterns of edge segments in a plurality of directions, all or part of curves of constant curvature, and edge segments are extracted for each of a plurality of scaling parameters of different sizes. The arrangement information of the local feature elements is represented as a two-dimensional array of discretized numerical elements of the local feature elements. Further, the combination arrangement information of the local feature elements is represented by the pattern of feature elements obtained by rearranging the extracted local feature elements on a lattice space composed of units of a predetermined size and a predetermined shape. This method reduces the memory capacity required for the image data to be recognized and improves the efficiency of the recognition process.

[0071] That is, as shown in the first embodiment, representing an image as a combination of a smaller number of preset local feature elements and a limited number of matrix-like spatial arrangement relationships improves the efficiency of object recognition processing (that is, reduces the computational cost) and enables object recognition that is little affected by changes in the size of the object in the image or by its deformation.

[0072] Further, by extending the arrangement information of the local feature elements to a three-dimensional array of numerical elements, the types of local feature elements extracted do not change sensitively either when the same object is recognized from an arbitrary viewpoint position, in response to changes in the viewpoint position relative to the image, or when objects are recognized under changing illumination conditions at the time of imaging. This enables object recognition that is little affected by deformation of the object in the image.

[0073] That is, as shown in the second embodiment, the three-dimensional spatial arrangement of a limited number of local feature elements, mapped onto three-dimensionally discretized representative points (points on the lattice space), is used as the model data for matching, and the degree of matching with the actual image is measured in the same domain (the lattice space). This greatly improves the processing efficiency and reduces the memory required for recognizing images of a solid object from an arbitrary viewpoint position.

[0074] Further, by extracting region-based information such as the color, local spatial frequency, and intensity of the region near each local feature element and generating arrangement information of the local feature elements and the region-based information, robust recognition can be performed without prior region segmentation, as shown in the third embodiment. This holds even when a plurality of objects are present in the image and strong deformation exists, such as part of an object's original shape being missing or hidden because the objects partly overlap or touch one another.

[0075] As a result, it is possible to output which object to be recognized is located at which position in the image, and thereby to output the information needed to perform the following efficiently and robustly: imaging centered on that position, extraction from the original image of a partial image centered on the target, imaging centered on a specific target, and image editing such as compositing an image containing a specific target with another image.

[0076] In addition, by extracting intermediate graphic elements of the local feature elements and generating arrangement information of the intermediate graphic elements, recognition based on hierarchical feature extraction can be performed, and robust recognition that is little affected is possible even for images captured with a plurality of objects overlapping one another.

[0077] That is, as shown in the fourth embodiment, intermediate graphic elements are extracted as preprocessing for recognition and their relative positions are encoded on the lattice space. Even if one of these elements is missing from the image for the reasons described above, recognition can be performed as long as the spatial arrangement of the other intermediate graphic elements is consistent with the structure of the object to be recognized.

[Brief description of drawings]

FIG. 1 is a configuration diagram of the processing unit in the first embodiment of the present invention.

FIG. 2 is a diagram showing examples of extracted local feature element patterns.

FIG. 3 is a diagram showing an example of encoding a face image using the local feature elements of FIG. 2.

FIG. 4 is a diagram showing an example of a lattice space for displaying an array of encoded local feature elements.

FIG. 5 is a structural diagram of the three-dimensional lattice space in the second embodiment of the present invention.

FIG. 6 is a configuration diagram of the processing unit in the third embodiment of the present invention.

FIG. 7 is an explanatory diagram of the three regions that arise when a T-shaped intersection is caused by occlusion.

FIG. 8 is a configuration diagram of the processing unit in the fourth embodiment of the present invention.

FIG. 9 is a diagram showing an example of encoding a face image using some of the intermediate graphic elements.

[Explanation of symbols]

S11, S61, S81: Image input unit
S12, S62a, S82: Local feature element extraction unit
S13: Extracted feature element array data generation unit
S14: Local feature element model array data storage unit
S15, S65, S85: Matching processing unit
S16: Adaptive image area extraction unit
S62b: Area information extraction unit
S63, S83: Array data generation unit
S64: Model array data storage unit
S84: Model graphic element array data storage unit
S87: Intermediate graphic element extraction unit
S88: Model graphic element storage unit

Continuation of the front page
(51) Int.Cl.6, Identification code, Internal reference number, FI, Technical display location
9061-5L 460 B
9061-5L 460 F

Claims

[Claims]

1. An object recognition method comprising: recording and holding an input image; extracting local feature elements in the image; generating arrangement information of the local feature elements; storing, as stored information, combination arrangement information of the local feature elements of an object to be recognized; collating the generated arrangement information of the local feature elements with the stored information to make a determination; and determining and extracting the region of the image in which the determined recognition information exists.

2. An object recognition method comprising: recording and holding an input image; extracting local feature elements in the image; extracting region-based information such as the color, local spatial frequency, and intensity of the region near each local feature element; generating arrangement information of the local feature elements and the region-based information; storing, as stored information, combination arrangement information of the local feature elements of an object to be recognized; and collating the generated arrangement information with the stored information to make a determination.

3. An object recognition method comprising: recording and holding an input image; extracting local feature elements in the image; storing, as first stored information, model graphic elements of the local feature elements of an object to be recognized; extracting intermediate graphic elements of the local feature elements from the extracted local feature elements and the first stored information; generating arrangement information of the intermediate graphic elements; storing, as second stored information, combination arrangement information of the model graphic elements of the object to be recognized; and collating the generated arrangement information of the intermediate graphic elements with the second stored information to make a determination.

4. The described object recognition method, wherein intersection patterns of edge segments in a plurality of directions, all or part of curves of constant curvature, and edge segments are extracted as the local feature elements.

5. The object recognition method according to claim 1, wherein the arrangement information of the local feature elements is represented as a two-dimensional array or a three-dimensional array of numerical elements, each assigned a numerical value obtained by discretizing a local feature element by a predetermined method.

6. The object recognition method according to claim 1, wherein the combination arrangement information of the local feature elements is represented by a pattern of feature elements obtained by rearranging the extracted local feature elements on a lattice space composed of units of a predetermined size and a predetermined shape.

7. The object recognition method according to claim 1, wherein the process of extracting the local feature elements is performed for each of a plurality of scaling parameters of different sizes.