JP6289027B2 - Person detection device and program

Info

Publication number: JP6289027B2
Application number: JP2013221348A
Authority: JP
Inventors: 高橋 正樹; 苗村 昌秀
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2013-10-24
Filing date: 2013-10-24
Publication date: 2018-03-07
Anticipated expiration: 2033-10-24
Also published as: JP2015082295A

Description

The present invention relates to a person detection device that detects the presence of a person, and to a program executed by the person detection device.

Techniques for detecting a person from a camera image are known. Patent Document 1 discloses a device that detects a person based on information about a face region included in camera video. Patent Document 2 discloses a device that extracts a region in which a person is present from an input image and estimates the presence of the person from the extraction result. Patent Document 3 describes a method of detecting a person using an infrared camera.

Patent Document 1: JP 2005-234947 A
Patent Document 2: JP 2012-108785 A
Patent Document 3: JP 2002-202197 A

Existing techniques cannot detect a person across the wide variety of postures and behaviors that people exhibit in everyday living spaces.

An object of the present invention is to provide a person detection device that can detect a person regardless of the person's posture and the like.
Another object of the present invention is to provide a program executed in the person detection device.

The present invention relates to a person detection device that detects the presence of a person based on a moving image, the device including: an imaging unit that acquires a moving image; a feature point detection unit that detects feature points from each frame of the moving image acquired by the imaging unit; a description unit that acquires feature amounts relating to the trajectories of the feature points detected by the feature point detection unit and describes the feature amounts in a fixed dimension; and a person detection unit that detects the presence of a person based on the feature amounts described in the fixed dimension by the description unit and on feature amounts for which the presence or absence of a person has been learned in advance.

Preferably, the description unit extracts, from the trajectory of a feature point, a motion feature based on the direction and length of the motion vectors of the trajectory, an appearance feature based on a luminance or color histogram around each feature point constituting the trajectory, and a shape feature based on any of the total movement amount of the trajectory, the distance from the point where the trajectory appears to the point where it disappears, the area and aspect ratio of the rectangle enclosing all the feature points, and the time from the appearance of the trajectory to its disappearance, and thereby acquires a feature amount for each of the motion feature, the appearance feature, and the shape feature.

Preferably, the feature point detection unit detects a plurality of feature points from one frame. In this case, the person detection unit detects the presence of a person based on a machine learning framework.

Preferably, the person detection device includes a display unit that performs a predetermined display when the person detection unit detects the presence of a person.

The present invention also relates to a program executed in a person detection device that detects the presence of a person based on a moving image, the program executing: a first step of detecting feature points from each frame of the moving image; a second step of acquiring feature amounts relating to the trajectories of the feature points detected in the first step and describing the feature amounts in a fixed dimension; and a third step of detecting the presence of a person based on the feature amounts described in the fixed dimension in the second step and on feature amounts for which the presence or absence of a person has been learned in advance.

According to the present invention, it is possible to provide a person detection device that can detect a person regardless of the person's posture and the like. According to the present invention, it is also possible to provide a program executed in that person detection device.

FIG. 1 is a block diagram showing the configuration of a person detection device according to one embodiment.
FIG. 2 is a diagram showing an example of the trajectory of a feature point.
FIG. 3 is a diagram for explaining an example of obtaining a motion feature.
FIG. 4 is a diagram for explaining labeling according to a modification.
FIG. 5 is a diagram for explaining an example of obtaining an appearance feature.
FIG. 6 is a diagram for explaining a method of obtaining a fixed dimension.
FIG. 7 is a diagram for explaining detection of the presence of a person.

Hereinafter, an embodiment of the present invention will be described. FIG. 1 is a block diagram showing the configuration of a person detection device 1 according to the embodiment. FIG. 2 is a diagram showing an example of the trajectory of a feature point.

The person detection device 1 detects the presence of a person based on a moving image. That is, the person detection device 1 detects a person from video. As shown in FIG. 1, the person detection device 1 includes an imaging unit 11, a control unit 12, a storage unit 13, and a display unit 14.
The imaging unit 11 acquires a moving image. The imaging unit 11 is a camera capable of acquiring a moving image.

The control unit 12 controls the person detection device 1. The control unit 12 may be a CPU (Central Processing Unit). The control unit 12 includes a feature point detection unit 121, a description unit 122, and a person detection unit 123.

The feature point detection unit 121 detects feature points from each frame of the moving image acquired by the imaging unit 11. The feature point detection unit 121 detects a plurality of feature points from one frame. A feature point is, for example, an edge of a subject recorded in the frame. The feature point detection unit 121 detects feature points using an existing image analysis method. As an example, the feature point detection unit 121 uses the Harris operator or the like to detect feature points in a frame at high speed. As an example, the feature point detection unit 121 can detect about 200 feature points per frame. The method of detecting feature points is not limited to the above.

The description unit 122 acquires feature amounts relating to the trajectories of the feature points detected by the feature point detection unit 121 and describes the feature amounts in a fixed dimension. To do so, the description unit 122 tracks each feature point detected by the feature point detection unit 121 over a plurality of frames. As an example, the description unit 122 matches feature points between adjacent frames using an optical flow calculation method typified by the Lucas-Kanade method, and tracks each feature point from its appearance to its disappearance. This method is commonly used as the Kanade-Lucas-Tomasi (KLT) tracker. When a feature point 21 moves over a plurality of frames, the description unit 122 thereby obtains a trajectory 22 of the feature point (see FIG. 2). The number of trajectories obtained by the description unit 122 changes with the parameters that the feature point detection unit 121 uses to detect feature points. The method of acquiring the trajectories of feature points is not limited to the above.

The description unit 122 extracts a motion feature, an appearance feature, and a shape feature from the trajectory of a feature point, and thereby acquires a feature amount for each of the motion feature, the appearance feature, and the shape feature.

The motion feature is a feature based on the direction and length of the motion vectors of the trajectory. FIG. 3 is a diagram for explaining an example of obtaining a motion feature. The description unit 122 divides the trajectory of the feature point shown in FIG. 3(A) into per-frame motion vectors (see FIG. 3(B)). The description unit 122 then labels each of the divided motion vectors according to its direction and length (see FIG. 3(C)). In the case illustrated in FIG. 3(C), there are eight directions and four lengths (including length 0), so each motion vector receives one of 25 labels (8 directions × 3 nonzero lengths + 1 label for length 0). The description unit 122 creates a frequency histogram based on the labeling (see FIG. 3(D)). In the case shown in FIG. 3, the histogram has 25 dimensions. The description unit 122 uses the histogram over the labels as the feature amount relating to the motion of the trajectory of the feature point. The motion feature is thus converted into a histogram with a fixed number of bins.
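The labeling and histogram steps of FIG. 3 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the sector layout of the direction bins, and the length thresholds in `length_edges` (in pixels) are assumptions made here, since the source does not specify them.

```python
import math

def motion_label(dx, dy, n_dirs=8, length_edges=(2.0, 6.0)):
    """Map one per-frame motion vector to a label.

    Label 0 is reserved for zero-length vectors. The thresholds in
    length_edges are hypothetical; they split nonzero lengths into
    len(length_edges) + 1 classes.
    """
    length = math.hypot(dx, dy)
    if length == 0.0:
        return 0
    # Direction bin: n_dirs equal sectors centred on 0, 360/n_dirs, ... degrees.
    angle = math.atan2(dy, dx) % (2 * math.pi)
    sector = 2 * math.pi / n_dirs
    d = int((angle + sector / 2) // sector) % n_dirs
    # Length bin: number of thresholds the length exceeds.
    l = sum(length > e for e in length_edges)
    return 1 + l * n_dirs + d

def motion_histogram(trajectory, n_dirs=8, length_edges=(2.0, 6.0)):
    """Frequency histogram of motion labels along one trajectory.

    trajectory is a list of (x, y) positions, one per frame.
    """
    n_bins = n_dirs * (len(length_edges) + 1) + 1
    hist = [0] * n_bins
    for (x0, y0), (x1, y1) in zip(trajectory, trajectory[1:]):
        hist[motion_label(x1 - x0, y1 - y0, n_dirs, length_edges)] += 1
    return hist

# A made-up trajectory of five positions gives four motion vectors.
hist = motion_histogram([(0, 0), (1, 0), (1, 0), (1, 5), (11, 5)])
print(len(hist), sum(hist))
```

With the defaults (8 directions, 3 nonzero length classes) the histogram has 25 bins regardless of how many frames the trajectory spans, which is what makes the motion feature fixed in size.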

To keep the motion vector lengths from varying with the size of one frame (the imaging size of the imaging unit 11), the description unit 122 can set the number of divisions using the mean length and variance of the motion vectors within each trajectory as indices. The description unit 122 can set the number of dimensions arbitrarily through the number of divisions of the motion vector direction and length. FIG. 4 is a diagram for explaining labeling according to a modification. For example, labeling can be performed with four motion vector directions (D) and three motion vector lengths (L) (including length 0) (see FIG. 4(A)). In the case shown in FIG. 4(A), a 9-dimensional histogram is obtained. Labeling can also be performed with 16 motion vector directions (D) and five motion vector lengths (L) (including length 0) (see FIG. 4(B)). In the case shown in FIG. 4(B), a 65-dimensional histogram is obtained. Furthermore, by combining several histograms created with different divisions into one feature amount, the description unit 122 can obtain a feature amount that takes into account both the global movement and the minute movement of feature points.
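The dimension counts quoted above follow from one label per combination of direction and nonzero length, plus a single label for length 0. With D the number of directions and L the number of length classes including length 0 (as in FIG. 4), and N a symbol introduced here for the histogram dimension, the relationship can be written as:

```latex
N = D\,(L - 1) + 1
\qquad\Rightarrow\qquad
8 \cdot 3 + 1 = 25,\quad 4 \cdot 2 + 1 = 9,\quad 16 \cdot 4 + 1 = 65
```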

The appearance feature is a feature based on a luminance or color histogram around each feature point constituting the trajectory. The description unit 122 creates a histogram of the luminance or color of each pixel in the neighborhood of each feature point, for example a 16 × 16 pixel region. The luminance or color space may be divided in any manner. FIG. 5 is a diagram for explaining an example of obtaining an appearance feature. As illustrated in FIG. 5, the description unit 122 assigns each color in the (square) region around the feature point to one of 27 regions into which the RGB color space is divided. The 27 regions result from three divisions each for R, G, and B. The description unit 122 creates a frequency histogram from the assignments. In the case illustrated in FIG. 5, the histogram has 27 dimensions. The description unit 122 uses the histogram as the feature amount relating to the appearance of the trajectory of the feature point. The appearance feature is thus converted into a histogram with a fixed number of bins.

Instead of a histogram of color information, the description unit 122 may focus on luminance gradient information and create a luminance gradient feature histogram such as a Local Binary Pattern histogram. One appearance histogram is created for each frame of the trajectory; averaging these histograms per trajectory yields a feature amount in which the influence of noise in any particular frame is suppressed. The description unit 122 may also average the histograms of only a few points on the trajectory to reduce the amount of computation.
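The color variant of the appearance feature (27 RGB bins over a 16 × 16 patch, averaged along the trajectory) can be sketched as follows. This is an illustration under stated assumptions: the image is represented as a plain list of rows of (r, g, b) tuples with 8-bit channels, each channel is split into equal-width ranges, and the function names are hypothetical. The source leaves the division of the color space open.

```python
def rgb_bin(r, g, b, levels=3):
    """Index of the RGB cell containing (r, g, b); levels**3 cells in total."""
    step = 256 / levels  # assumes 8-bit channels split into equal ranges
    ri, gi, bi = (min(int(c // step), levels - 1) for c in (r, g, b))
    return (ri * levels + gi) * levels + bi

def patch_histogram(image, cx, cy, size=16, levels=3):
    """Color histogram of the size x size patch around (cx, cy).

    The patch is clipped at the image border.
    """
    height, width = len(image), len(image[0])
    half = size // 2
    hist = [0] * (levels ** 3)
    for y in range(max(0, cy - half), min(height, cy + half)):
        for x in range(max(0, cx - half), min(width, cx + half)):
            hist[rgb_bin(*image[y][x], levels)] += 1
    return hist

def mean_histogram(hists):
    """Bin-wise average of the per-frame histograms of one trajectory."""
    n = len(hists)
    return [sum(column) / n for column in zip(*hists)]

# Two made-up 20 x 20 frames, one uniformly red and one uniformly blue.
red = [[(255, 0, 0)] * 20 for _ in range(20)]
blue = [[(0, 0, 255)] * 20 for _ in range(20)]
feature = mean_histogram([patch_histogram(red, 10, 10),
                          patch_histogram(blue, 10, 10)])
print(len(feature))
```

Averaging over a subset of frames, as the text suggests for saving computation, amounts to passing fewer histograms to `mean_histogram`.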

The shape feature is a feature based on any of the total movement amount of the trajectory, the distance from the point where the trajectory appears to the point where it disappears, the area and aspect ratio of the rectangle enclosing all the feature points, and the time from the appearance of the trajectory to its disappearance. The total movement amount of a trajectory is obtained as the sum of the lengths of its motion vectors. The distance from the appearance point to the disappearance point is obtained as the distance from the position where the feature point was first detected to the position where it was last detected.

The total movement amount and the distance described above are affected by the size of one frame (the imaging size of the imaging unit 11). They are therefore normalized by the frame size (the image width of the camera video), as shown in equation (1).
$$L_i = \frac{l_i}{W} \qquad (1)$$
Here, l_i and L_i are the movement amounts before and after normalization, respectively, and W is the image width of the camera video. With this normalization, the description unit 122 can calculate the shape feature without being affected by the imaging size of the imaging unit 11 in use.
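The shape quantities listed above, together with the normalization of equation (1), can be sketched as follows. This is only an illustration: the function and key names are hypothetical, the aspect ratio is taken as width over height, the duration is counted in frames, and the rectangle area is left in pixels because the source describes normalization only for the total movement amount and the distance.

```python
import math

def shape_features(trajectory, image_width):
    """Shape quantities of one trajectory given as a list of (x, y) per frame."""
    # Total movement: sum of the per-frame motion vector lengths.
    total = sum(math.dist(p, q) for p, q in zip(trajectory, trajectory[1:]))
    # Distance from the first detected position to the last.
    start_to_end = math.dist(trajectory[0], trajectory[-1])
    xs = [p[0] for p in trajectory]
    ys = [p[1] for p in trajectory]
    box_w, box_h = max(xs) - min(xs), max(ys) - min(ys)
    return {
        "total_movement": total / image_width,       # equation (1)
        "start_to_end": start_to_end / image_width,  # equation (1)
        "bbox_area": box_w * box_h,                   # pixels squared (assumption)
        "bbox_aspect": box_w / box_h if box_h else 0.0,
        "duration_frames": len(trajectory) - 1,
    }

# A made-up three-frame trajectory in a 100-pixel-wide image.
print(shape_features([(0, 0), (3, 4), (3, 8)], image_width=100))
```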

FIG. 6 is a diagram for explaining a method of obtaining a fixed dimension. After acquiring the feature amounts (trajectory features) as described above, the description unit 122 projects the feature amounts obtained within a fixed period of time into a feature space. In the feature space, an arbitrary number of codeword centers effective for forming the distribution are defined. The description unit 122 assigns each feature amount projected into the feature space to its nearest codeword, counts the number of feature amounts per codeword, and creates a frequency histogram. This frequency histogram serves as the feature representation and is a fixed-dimension feature amount. The time length of the trajectory of each feature point is not fixed, whereas the Bag-of-features framework used by the person detection unit 123, described later, requires every feature amount to have the same fixed dimension.
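The codeword assignment and counting described above can be sketched as follows. This is a minimal sketch assuming Euclidean distance and a codebook that has already been chosen; the function names and the example values are hypothetical, and the source does not say how the codeword centers are obtained.

```python
def nearest_codeword(feature, codewords):
    """Index of the codeword closest to feature (squared Euclidean distance)."""
    return min(
        range(len(codewords)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(feature, codewords[k])),
    )

def bag_of_features(features, codewords):
    """Count how many trajectory features fall on each codeword.

    The result always has len(codewords) entries, however many
    trajectories were observed in the time window.
    """
    hist = [0] * len(codewords)
    for feature in features:
        hist[nearest_codeword(feature, codewords)] += 1
    return hist

# Made-up 2-D codebook and three trajectory features.
codewords = [(0.0, 0.0), (10.0, 10.0)]
print(bag_of_features([(1.0, 1.0), (9.0, 8.0), (0.0, 2.0)], codewords))
```

The output length depends only on the size of the codebook, so a time window with 3 trajectories and one with 300 produce feature vectors of the same dimension.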

FIG. 7 is a diagram for explaining detection of the presence of a person. The person detection unit 123 detects the presence of a person based on the feature amounts described in the fixed dimension by the description unit 122 and on feature amounts for which the presence or absence of a person has been learned in advance. The person detection unit 123 detects the presence of a person based on a machine learning framework. In the person detection unit 123, a classifier is created within a supervised machine learning framework such as a Support Vector Machine (SVM) or AdaBoost. In the learning phase, the classifier is trained by giving each feature representation the correct label "present" or "absent". The feature amounts that serve as the criteria for detection in the person detection unit 123 are stored in the storage unit 13. In the operation phase, the person detection unit 123 determines the presence or absence of a person based on the input fixed-dimension feature amount and on what was learned in the learning phase.
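The learning and operation phases can be illustrated with a small AdaBoost classifier built from single-feature threshold rules (decision stumps). AdaBoost is one of the frameworks named above, but everything else here is an assumption for illustration: the choice of stumps as weak learners, the function names, the number of rounds, and the toy training data. Labels are +1 for "present" and -1 for "absent".

```python
import math

def stump_predict(x, j, thr, sign):
    """Weak rule: predict sign if feature j exceeds thr, otherwise -sign."""
    return sign if x[j] > thr else -sign

def train_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for sign in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(xi, j, thr, sign) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def train_adaboost(X, y, rounds=5):
    """Learning phase: returns a list of (alpha, j, thr, sign) weak rules."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, j, thr, sign = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by 0
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, thr, sign))
        # Re-weight: misclassified samples gain weight, then normalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, j, thr, sign))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Operation phase: +1 means person present, -1 means absent."""
    score = sum(a * stump_predict(x, j, thr, sign) for a, j, thr, sign in model)
    return 1 if score > 0 else -1

# Hypothetical 2-bin feature histograms with their correct labels.
X = [[5, 0], [4, 1], [0, 5], [1, 4]]
y = [1, 1, -1, -1]
model = train_adaboost(X, y)
print(predict(model, [6, 0]), predict(model, [0, 6]))
```

In the device described here, the learned rules would play the role of the detection criteria held in the storage unit 13, and `predict` would be applied to each new fixed-dimension histogram.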

The display unit 14 preferably performs a predetermined display when the person detection unit 123 detects the presence of a person. As an example, the display unit 14 displays text or an image indicating that a person is present.

[Comparative examples]
Next, comparative examples will be described. The first comparative example is a detection device that uses a camera and a depth sensor (first detection device). The first detection device detects a person by detecting the face of the subject with the camera and the depth sensor. The second comparative example is a detection device that uses a camera (second detection device). The second detection device detects a person by detecting the face of the subject with the camera, using a general face detection method. Table 1 compares the results of detecting a person with the first comparative example, the second comparative example, and the person detection device 1 of the present embodiment.

In the first and second comparative examples, the precision is 100%. That is, whenever these examples detect a face, they correctly determine that a person is present. However, their recall is lower than that of the present embodiment, and many detections are missed. Because the first and second comparative examples detect a person based on face detection, they often miss a person who is facing sideways or backward, and high recall was not obtained in an environment where people act at various angles. In the present embodiment, by contrast, both recall and precision are high. Furthermore, the F-measure, which is the harmonic mean of precision and recall, is higher in the present embodiment than in the first and second comparative examples. Unlike the comparative examples, the present embodiment does not require face detection and can therefore detect a person whatever direction the subject faces. Because the present embodiment learns in units of the trajectories along which feature points move, it can handle a variety of appearances. In addition, because minute movements of feature points contribute to detection, the present embodiment can detect a person with high accuracy.
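For reference, the three evaluation measures used in this comparison can be computed from counts of true positives, false positives, and false negatives as sketched below. The function name is hypothetical and the counts in the example are invented for illustration; they are not the values of Table 1.

```python
def precision_recall_f(tp, fp, fn):
    """Precision, recall, and F-measure (harmonic mean of the two)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Invented counts: no false alarms, but 60 of 100 people missed.
print(precision_recall_f(tp=40, fp=0, fn=60))
```

The example shows why perfect precision alone is not sufficient: a detector that never raises a false alarm but misses most people still scores a low F-measure.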

The person detection device 1 described above includes: the imaging unit 11, which acquires a moving image; the feature point detection unit 121, which detects feature points from each frame of the moving image acquired by the imaging unit 11; the description unit 122, which acquires feature amounts relating to the trajectories of the feature points detected by the feature point detection unit 121 and describes the feature amounts in a fixed dimension; and the person detection unit 123, which detects the presence of a person based on the feature amounts described in the fixed dimension by the description unit 122 and on feature amounts for which the presence or absence of a person has been learned in advance. Because the person detection device 1 thus determines the presence or absence of a person based on the trajectories of feature points, it can detect a person regardless of the person's posture and the like. That is, the person detection device 1 can detect a person positioned at any angle relative to the imaging unit 11. Because the person detection device 1 uses feature points, it can detect a person with high accuracy even when the person moves only slightly. In addition, the person detection device 1 can detect a person using an inexpensive, widely available camera device.

Such a person detection device 1 can be used, for example, in the following applications. The person detection device 1 can be used for detecting a person in images captured by a surveillance camera, for detecting a person during Visual Display Terminals (VDT) work in the workplace, and for detecting a person when analyzing television viewing at home.
Through learning, the person detection device 1 can detect not only a person but also a person's actions. For example, the person detection device 1 can detect that a person is eating or reading a newspaper. The person detection device 1 can also detect, for example, that a person is playing baseball (pitching) or playing basketball (shooting).
Through learning, the person detection device 1 can also detect animals as well as people.

The present invention may also be configured as a program. The program is executed in the person detection device 1 described above, that is, in a computer. The program executes a first step, a second step, and a third step in order. The first step detects feature points from each frame of a moving image. The second step acquires feature amounts relating to the trajectories of the feature points detected in the first step and describes the feature amounts in a fixed dimension. The third step detects the presence of a person based on the feature amounts described in the fixed dimension in the second step and on feature amounts for which the presence or absence of a person has been learned in advance. The program may be recorded on a recording medium readable by the person detection device 1, that is, by a computer. The recording medium is, for example, a flexible disk, an optical disk, a memory, or a hard disk.

DESCRIPTION OF SYMBOLS
1 Person detection device
11 Imaging unit
14 Display unit
121 Feature point detection unit
122 Description unit
123 Person detection unit

Claims

A person detection device that detects the presence or absence of a person based on an arbitrary moving image, comprising:
an imaging unit that acquires a moving image;
a feature point detection unit that detects feature points from each frame of the moving image acquired by the imaging unit;
a description unit that acquires the trajectories of the feature points detected by the feature point detection unit and, based on the trajectories, describes in a fixed dimension the feature amounts of a motion feature, which is a feature of the trajectory along the time axis, an appearance feature, which is an image feature of the surroundings of the trajectory, and a shape feature, which is a feature of the envelope of the trajectory; and
a person detection unit that detects the presence or absence of a person by taking as input the feature amounts of the motion feature, the appearance feature, and the shape feature described in the fixed dimension by the description unit,
wherein the person detection unit includes a classifier created by supervised machine learning that uses, as input data, feature amounts of the motion feature, the appearance feature, and the shape feature created in advance from moving images in which a person is present or moving images in which no person is present, and uses, as correct data, the presence or absence of a person in those moving images.

The person detection device according to claim 1, wherein the description unit extracts, from the trajectory of a feature point,
a motion feature based on the direction and length of the motion vectors of the trajectory,
an appearance feature based on a luminance or color histogram around each feature point constituting the trajectory, and
a shape feature based on any of the total movement amount of the trajectory, the distance from the point where the trajectory appears to the point where it disappears, the area and aspect ratio of the rectangle enclosing all the feature points, and the time from the appearance of the trajectory to its disappearance,
and thereby acquires a feature amount for each of the motion feature, the appearance feature, and the shape feature.

The person detection device according to claim 1, wherein the feature point detection unit detects a plurality of feature points from one frame, and
the person detection unit detects the presence of a person based on a machine learning framework.

The person detection device according to claim 1, further comprising a display unit that performs a predetermined display when the presence of a person is detected by the person detection unit.

A program for causing a computer to function as each functional unit of the person detection device according to any one of claims 1 to 4.