JP5067224B2 - Object detection apparatus, object detection method, object detection program, and printing apparatus

Info

Publication number: JP5067224B2
Application number: JP2008076475A
Authority: JP
Inventors: 宏幸辻
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 2008-03-24
Filing date: 2008-03-24
Publication date: 2012-11-07
Anticipated expiration: 2028-03-24
Also published as: JP2009230556A

Description

The present invention relates to an object detection apparatus, an object detection method, an object detection program, and a printing apparatus.

Techniques for detecting a target image (object) in an input image are known.
One known face area extraction method determines skin color pixels from image data, sets the value of each pixel determined to be skin color to 255 and all other pixel values to 0, obtains the projection distribution of the detected skin color pixels, searches the shape of the projection distribution for a parabola using a genetic algorithm, extracts face area candidates from the positions of the parabolas found, and determines whether each extracted candidate is a face area (see Patent Document 1).
Patent Document 1: JP 2000-48184 A

When an object such as a face image is to be detected from an input image, a lighter processing load and faster processing are required in addition to detection accuracy. In Patent Document 1, however, a very large number of steps must be performed: calculating the projection distribution of the skin color pixels, searching for a parabola based on the shape of the projection distribution, extracting face area candidates from the parabola positions, and determining whether each candidate is a face area. The method is therefore quite inadequate in terms of reducing and speeding up processing.
In addition, correction processing or the like may be applied to the input image after the object is detected. If information useful for that correction processing could be acquired in the course of detecting the object, the burden of the correction processing would be greatly reduced, which would be highly desirable.

The present invention has been made in view of the above problems. Its object is to provide an object detection apparatus, an object detection method, an object detection program, and a printing apparatus that, when detecting an object from an input image, reduce and speed up processing beyond the conventional art while ensuring highly accurate detection, and that also help reduce the burden of predetermined processing performed after the object is detected.

To achieve the above object, the present invention provides an object detection apparatus that detects a predetermined object from an input image, comprising: a conversion unit that replaces, with a predetermined representative value, the pixel values of image areas in the input image other than the image area belonging to the color gamut corresponding to the object; and a determination unit that sets a detection window on the input image on which the replacement has been performed, obtains the variation of the pixel values within the detection window, and, when the variation is equal to or greater than a predetermined value, determines the presence or absence of the object in the image within the detection window. According to the present invention, before a detection window is set on the input image, the pixel values of image areas other than the image area belonging to the color gamut corresponding to the object are replaced with the representative value. Consequently, when the detection window is set in an area where many replaced pixels are gathered, the variation tends to be small (smaller than the predetermined value). As a result, an area where many replaced pixels are gathered, that is, an area in which the object is presumed not to exist, is excluded from the object presence determination, and the processing amount and processing time can be reduced without lowering the accuracy of detecting the object from the input image.

The conversion unit may replace the pixel values of image areas that lie outside the image area belonging to the color gamut corresponding to the object and that differ from one another in hue with representative values, each corresponding to the representative color of the respective image area. With this configuration, color information for each area of the input image can be acquired while preserving the effect that the variation calculated when the detection window is set in an area where many replaced pixels are gathered tends to be small. Such color information for each area is used as useful information in predetermined processing after object detection, for example, correction processing according to the scene of the input image.

The conversion unit may analyze the input image and change the color gamut corresponding to the object based on the analysis result. With this configuration, the image area to be replaced can be narrowed or widened according to the content of the input image. More specifically, the conversion unit may generate a histogram of the pixel values of the input image and, when it determines from the analysis of the histogram that the input image is a color cast image or a backlit image, select, as the color gamut corresponding to the object, the second color gamut out of a first color gamut and a second color gamut wider than the first color gamut. With this configuration, when the input image is a so-called color cast image or a backlit image, the image area in which the replacement is performed is narrowed. Object detection accuracy is therefore ensured even if the input image is a color cast image, a backlit image, or the like.

The determination unit may repeatedly set detection windows on the input image while changing at least one of the position and the size of the detection window, and may set subsequent detection windows so as to avoid any area for which the variation obtained by setting a detection window was less than the predetermined value. With this configuration, an area that was excluded from the object presence determination when a detection window was once set there is thereafter excluded from detection window setting. The processing amount and processing time are therefore effectively reduced even when precise object detection is attempted over the entire input image.
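The avoidance of previously rejected areas can be illustrated with a small sketch. The text above does not say how an area is matched against later windows, so the containment rule used here, the `(left, top, size)` window tuples, and the names `contains`, `scan`, and `variation_of` are all assumptions made for illustration.

```python
def contains(outer, inner):
    """True if square `inner` lies entirely inside square `outer`.

    Both squares are (left, top, size) tuples (a hypothetical representation).
    """
    o_left, o_top, o_size = outer
    i_left, i_top, i_size = inner
    return (o_left <= i_left and o_top <= i_top
            and i_left + i_size <= o_left + o_size
            and i_top + i_size <= o_top + o_size)


def scan(windows, variation_of, threshold):
    """Yield the windows that should go on to the object presence determination.

    A window whose variation is below `threshold` is remembered, and any later
    window lying inside it is skipped without computing its variation.
    """
    rejected = []
    for win in windows:
        if any(contains(r, win) for r in rejected):
            continue  # avoided: inside an area already found to be low-variation
        if variation_of(win) < threshold:
            rejected.append(win)
            continue
        yield win
```

With this rule, a large window rejected early saves the variation calculation for every smaller window that falls inside it.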

The determination unit may determine the presence or absence of the object by using a neural network that receives information on the image within the detection window and outputs information indicating the presence or absence of the object. With this configuration, the use of a neural network makes it possible to determine the presence or absence of the object accurately.

Besides the invention of the object detection apparatus described above, the technical idea of the present invention can also be understood as the invention of an object detection method comprising the processing steps performed by the units of the object detection apparatus, and as the invention of an object detection program that causes a computer to execute functions corresponding to the units of the object detection apparatus. It is also possible to conceive of a printing apparatus that detects a predetermined object from an input image and executes printing based on the input image, comprising: a conversion unit that replaces, with a predetermined representative value, the pixel values of image areas in the input image other than the image area belonging to the color gamut corresponding to the object; a determination unit that sets a detection window on the input image on which the replacement has been performed, obtains the variation of the pixel values within the detection window, and, when the variation is equal to or greater than a predetermined value, determines the presence or absence of the object in the image within the detection window; and a print control unit that corrects at least part of the input image according to correction information determined based on the image within a detection window in which the determination unit has determined that the object is present, and performs printing based on the corrected input image.

Embodiments of the present invention will be described in the following order.
1. General printer configuration:
2. Processing by printer:
2-1. Preprocessing:
2-2. Determining whether object presence determination is necessary:
2-3. From object presence determination to printing:
3. Variations:

1. General printer configuration:
FIG. 1 schematically shows the configuration of a printer 10, which is an example of the object detection apparatus and the printing apparatus of the present invention. The printer 10 is a color inkjet printer that supports so-called direct printing, in which an image is printed based on image data acquired from a recording medium (for example, a memory card MC). The printer 10 includes a CPU 11 that controls each unit of the printer 10, an internal memory 12 composed of, for example, ROM and RAM, an operation unit 14 composed of buttons and a touch panel, a display unit 15 composed of a liquid crystal display, a printer engine 16, a card interface (card I/F) 17, and an I/F unit 13 for exchanging information with external devices such as a PC, a server, or a digital still camera. The components of the printer 10 are connected to one another via a bus.

The printer engine 16 is a printing mechanism that performs printing based on print data. The card I/F 17 is an interface for exchanging data with the memory card MC inserted into a card slot 172. Image data is stored in the memory card MC, and the printer 10 can acquire the image data stored in the memory card MC via the card I/F 17. Various media other than the memory card MC can be used as recording media for providing image data. The printer 10 can of course also receive image data from the external devices connected via the I/F unit 13, not only from recording media. The printer 10 may be a consumer printing apparatus or a commercial printing apparatus for DPE (a so-called minilab machine). The operation unit 14 and the display unit 15 may be an input device (such as a mouse or a keyboard) and a display that are separate from the main body of the printer 10. The printer 10 can also receive print data from a PC, a server, or the like connected via the I/F unit 13.

The internal memory 12 stores an object detection unit 20, an image correction unit 30, a display processing unit 40, and a print processing unit 50. The object detection unit 20 and the image correction unit 30 are computer programs that, under a predetermined operating system, execute the preprocessing, object detection processing, image correction processing, and other processing described later. The display processing unit 40 is a display driver that controls the display unit 15 and causes it to display processing menus and messages. The print processing unit 50 is a computer program that generates print data from image data and controls the printer engine 16 to print an image based on the print data. The CPU 11 realizes the functions of these units by reading these programs from the internal memory 12 and executing them.

The object detection unit 20 includes, as program modules, a conversion unit 21, a detection window setting unit 22, a necessity determination unit 23, and a detection execution unit 24. The image correction unit 30 includes, as program modules, a correction information determination unit 31 and a correction execution unit 32. The detection window setting unit 22, the necessity determination unit 23, and the detection execution unit 24 correspond to the determination unit recited in the claims. The image correction unit 30 and the print processing unit 50 correspond to the print control unit recited in the claims. The functions of these units will be described later. The internal memory 12 also stores various data and programs such as color gamut definition information 14b and a neural network NN. The printer 10 may be a so-called multifunction machine having various functions besides printing, such as a copy function and a scanner function.

2. Processing by printer:
2-1. Preprocessing:
FIG. 2 is a flowchart showing the processing executed by the printer 10 in this embodiment. In step S100 (hereinafter the word "step" is omitted), the object detection unit 20 acquires image data D representing the image to be processed (the input image) from a predetermined recording medium such as the memory card MC. That is, the object detection unit 20 acquires the input image. If the printer 10 has a hard disk drive (HDD), the object detection unit 20 can of course acquire image data D stored on the HDD, and, as described above, it can also acquire image data D from the external devices connected via the I/F unit 13. In other words, when the user operates the operation unit 14 while referring to a user interface (UI) screen displayed on the display unit 15, selects image data D as the input image, and issues an instruction to print the selected image data D, the object detection unit 20 acquires the selected image data D from the recording medium or the like.

The image data D is bitmap data composed of a plurality of pixels, and each pixel is expressed by a combination of gradations of the R, G, and B channels (for example, 256 gradations from 0 to 255). The image data D may be compressed when recorded on the recording medium or the like, and the color of each pixel may be expressed in another color space. In these cases, the object detection unit 20 decompresses the image data D or converts its color space to obtain the image data D as RGB bitmap data.

In S200, the object detection unit 20 reduces the image data D. Performing the preprocessing and object detection processing described later on the image data D at its original image size imposes a heavy processing load. The object detection unit 20 therefore reduces the image size, for example by reducing the number of pixels of the image data D, and obtains the reduced image data. For example, the object detection unit 20 obtains image data DR by reducing the image data D to QVGA (Quarter Video Graphics Array) size (320 × 240 pixels). In this embodiment, the image data DR is also referred to as the input image where appropriate.
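As an illustration of the reduction in S200, the following sketch shrinks an image by nearest-neighbor sampling. The description only says that the number of pixels is reduced, so the sampling method and the function name `shrink_nearest` are assumptions; a real implementation might average neighboring pixels instead.

```python
def shrink_nearest(pixels, new_w, new_h):
    """Reduce a 2D list of pixel values to new_w x new_h by nearest-neighbor sampling.

    `pixels` is a list of rows; each entry may be a gray value or an (R, G, B) tuple.
    """
    old_h = len(pixels)
    old_w = len(pixels[0])
    return [
        [pixels[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


# Usage: reduce the image data D to QVGA size, as in the example of S200.
# image_dr = shrink_nearest(image_d, 320, 240)
```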

In S300, the conversion unit 21 of the object detection unit 20 replaces, with predetermined representative values, the pixel values of the image areas of the image data DR other than the image area belonging to the color gamut corresponding to the predetermined object. In this embodiment, the object is assumed to be a human face image, and the color gamut corresponding to a human face image means the skin color gamut. However, the objects detectable with the configuration of the present invention are not limited to human face images; various targets such as artifacts, living things, natural objects, and landscapes can be detected as objects.

FIG. 3 is a flowchart showing the details of S300. In S310, the conversion unit 21 selects one of the pixels constituting the image data DR.
In S320, the conversion unit 21 determines whether the color of the pixel selected in the most recent S310 belongs to the skin color gamut by referring to the color gamut definition information 14b stored in the internal memory 12. The color gamut definition information 14b defines the skin color gamut in a predetermined color system. Various color systems can be used to define the skin color gamut, such as the L*a*b* color system specified by the International Commission on Illumination (CIE) (hereinafter the "*" is omitted), the XYZ color system specified by the CIE, the RGB color system, and the HSV color system.

FIG. 4 shows, among other things, an example of the skin color gamut A1 defined by the color gamut definition information 14b. In FIG. 4, a circular skin color gamut A1 is shown in the ab plane of the Lab color system. The color gamut definition information 14b defines the skin color gamut A1 by, for example, the position (center position) P1 of the area and the radius r of the area. However, the skin color gamut A1 need not be circular and may be a three-dimensional solid. The color gamut definition information 14b need only specify the range of colors that are likely to be skin color in some color system. In S320, the conversion unit 21 converts the RGB data of the pixel selected in the most recent S310 into data (Lab data) of the color system in which the color gamut definition information 14b defines the skin color gamut A1 (the Lab color system). If the converted Lab data (its a value and b value) falls within the skin color gamut A1, the process skips S330 and proceeds to S340; if the converted Lab data does not fall within the skin color gamut A1, the process proceeds to S330. The conversion from RGB data to Lab data can be performed using, for example, a predetermined color conversion profile that converts from the RGB color system to the Lab color system. Such a profile may also be stored in the internal memory 12. In this embodiment, a pixel determined in S320 to belong to the skin color gamut is called a skin color pixel, and a pixel determined in S320 not to belong to the skin color gamut is called a non-skin color pixel.
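The membership test of S320 can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: it assumes sRGB input with a D65 white point in place of the color conversion profile mentioned above, and `SKIN_CENTER` and `SKIN_RADIUS` are hypothetical placeholder values for the center position P1 and the radius r.

```python
from math import hypot

# Hypothetical values for the center P1 and radius r of skin color gamut A1 (ab plane).
SKIN_CENTER = (18.0, 18.0)
SKIN_RADIUS = 12.0


def _linear(c):
    """Undo the sRGB transfer curve for one 0-255 channel (assumes sRGB input)."""
    c = c / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t):
    """Nonlinearity used in the CIE L*a*b* definition."""
    delta = 6.0 / 29.0
    return t ** (1.0 / 3.0) if t > delta ** 3 else t / (3 * delta ** 2) + 4.0 / 29.0


def rgb_to_lab(r, g, b):
    """Convert 8-bit sRGB to (L, a, b) using the D65 reference white."""
    rl, gl, bl = _linear(r), _linear(g), _linear(b)
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl
    fx, fy, fz = _f(x / 0.95047), _f(y / 1.0), _f(z / 1.08883)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)


def is_skin_pixel(r, g, b):
    """S320: True if the pixel's (a, b) lies inside the circular gamut A1."""
    _, a, b_star = rgb_to_lab(r, g, b)
    return hypot(a - SKIN_CENTER[0], b_star - SKIN_CENTER[1]) <= SKIN_RADIUS
```

Only the a and b values enter the test, which matches the circular gamut drawn in the ab plane; a three-dimensional gamut would also need a condition on L.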

In S330, the conversion unit 21 replaces (converts) the pixel value (RGB data) of the pixel determined to be a non-skin color pixel in the most recent S320 with a predetermined representative value (representative color). An example of a representative value is the value representing black (R = G = B = 0). In other words, the conversion unit 21 fills an area composed of non-skin color pixels with the color of the representative value. There may be only one representative value as above, but in this embodiment there are a plurality of representative values. As an example, FIG. 4 shows, in the ab plane, color gamuts other than the skin color gamut A1: a greenish color gamut (green gamut) A2, a color gamut resembling the color of the sky, mainly blue sky (sky gamut) A3, and a magenta-like color gamut (magenta gamut) A4. The center positions of the green gamut A2, the sky gamut A3, and the magenta gamut A4 are taken as the colors of the representative values P2, P3, and P4 corresponding to the gamuts A2, A3, and A4, respectively. That is, the color gamut definition information 14b also defines in advance the ranges of the green gamut A2, the sky gamut A3, and the magenta gamut A4, as well as the colors (Lab data and RGB data) of the representative values P2, P3, and P4 for the gamuts A2, A3, and A4. The types and number of representative values defined in advance in this embodiment are of course not limited to those shown in FIG. 4.

In S330, when the color of a non-skin color pixel belongs to the green gamut A2 in the ab plane, the conversion unit 21 replaces the pixel value (RGB data) of that pixel with the RGB data of the representative value P2 of the green gamut A2 (for example, R = 0, G = 255, B = 0). Similarly, when the color of a non-skin color pixel belongs to the sky gamut A3, the conversion unit 21 replaces its pixel value with the RGB data of the representative value P3 of the sky gamut A3, and when the color belongs to the magenta gamut A4, it replaces the pixel value with the RGB data of the representative value P4 of the magenta gamut A4. When the color of a non-skin color pixel belongs neither to the skin color gamut nor to any of the gamuts corresponding to the representative values defined above, its pixel value is replaced with the RGB data of another predetermined representative value (for example, the value representing black). Thus, in this embodiment, a plurality of representative colors (representative values) with different hues are set in advance, and the conversion unit 21 replaces the pixel value of each non-skin color pixel with whichever representative value is close to the color of that pixel.

In S340, the conversion unit 21 determines whether every pixel constituting the image data DR has been selected once in S310. If unselected pixels remain, the process returns to S310 and a new pixel is selected; if no unselected pixels remain, the processing of S300 ends. As a result, in the image data DR, the image areas that lie outside the image area belonging to the skin color gamut and that differ from one another in hue are each filled with the corresponding representative color. When replacing the pixel values of non-skin color pixels with representative values, the conversion unit 21 counts the number of replaced pixels for each type of representative value (in the above example, the representative values P2, P3, P4, and so on) and records these counts in a predetermined area of the internal memory 12. The conversion unit 21 also counts the number of skin color pixels (the number of "Yes" determinations in S320) and records this count in a predetermined area of the internal memory 12.
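The loop of S310 to S340, including the per-gamut pixel counts, might look like the following sketch. Circular gamuts are assumed for A2 to A4 as well as for A1, and all centers, radii, and the representative colors for the sky and magenta gamuts are hypothetical values; only the green representative (R = 0, G = 255, B = 0) and the black fallback come from the examples in the description. Each pixel is passed in together with its already computed (a, b) values to keep the sketch short.

```python
from collections import Counter
from math import hypot

# (center a, center b, radius) of skin color gamut A1 (hypothetical values).
SKIN = (18.0, 18.0, 12.0)

# name -> (center a, center b, radius, representative RGB); values are hypothetical
# except the green representative, which follows the example in the description.
GAMUTS = {
    "green": (-50.0, 45.0, 25.0, (0, 255, 0)),
    "sky": (-10.0, -40.0, 25.0, (0, 128, 255)),
    "magenta": (70.0, -30.0, 25.0, (255, 0, 255)),
}
FALLBACK_RGB = (0, 0, 0)  # black, for colors outside every defined gamut


def inside(a, b, center_a, center_b, radius):
    """True if (a, b) lies within the circle of the given center and radius."""
    return hypot(a - center_a, b - center_b) <= radius


def replace_non_skin(pixels):
    """pixels: iterable of (rgb, (a, b)). Returns (list of new rgb, Counter)."""
    out = []
    counts = Counter()
    for rgb, (a, b) in pixels:                    # S310: select one pixel
        if inside(a, b, *SKIN):                   # S320: skin color pixel?
            counts["skin"] += 1
            out.append(rgb)                       # kept unchanged, S330 skipped
            continue
        for name, (ca, cb, r, rep) in GAMUTS.items():
            if inside(a, b, ca, cb, r):           # S330: matching gamut found
                counts[name] += 1
                out.append(rep)
                break
        else:
            counts["other"] += 1
            out.append(FALLBACK_RGB)
    return out, counts
```

The returned counter corresponds to the per-representative-value totals and the skin color pixel total that the conversion unit 21 records in the internal memory 12.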

In S400 (FIG. 2), the conversion unit 21 converts the image data DR processed in S300 into a gray image. That is, the conversion unit 21 converts the RGB data of each pixel of the image data DR into a luminance value Y (0 to 255) and generates image data DR as a monochrome image having one luminance value Y per pixel. The luminance value Y can generally be obtained by adding R, G, and B with predetermined weights. In this embodiment, S200 to S400 are called preprocessing. However, S200 (reduction of the image data D) is not essential to the preprocessing. When S200 is not executed, the conversion unit 21 executes S300 and S400, and further S500 and S600 described later, on the image data D. S400 (conversion of the image data DR or the image data D into a gray image) is performed in advance for the convenience of the object detection processing described later, but it is likewise not essential to the preprocessing and may be skipped.
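The weighted addition of S400 can be sketched as below. The description does not give the weights; the ITU-R BT.601 coefficients used here are a common choice and are an assumption, as is the function name `to_luminance`.

```python
# Assumed weights (ITU-R BT.601); the description only says "predetermined weights".
WEIGHT_R, WEIGHT_G, WEIGHT_B = 0.299, 0.587, 0.114


def to_luminance(r, g, b):
    """Return the luminance value Y (0 to 255) of one 8-bit RGB pixel."""
    y = WEIGHT_R * r + WEIGHT_G * g + WEIGHT_B * b
    return max(0, min(255, round(y)))


def to_gray_image(rgb_rows):
    """Convert a list of rows of (R, G, B) tuples into rows of luminance values."""
    return [[to_luminance(*px) for px in row] for row in rgb_rows]
```

Because every replaced area carries a single representative color, it also maps to a single luminance value, so its variation stays small after the conversion.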

2-2. Determining whether object presence determination is necessary:
In S500, the object detection unit 20 executes object detection processing. In outline, the object detection unit 20 sets a detection window SW in the preprocessed image data DR (or the preprocessed image data D; the same applies hereinafter), obtains the variation of the pixel values within the detection window SW, and, when the variation is equal to or greater than a predetermined value, determines the presence or absence of the object (face image) in the image within the detection window SW. This processing is repeated for each detection window SW.
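The gating described for S500 can be illustrated as follows. This section does not define how the variation is measured, so the population variance of the luminance values is used here as an assumed measure, and `window_variation`, `needs_presence_check`, and the threshold value in the usage note are hypothetical.

```python
from statistics import pvariance


def window_variation(gray, left, top, size):
    """Variance of the luminance values inside a square detection window.

    `gray` is a list of rows of luminance values; the window covers
    columns left..left+size-1 and rows top..top+size-1.
    """
    values = [
        gray[y][x]
        for y in range(top, top + size)
        for x in range(left, left + size)
    ]
    return pvariance(values)


def needs_presence_check(gray, left, top, size, threshold):
    """True if the window should go on to the object presence determination."""
    return window_variation(gray, left, top, size) >= threshold
```

A window lying entirely in an area filled with one representative color has a variation of 0 and is skipped for any positive threshold, for example `needs_presence_check(gray, 0, 0, 20, 100.0)`.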

図５は、Ｓ５００の詳細をフローチャートにより示している。
Ｓ５１０では、オブジェクト検出部２０の検出窓設定部２２が、前処理後の画像データＤＲにおいて検出窓ＳＷを１つ設定する。検出窓ＳＷの設定方法は特に限られないが、検出窓設定部２２は一例として、以下のように検出窓ＳＷを設定する。
図６は、画像データＤＲにおいて検出窓ＳＷを設定する様子を示している。検出窓設定部２２は、１回目のＳ５１０では、画像内の先頭位置（例えば、画像の左上の角位置）に複数の画素を含む所定の大きさの矩形状の検出窓ＳＷ（２点鎖線）を設定する。検出窓設定部２２は、２回目以降のＳ５１０の度に、それまで検出窓ＳＷを設定していた位置から検出窓ＳＷを画像の横方向およびまたは縦方向に所定距離（所定画素数分）移動させ、移動先の位置において検出窓ＳＷを新たに１つ設定する。検出窓設定部２２は、検出窓ＳＷの大きさを維持した状態で画像データＤＲの最終位置（例えば、画像の右下の角位置）まで検出窓ＳＷを移動させながら繰り返し検出窓ＳＷを設定したら、先頭位置に戻って検出窓ＳＷを設定する。 FIG. 5 is a flowchart showing details of S500.
In S510, the detection window setting unit 22 of the object detection unit 20 sets one detection window SW in the preprocessed image data DR. The setting method of the detection window SW is not particularly limited, but the detection window setting unit 22 sets the detection window SW as follows as an example.
FIG. 6 shows how the detection window SW is set in the image data DR. In the first S510, the detection window setting unit 22 sets a rectangular detection window SW (two-dot chain line) of a predetermined size, containing a plurality of pixels, at the head position in the image (for example, the upper left corner of the image). In each second and subsequent S510, the detection window setting unit 22 moves the detection window SW a predetermined distance (a predetermined number of pixels) in the horizontal and/or vertical direction of the image from the position where it had been set, and sets one new detection window SW at the destination. Once the detection window setting unit 22 has repeatedly set the detection window SW while moving it, at a fixed size, to the final position of the image data DR (for example, the lower right corner of the image), it returns to the head position and sets the detection window SW there.

検出窓設定部２２は、検出窓ＳＷを先頭位置に戻した場合には、それまでよりも矩形の大きさを縮小した検出窓ＳＷを設定する。その後、検出窓設定部２２は上記と同様に、検出窓ＳＷの大きさを維持した状態で画像データＤＲの最終位置まで検出窓ＳＷを移動させつつ、各位置において検出窓ＳＷ設定する。検出窓設定部２２は、検出窓ＳＷの大きさを予め決められた回数だけ段階的に縮小しながら、このような検出窓ＳＷの移動と設定を繰り返す。ただし本実施形態では、検出窓ＳＷを移動させる過程で画像データＤＲのあらゆる領域を検出窓ＳＷの設定対象とするのではなく、一部の領域に関しては避けて検出窓ＳＷを設定することもある。このようにＳ５１０において検出窓ＳＷが１つ設定される度に、Ｓ５２０以降の処理が行なわれる。 When the detection window SW is returned to the head position, the detection window setting unit 22 sets a detection window SW whose rectangle is smaller than before. Thereafter, in the same manner as described above, the detection window setting unit 22 sets the detection window SW at each position while moving it, at a fixed size, to the final position of the image data DR. The detection window setting unit 22 repeats this movement and setting of the detection window SW while reducing the size of the detection window SW stepwise a predetermined number of times. In the present embodiment, however, not every area of the image data DR is necessarily a target for setting the detection window SW as it is moved; the detection window SW may be set so as to avoid some areas. Each time one detection window SW is set in S510 in this way, the processing from S520 onward is performed.
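The window-setting order described for S510 can be sketched as a generator. This is an illustrative sketch only: square windows, raster order from the upper left corner, and a multiplicative shrink factor are assumptions, as are the names `detection_windows`, `step`, `shrink`, and `passes`; the description only requires a rectangle of predetermined size that is moved by a predetermined number of pixels and reduced a predetermined number of times.

```python
def detection_windows(width, height, start_size, step, shrink, passes):
    """Yield (x, y, size) for each window position, largest window size first.

    passes is the number of window sizes to scan (assumed parameter).
    """
    size = start_size
    for _ in range(passes):
        if size < 1:
            break
        # Raster scan at a fixed size, from the head position to the final position.
        for y in range(0, height - size + 1, step):
            for x in range(0, width - size + 1, step):
                yield (x, y, size)
        # Return to the head position with a smaller window.
        size = int(size * shrink)
```

Each yielded tuple corresponds to one execution of S510; the exclusion of areas already judged unnecessary is not modeled here.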

Ｓ５２０では、要否判断部２３が、直近のＳ５１０で前処理後の画像データＤＲに対して設定された検出窓ＳＷ内の画素値のばらつきを算出する。ばらつきの算出対象となる画素値は各画素のＲＧＢでもよいが、本実施形態では前処理において基本的にグレー画像への変換が行なわれているため、各画素の輝度値Ｙのばらつきを算出する。要否判断部２３が算出するばらつきとしては、例えば、検出窓ＳＷ内のコントラストの幅（検出窓ＳＷ内の最大輝度値と最小輝度値との差）が該当する。あるいは、要否判断部２３は、検出窓ＳＷ内の輝度値Ｙの分散σ²や、輝度値Ｙの標準偏差σを求め、これら分散σ²や、標準偏差σをばらつきとしてもよい。分散σ²は、次式（１）によって求めることができる。

ｎは、検出窓ＳＷ内の画素数である。Ｙ_iは、ｎ個の画素中のｉ番目の画素の輝度値Ｙであり、ｍは、Ｙ_iの平均値である。 In S520, the necessity determination unit 23 calculates the variation of the pixel values within the detection window SW set in the latest S510 on the preprocessed image data DR. The pixel values used for this calculation may be the RGB values of each pixel, but because conversion to a gray image is basically performed in the preprocessing of this embodiment, the variation of the luminance value Y of each pixel is calculated. One example of the variation calculated by the necessity determination unit 23 is the contrast width within the detection window SW (the difference between the maximum and minimum luminance values in the detection window SW). Alternatively, the necessity determination unit 23 may obtain the variance σ² or the standard deviation σ of the luminance values Y in the detection window SW and use either as the variation. The variance σ² can be obtained by the following equation (1).

$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - m\right)^2 \qquad (1)$$

Here, n is the number of pixels in the detection window SW, Y_i is the luminance value Y of the i-th of the n pixels, and m is the average of the Y_i.
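The variation measures named for S520 can be sketched as below. The population form of the variance (division by n) is an assumption consistent with the definitions of n, Y_i, and m; the function names are hypothetical.

```python
import math


def contrast_width(values):
    """Difference between the maximum and minimum luminance in the window."""
    return max(values) - min(values)


def variance(values):
    """Variance of the luminance values (assumed population form, divide by n)."""
    n = len(values)
    m = sum(values) / n  # m: average of the Y_i
    return sum((y - m) ** 2 for y in values) / n


def std_dev(values):
    """Standard deviation, the square root of the variance."""
    return math.sqrt(variance(values))
```

A window filled with a single representative value gives a variation of 0 under all three measures, which matches the behavior described for FIG. 7A.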

図７Ａは、前処理後の画像データＤＲの一部領域であって、共通の代表値に置き換えられた非肌色画素からなる領域を示している。図７Ｂは、前処理後の画像データＤＲの一部領域であって、肌色画素からなる領域を示している。図７Ａ、図７Ｂはいずれも各画素の輝度値Ｙを示している。図７から明らかなように、１つの代表値の色で塗りつぶされている領域内では輝度値Ｙの変化は無く、輝度値Ｙのばらつきは０となる。一方、肌色画素が集まる領域では、各画素の輝度値Ｙは変化に富んでおり、輝度値Ｙのばらつきもある程度大きな値となる。従って、検出窓ＳＷ内の上記ばらつきの値が大きいほど、検出窓ＳＷ内には肌色画素が多く含まれていると判断することができる。 FIG. 7A shows a partial area of the preprocessed image data DR, which is an area composed of non-skin color pixels replaced with a common representative value. FIG. 7B shows a partial area of the preprocessed image data DR, which is an area composed of skin color pixels. 7A and 7B show the luminance value Y of each pixel. As is clear from FIG. 7, there is no change in the luminance value Y within the area painted with one representative value color, and the variation in the luminance value Y is zero. On the other hand, in the region where the skin color pixels gather, the luminance value Y of each pixel is rich in change, and the variation of the luminance value Y is also somewhat large. Therefore, it can be determined that the larger the variation value in the detection window SW, the more skin color pixels are included in the detection window SW.

そこでＳ５３０では、要否判断部２３は、直近のＳ５２０で算出されたばらつき（例えば、分散σ²）と、所定のしきい値Ｔｈとを比較する。そして、ばらつきがしきい値Ｔｈ以上である場合には、Ｓ５４０に進む。すなわち、ばらつきがしきい値Ｔｈ以上であれば、検出窓ＳＷに肌色画素が多く存在し顔画像の有無を判定する必要性がある（検出窓ＳＷが顔画像を含んでいる可能性が有る）と言えるため、要否判断部２３はＳ５４０に進む。一方、要否判断部２３は、ばらつきがしきい値Ｔｈ未満である場合には、Ｓ５４０，Ｓ５５０をスキップしてＳ５６０に進む。すなわち、ばらつきがしきい値Ｔｈ未満であれば、検出窓ＳＷに肌色画素が存在していないか存在していても非常に少なく、そのため顔画像の有無を判定する必要性が無いと言えるため、要否判断部２３はＳ５６０に進む。本実施形態では、例えば、サンプル用の複数の顔画像毎に算出した輝度値の分散σ²より小さく、かつ上記代表値（単数あるいは複数の代表値）で塗りつぶされた複数のサンプル用の画像毎に算出した輝度値の分散σ²よりも大きい値を、予め実験によって求め、当該求めた値をしきい値Ｔｈとして内部メモリ１２等に記録し、かかるしきい値Ｔｈを要否判断部２３が利用するようにしている。あるいは、ユーザが操作部１４を操作することにより、しきい値Ｔｈをプリンタ１０に与えるとしてもよい。 Therefore, in S530, the necessity determination unit 23 compares the variation calculated in the latest S520 (for example, the variance σ²) with a predetermined threshold Th. If the variation is equal to or greater than the threshold Th, the process proceeds to S540. That is, if the variation is equal to or greater than the threshold Th, many skin color pixels are present in the detection window SW and the presence or absence of a face image needs to be determined (the detection window SW may contain a face image), so the necessity determination unit 23 proceeds to S540. If the variation is less than the threshold Th, on the other hand, the necessity determination unit 23 skips S540 and S550 and proceeds to S560. That is, if the variation is less than the threshold Th, the detection window SW contains no skin color pixels or only very few, so there is no need to determine the presence or absence of a face image, and the necessity determination unit 23 proceeds to S560. In the present embodiment, for example, a value that is smaller than the luminance variance σ² calculated for each of a plurality of sample face images and larger than the luminance variance σ² calculated for each of a plurality of sample images filled with the representative value (one or more representative values) is obtained in advance by experiment; the obtained value is recorded as the threshold Th in the internal memory 12 or the like, and the necessity determination unit 23 uses this threshold Th. Alternatively, the user may supply the threshold Th to the printer 10 by operating the operation unit 14.
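The choice of the threshold Th and the S530 branch can be sketched as follows. The description only requires Th to lie below the variances of the sample face images and above those of the sample images filled with representative values; taking the midpoint of that gap is an assumption, and `choose_threshold` and `needs_face_check` are hypothetical names.

```python
def choose_threshold(face_variances, flat_variances):
    """Pick a value above every flat-image variance and below every face variance."""
    lower = max(flat_variances)
    upper = min(face_variances)
    if lower >= upper:
        raise ValueError("sample variances overlap; no separating threshold exists")
    # Assumed rule: the midpoint of the gap between the two sample groups.
    return (lower + upper) / 2.0


def needs_face_check(variation, th):
    """S530: proceed to S540 only when the variation is at least Th."""
    return variation >= th
```

A window whose variation equals Th exactly is still passed on to face determination, matching the "equal to or greater than" condition in the text.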

２‐３．オブジェクトの有無判定から印刷まで：
Ｓ５４０では、検出実行部２４が、直近のＳ５１０で設定された検出窓ＳＷ内の画像を対象として、顔画像の有無の判定（顔判定）を行なう。そして、顔画像が存在すると判定した場合にはＳ５５０に進み、顔画像が存在しないと判定した場合にはＳ５５０をスキップしてＳ５６０に進む。検出実行部２４はＳ５４０において、顔画像が存在するか否かを判定可能な手法であればあらゆる手法を採用可能であるが、本実施形態では一例として、ニューラルネットワークＮＮを利用した判定を行なう。 2-3. From object presence determination to printing:
In S540, the detection execution unit 24 determines the presence / absence of a face image (face determination) for the image in the detection window SW set in the latest S510. If it is determined that a face image exists, the process proceeds to S550. If it is determined that no face image exists, S550 is skipped and the process proceeds to S560. In S540, the detection execution unit 24 can employ any method as long as it can determine whether or not a face image exists. In this embodiment, for example, the detection execution unit 24 performs determination using the neural network NN.

図８は、検出実行部２４が実行するＳ５４０の詳細をフローチャートにより示している。検出実行部２４は、Ｓ５４１において、直近のＳ５１０で設定された検出窓ＳＷ内の画素からなる画像データ（窓画像データ）ＸＤを取得すると、Ｓ５４２において、窓画像データＸＤから複数の特徴量を算出する。これらの特徴量は、窓画像データＸＤに対して各種のフィルタを適用し、当該フィルタ内の輝度平均やエッジ量やコントラスト等の画像的特徴を示す特徴量（平均値や最大値や最小値や標準偏差等）を算出することにより得られる。 FIG. 8 is a flowchart showing details of S540 executed by the detection execution unit 24. In S541, the detection execution unit 24 acquires image data (window image data) XD composed of the pixels in the detection window SW set in the latest S510, and in S542 it calculates a plurality of feature amounts from the window image data XD. These feature amounts are obtained by applying various filters to the window image data XD and calculating values (average, maximum, minimum, standard deviation, and the like) that represent image characteristics within each filter, such as average luminance, edge amount, and contrast.

図９は、窓画像データＸＤから特徴量を算出する様子を示している。同図において、画像データＸＤとの相対的な大きさおよび位置が異なる多数のフィルタＦＴが用意されており、各フィルタＦＴを順次窓画像データＸＤに適用し、各フィルタＦＴ内の画像的特徴に基づいて、複数の特徴量ＣＡ，ＣＡ，ＣＡ…を算出する。特徴量ＣＡ，ＣＡ，ＣＡ…が算出できると、検出実行部２４は、Ｓ５４３において、特徴量ＣＡ，ＣＡ，ＣＡ…を、予め用意したニューラルネットワークＮＮに入力し、その出力として顔画像が存在する／しないの判定結果を算出する。 FIG. 9 shows how the feature amounts are calculated from the window image data XD. In the figure, a large number of filters FT that differ in size and position relative to the window image data XD are prepared; each filter FT is applied in turn to the window image data XD, and a plurality of feature amounts CA, CA, CA... are calculated based on the image characteristics within each filter FT. Once the feature amounts CA, CA, CA... have been calculated, the detection execution unit 24 inputs them in S543 into a neural network NN prepared in advance and obtains, as its output, a determination result indicating whether or not a face image exists.
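The feature calculation of S542 can be sketched as below, assuming each filter FT is an axis-aligned rectangle `(x, y, w, h)` inside the window and that the statistics taken are the mean, maximum, minimum, and standard deviation of luminance. The filter shapes and the exact statistics used in the embodiment are not specified, so both are assumptions, as is the name `filter_features`.

```python
import math


def filter_features(window, filters):
    """Return a flat list of feature amounts, four per filter.

    window  -- 2-D list of luminance values (window image data XD)
    filters -- list of (x, y, w, h) rectangles inside the window (assumed form)
    """
    features = []
    for (x, y, w, h) in filters:
        vals = [window[r][c] for r in range(y, y + h) for c in range(x, x + w)]
        mean = sum(vals) / len(vals)
        std = math.sqrt(sum((v - mean) ** 2 for v in vals) / len(vals))
        features.extend([mean, max(vals), min(vals), std])
    return features
```

The resulting list plays the role of the feature amounts CA, CA, CA... that are fed to the determination stage.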

図１０は、ニューラルネットワークＮＮの構造の一例を示している。ニューラルネットワークＮＮは、前段層のユニットＵの値の線形結合（添え字ｉは前段層のユニットＵの識別番号。）によって後段層のユニットＵの値が決定される基本構造を有している。さらに、線形結合によって得られた値をそのまま次の層のユニットＵの値としてもよいが、線形結合によって得られた値を例えばハイパボリックタンジェント関数のような非線形関数によって変換して次の層のユニットＵの値を決定することにより、非線形特性を与えてもよい。ニューラルネットワークＮＮは、最外の入力層と出力層と、これらに挟まれた中間層から構成されている。各特徴量ＣＡ，ＣＡ，ＣＡ…がニューラルネットワークＮＮの入力層に入力可能となっており、出力層では出力値Ｋ（０〜１に正規化された値）を出力することが可能となっている。Ｓ５４４では、検出実行部２４は、例えばニューラルネットワークＮＮの出力値Ｋが０．５以上であれば窓画像データＸＤに顔画像が存在することを示す値であると判定し、Ｓ５５０に進む。一方、検出実行部２４は、出力値Ｋが０．５未満であれば窓画像データＸＤに顔画像が存在しないことを示す値であると判定し、Ｓ５６０に進む。 FIG. 10 shows an example of the structure of the neural network NN. The neural network NN has a basic structure in which the value of the unit U in the subsequent layer is determined by a linear combination of the values of the unit U in the previous layer (the suffix i is the identification number of the unit U in the previous layer). Further, the value obtained by the linear combination may be used as the value of the unit U of the next layer as it is, but the value obtained by the linear combination is converted by a non-linear function such as a hyperbolic tangent function, for example. By determining the value of U, non-linear characteristics may be provided. The neural network NN is composed of an outermost input layer and output layer, and an intermediate layer sandwiched between them. Each feature quantity CA, CA, CA... Can be input to the input layer of the neural network NN, and an output value K (value normalized to 0 to 1) can be output from the output layer. Yes. In S544, for example, if the output value K of the neural network NN is 0.5 or more, the detection execution unit 24 determines that the value indicates that a face image exists in the window image data XD, and the process proceeds to S550. On the other hand, if the output value K is less than 0.5, the detection execution unit 24 determines that the face image data does not exist in the window image data XD, and proceeds to S560.
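The forward computation of S543 and the decision of S544 can be sketched for a network with one intermediate layer. The hyperbolic tangent in the intermediate layer follows the example in the text; the logistic function used to keep the output value K between 0 and 1 is an assumption, as are the names `forward` and `is_face` and the data layout of the weights.

```python
import math


def forward(features, hidden, output):
    """Compute the output value K.

    hidden -- list of (weights, bias) pairs, one per intermediate unit U
    output -- a single (weights, bias) pair for the output unit
    """
    # Linear combination of the previous layer, then tanh for nonlinearity.
    h = [math.tanh(sum(w * x for w, x in zip(ws, features)) + b)
         for ws, b in hidden]
    ws, b = output
    z = sum(w * v for w, v in zip(ws, h)) + b
    # Assumed normalization of K to the range 0..1.
    return 1.0 / (1.0 + math.exp(-z))


def is_face(k, cutoff=0.5):
    """S544: a value of 0.5 or more is treated as 'face image exists'."""
    return k >= cutoff
```

With all-zero features and zero biases the output is exactly 0.5, which falls on the "face exists" side of the comparison.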

図１１は、ニューラルネットワークＮＮを学習によって構築する様子を模式的に示している。本実施形態では、誤差逆伝搬（error back propagation）法によってニューラルネットワークＮＮの学習を行うことにより、各ユニットＵの数や、各ユニットＵ間における線形結合の際の重みｗの大きさやバイアスｂの値が最適化される。誤差逆伝搬法による学習においては、まず各ユニットＵ間における線形結合の際の重みｗの大きさやバイアスｂの値を適当な値に初期設定する。そして、顔画像が存在しているか否かが既知の学習用画像データについてＳ５４２，Ｓ５４３と同様の手順で特徴量ＣＡ，ＣＡ，ＣＡ…を算出し、当該特徴量ＣＡ，ＣＡ，ＣＡ…を初期設定されたニューラルネットワークＮＮに入力し、その出力値Ｋを取得する。本実施形態では、顔画像が存在している学習用画像データについては出力値Ｋとして１が出力されるのが望ましく、顔画像が存在していない学習用画像データについて出力値Ｋとして０が出力されるのが望ましい。しかしながら、各ユニットＵ間における線形結合の際の重みｗの大きさやバイアスｂの値を適当な値に初期設定したに過ぎないため、実際の出力値Ｋと理想的な値との間には誤差が生じることとなる。このような誤差を極小化させる各ユニットＵについての重みｗやバイアスｂを、勾配法等の数値最適化手法を用いて算出する。以上のような誤差は、後段の層から前段の層に伝搬され、後段のユニットＵについて重みｗやバイアスｂが順に最適化されていく。 FIG. 11 schematically shows how the neural network NN is constructed by learning. In the present embodiment, by learning the neural network NN by the error back propagation method, the number of units U, the size of the weight w at the time of linear combination between the units U, and the bias b are determined. The value is optimized. In learning by the back propagation method, first, the magnitude of the weight w and the value of the bias b at the time of linear combination between the units U are initially set to appropriate values. Then, for learning image data for which it is known whether or not a face image exists, feature amounts CA, CA, CA... Are calculated in the same procedure as S542 and S543, and the feature amounts CA, CA, CA. Input to the set neural network NN and obtain the output value K. In this embodiment, it is desirable that 1 is output as the output value K for learning image data in which a face image exists, and 0 is output as the output value K for learning image data in which no face image exists. It is desirable to be done. However, since the weight w and the value of the bias b at the time of linear combination between the units U are merely set to appropriate values, there is an error between the actual output value K and the ideal value. Will occur. 
The weight w and the bias b for each unit U that minimizes such an error are calculated using a numerical optimization method such as a gradient method. The error as described above is propagated from the subsequent layer to the previous layer, and the weight w and the bias b are sequentially optimized for the subsequent unit U.
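The learning procedure can be illustrated with a deliberately tiny network: one input, one tanh intermediate unit, and one logistic output, trained by plain gradient descent on squared error. This is a sketch under those assumptions only; it adjusts weights and biases but does not optimize the number of units, and the names `predict`, `loss`, and `train` and the learning rate are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(p, x):
    """Return (intermediate value h, output value K) for input x."""
    h = math.tanh(p["w1"] * x + p["b1"])
    return h, sigmoid(p["w2"] * h + p["b2"])


def loss(p, data):
    """Mean squared error between K and the ideal value (1 = face, 0 = no face)."""
    return sum((predict(p, x)[1] - t) ** 2 for x, t in data) / len(data)


def train(p, data, lr=0.5, epochs=200):
    for _ in range(epochs):
        for x, t in data:
            h, k = predict(p, x)
            # Error at the output unit.
            dz2 = 2.0 * (k - t) * k * (1.0 - k)
            # Error propagated back to the intermediate unit.
            dz1 = dz2 * p["w2"] * (1.0 - h * h)
            p["w2"] -= lr * dz2 * h
            p["b2"] -= lr * dz2
            p["w1"] -= lr * dz1 * x
            p["b1"] -= lr * dz1
    return p
```

Starting from arbitrary initial values, repeated updates of this kind reduce the gap between the actual output value K and the ideal value for the learning data.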

このような学習を複数の上記学習用画像データを用いて行なうことで最適化がなされたニューラルネットワークＮＮを、内部メモリ１２に予め用意しておくことにより、顔画像が窓画像データＸＤに存在するか否かを特徴量ＣＡ，ＣＡ，ＣＡ…に基づいて判定することが可能となる。
Ｓ５５０（図５）では、検出実行部２４は、直近のＳ５４０で顔画像が存在すると判定された検出窓ＳＷの位置（例えば、画像データＤＲ上における検出窓ＳＷの中心位置）および当該検出窓ＳＷの矩形の大きさを、内部メモリ１２の所定領域に記録する。このように検出窓ＳＷの位置や大きさを記録する行為が、顔画像の検出行為の一例に該当する。 A face image exists in the window image data XD by preparing in advance in the internal memory 12 a neural network NN that has been optimized by performing such learning using a plurality of learning image data. It can be determined based on the feature quantities CA, CA, CA.
In S550 (FIG. 5), the detection execution unit 24 records, in a predetermined area of the internal memory 12, the position of the detection window SW determined in the latest S540 to contain a face image (for example, the center position of the detection window SW on the image data DR) and the rectangular size of that detection window SW. Recording the position and size of the detection window SW in this way is one example of detecting a face image.

Ｓ５６０では、検出窓設定部２２が、図６を用いて説明した検出窓ＳＷの設定方法の思想の下、検出窓ＳＷを移動させ更にその大きさを縮小したりして未だ検出窓ＳＷを設定する余地があれば、Ｓ５１０に戻り、新たに検出窓ＳＷを画像データＤＲ上に１つ設定する。一方、検出窓ＳＷの縮小を上記予め決められた回数分重ね、可能な検出窓ＳＷの設定を全て終えた場合には、オブジェクト検出部２０は、Ｓ５００の処理を終える。
本実施形態では、１つの検出窓ＳＷを設定し（Ｓ５１０）、当該検出窓ＳＷに関して、上記ばらつきがしきい値Ｔｈ未満であると判定された場合（Ｓ５３０において“Ｎｏ”）には、当該検出窓ＳＷが画像データＤＲにおいて占めていた領域については、以降のＳ５１０において、検出窓ＳＷの設定対象とはしないものとする。つまり、一旦、顔判定を行なう必要性が無いと判断された領域については、それ以降は避けて検出窓ＳＷを設定することにより、できるだけ検出窓ＳＷの設定回数を減らし、処理の高速化を図っている。 In S560, the detection window setting unit 22 still sets the detection window SW by moving the detection window SW and further reducing its size under the concept of the detection window SW setting method described with reference to FIG. If there is room to do so, the process returns to S510, and one detection window SW is newly set on the image data DR. On the other hand, when the reduction of the detection window SW is repeated by the predetermined number of times and all the possible detection window SW settings are completed, the object detection unit 20 ends the process of S500.
In the present embodiment, when one detection window SW is set (S510) and the variation for that detection window SW is determined to be less than the threshold Th (“No” in S530), the area that the detection window SW occupied in the image data DR is excluded from the targets for setting the detection window SW in subsequent executions of S510. In other words, once an area has been judged not to require face determination, later detection windows SW are set so as to avoid it, which reduces the number of detection window SW settings as far as possible and speeds up the processing.
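The loop of S510 to S560, including the avoidance of areas already judged unnecessary, can be sketched as follows. How a later window is "avoided" is not detailed in the text; this sketch assumes that a candidate window is skipped only when it lies entirely inside a previously rejected window. The names `inside` and `scan`, and the callbacks `variation_of` and `classify`, are hypothetical.

```python
def inside(win, region):
    """True when square window (x, y, size) lies entirely within region."""
    x, y, s = win
    rx, ry, rs = region
    return rx <= x and ry <= y and x + s <= rx + rs and y + s <= ry + rs


def scan(windows, variation_of, th, classify):
    """Return (windows recorded as faces, windows rejected in S530)."""
    rejected, faces = [], []
    for win in windows:
        # Assumed avoidance rule: skip windows contained in a rejected area.
        if any(inside(win, r) for r in rejected):
            continue
        if variation_of(win) < th:      # "No" in S530
            rejected.append(win)
            continue
        if classify(win):               # S540
            faces.append(win)           # S550
    return faces, rejected
```

Under this rule, smaller windows in a later pass never reach the variation calculation if they fall within an area that an earlier, larger window already ruled out.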

Ｓ６００（図２）では、画像補正部３０の補正情報決定部３１が、入力画像に対する補正に用いられる補正情報を決定する。入力画像に対する補正とは、例えば、明るさ補正や、コントラスト補正や、彩度補正や、特定の記憶色に対する補正などが該当する。本実施形態では、補正情報決定部３１は、Ｓ５００において顔画像が検出された場合には、少なくとも当該顔画像に基づいて補正情報を決定する。具体的には、補正情報決定部３１は、内部メモリ１２に、顔画像として検出された検出窓ＳＷの位置および大きさの情報が記録されている場合には、画像データＤＲからこの検出窓ＳＷの位置および大きさの情報が示す範囲の画像データ（顔画像データと呼ぶ。）を抽出する。顔画像データの抽出対象となる画像データＤＲは、前処理後の画像データＤＲでもよいし、非肌色画素についての代表値への置き換えが行なわれる前のＳ２００直後の画像データＤＲであってもよい。補正情報決定部３１は、顔画像データに基づいて補正情報（補正パラメータ）を算出する。例えば、補正情報決定部３１は、顔画像データ内の輝度の平均値を算出し、当該平均値と、所定の目標値との差分を算出し、当該算出した差分を補正情報とする。 In S600 (FIG. 2), the correction information determination unit 31 of the image correction unit 30 determines correction information used for correction of the input image. Examples of correction for an input image include brightness correction, contrast correction, saturation correction, and correction for a specific memory color. In the present embodiment, when a face image is detected in S500, the correction information determination unit 31 determines correction information based on at least the face image. Specifically, when the position and size information of the detection window SW detected as a face image is recorded in the internal memory 12, the correction information determination unit 31 uses the detection window SW from the image data DR. The image data in the range indicated by the position and size information (referred to as face image data) is extracted. The image data DR from which face image data is to be extracted may be pre-processed image data DR, or image data DR immediately after S200 before replacement with a representative value for a non-skin color pixel. . The correction information determination unit 31 calculates correction information (correction parameters) based on the face image data. 
For example, the correction information determination unit 31 calculates an average value of luminance in the face image data, calculates a difference between the average value and a predetermined target value, and uses the calculated difference as correction information.

Ｓ７００では、補正実行部３２が、Ｓ６００で決定された補正情報に基づいて、前処理が行なわれる前の画像データＤの少なくとも一部を補正する。例えば、補正情報が上述したような顔画像データ内の輝度の平均値と目標値との差分であれば、当該差分に相当する輝度を、画像データＤ上の顔画像データに対応する領域（画像データＤに対する位置および大きさが、画像データＤＲに対する顔画像データの位置および大きさの関係と等しい領域）の各画素に対して足す。その結果、画像データＤ上の顔画像の明るさを向上させることができる。また、上記差分の大きさに基づいてトーンカーブの湾曲度合いを決定し、当該トーンカーブを用いて画像データＤの各画素値を補正するとしてもよい。むろん、Ｓ６００で決定する補正情報の種類やＳ７００で行なう補正の種類は上述したものに限られない。 In S700, the correction execution unit 32 corrects at least a part of the image data D as it was before the preprocessing, based on the correction information determined in S600. For example, if the correction information is the difference between the average luminance in the face image data and the target value as described above, the luminance corresponding to that difference is added to each pixel of the area on the image data D that corresponds to the face image data (the area whose position and size relative to the image data D equal the position and size of the face image data relative to the image data DR). As a result, the brightness of the face image in the image data D can be improved. Alternatively, the degree of curvature of a tone curve may be determined based on the magnitude of the difference, and each pixel value of the image data D may be corrected using that tone curve. Of course, the type of correction information determined in S600 and the type of correction performed in S700 are not limited to those described above.
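The brightness example of S600 and S700 can be sketched on a single luminance plane. Several points are assumptions made for illustration: the offset is taken as target minus average so that a dark face is brightened, the correction is applied to luminance rather than to RGB data, and the names `brightness_offset`, `scale_region`, and `apply_offset` are hypothetical.

```python
def brightness_offset(face_luma, target):
    """S600: correction information as the gap between target and average luminance."""
    mean = sum(face_luma) / len(face_luma)
    return target - mean


def scale_region(region, dr_size, d_size):
    """Map (x, y, w, h) on the reduced image DR to the same relative area on D."""
    x, y, w, h = region
    sx = d_size[0] / dr_size[0]
    sy = d_size[1] / dr_size[1]
    return (round(x * sx), round(y * sy), round(w * sx), round(h * sy))


def apply_offset(luma, region, offset):
    """S700: add the offset to every pixel in the region, clamped to 0..255."""
    x, y, w, h = region
    out = [row[:] for row in luma]
    for r in range(y, y + h):
        for c in range(x, x + w):
            out[r][c] = max(0, min(255, round(out[r][c] + offset)))
    return out
```

Pixels outside the mapped face area are left unchanged, so only the face region of the image data D is brightened in this example.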

さらにＳ７００では、上述した補正に加え、あるいは入力画像から顔画像が検出されなかった場合には単独で、以下のように入力画像のシーンに応じた補正を行うとしてもよい。上述したように変換部２１は、代表値に置き換えた画素について、代表値の種類別に画素数（累計数）を内部メモリ１２に記録しており、また肌色画素の累計数も内部メモリ１２に記録している。そこで補正実行部３２は、かかる累計数に基づいて入力画像のシーンを判定し、判定したシーンに応じた補正を行なう。例えば、ある１つの色域の代表値に置き換えられた画素数が、画像データＤＲの全画素数に対する一定の割合（例えば５０％）を超える場合には、当該１つの色域の色の傾向が強いシーンであると判定し、当該判定したシーンに対応した補正を行なう。例えば、補正実行部３２は、空色域の代表値に置き換えられた画素数が上記一定割合を超えている場合には、入力画像が青空や海を多く含むシーンを撮影した画像であるとみなし、青空や海のブルーの発色度合いを上げるような色補正を画像データＤに対して行なう。また補正実行部３２は、肌色画素の累計数と各代表値毎の累計数との比率に応じて、上記顔画像データから算出した補正情報に基づく補正の度合いを強めたり弱めたりする（肌色画素の累計数が少ない場合には、補正情報に基づく補正の度合いを弱める）としてもよい。 Further, in S700, in addition to the above-described correction, or when a face image is not detected from the input image, correction according to the scene of the input image may be performed alone as follows. As described above, the conversion unit 21 records the number of pixels (cumulative number) for each representative value type in the internal memory 12 for the pixel replaced with the representative value, and also records the cumulative number of flesh color pixels in the internal memory 12. is doing. Therefore, the correction execution unit 32 determines the scene of the input image based on the cumulative number, and performs correction according to the determined scene. For example, when the number of pixels replaced with a representative value of a certain color gamut exceeds a certain ratio (for example, 50%) with respect to the total number of pixels of the image data DR, the color tendency of the one color gamut is It is determined that the scene is strong, and correction corresponding to the determined scene is performed. 
For example, when the number of pixels replaced with the representative value of the sky color gamut exceeds the certain ratio, the correction execution unit 32 regards the input image as an image obtained by capturing a scene including a lot of blue sky and the sea, Color correction is performed on the image data D so as to increase the degree of color development of blue sky and sea blue. The correction execution unit 32 increases or decreases the degree of correction based on the correction information calculated from the face image data according to the ratio between the cumulative number of skin color pixels and the cumulative number for each representative value (skin color pixel). If the cumulative number of the images is small, the degree of correction based on the correction information may be weakened).
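The scene determination based on the recorded pixel counts can be sketched as below. The counts are assumed to be available as a mapping from a color gamut name to the number of pixels replaced with that gamut's representative value; the names `dominant_scene` and `counts` and the gamut labels are hypothetical, and the 50% ratio is the example value from the text.

```python
def dominant_scene(counts, total_pixels, ratio=0.5):
    """Return the gamut whose pixel count exceeds ratio * total_pixels, else None."""
    if not counts:
        return None
    name = max(counts, key=counts.get)
    return name if counts[name] > ratio * total_pixels else None
```

A result such as `"sky"` would then select the corresponding correction, for example strengthening the blue of sky and sea in the image data D.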

このようにオブジェクトを検出するまでの処理過程で得られた情報に基づいて、画像補正時のシーン判定を行うことができる。そのため、シーン判定のために必要な情報を画像補正部３０が別途生成する必要もなくなり、画像補正処理も高速化される。
Ｓ８００では、印刷処理部５０が、プリンタエンジン１６を制御して、入力画像の印刷を行う。すなわち印刷処理部５０は、補正が施された後の画像データＤに、解像度変換処理や色変換処理やハーフトーン処理など必要な各処理を施して印刷データを生成する。生成された印刷データは、印刷処理部５０からプリンタエンジン１６に供給され、プリンタエンジン１６は印刷データに基づいた印刷を実行する。これにより、入力画像の印刷が完了する。 In this way, scene determination at the time of image correction can be performed based on the information obtained in the process until the object is detected. This eliminates the need for the image correction unit 30 to separately generate information necessary for scene determination, and speeds up the image correction process.
In S800, the print processing unit 50 controls the printer engine 16 to print the input image. That is, the print processing unit 50 performs necessary processes such as resolution conversion process, color conversion process, and halftone process on the corrected image data D to generate print data. The generated print data is supplied from the print processing unit 50 to the printer engine 16, and the printer engine 16 executes printing based on the print data. Thereby, the printing of the input image is completed.

このように本実施形態によれば、入力画像に対して検出窓ＳＷを設定して検出窓ＳＷ毎にオブジェクト（顔画像）の有無を判定する処理の前処理として、入力画像の画素について肌色域に属するか否か判定し、非肌色画素については画素値を代表値に置き換えることにより、肌色域に属する画像領域以外の画像領域を代表値で塗りつぶすとした。そして、前処理を施した入力画像上において検出窓ＳＷを設定し、検出窓ＳＷ内における画素値のばらつきの大きさを評価し、ばらつきがしきい値Ｔｈ未満である場合には、当該検出窓ＳＷについては顔判定を行なうことなく、入力画像上の他の位置に検出窓ＳＷを新たに設定するとした。すなわち、入力画像上の各箇所のうちオブジェクトの有無を判定する必要性の無い箇所については、当該判定を行なう対象から外すようにした。そのため、オブジェクトの検出精度を落とすことなく、入力画像において検出窓ＳＷの設定とオブジェクトの有無の判定とを繰り返す処理の全体量を大幅に減らすことができ、その結果、オブジェクト検出処理が非常に高速化される。 As described above, according to the present embodiment, as preprocessing for the process of setting detection windows SW on the input image and determining the presence or absence of an object (face image) for each detection window SW, each pixel of the input image is judged as to whether it belongs to the skin color gamut, and the pixel values of non-skin color pixels are replaced with representative values, so that image areas other than those belonging to the skin color gamut are filled with the representative values. A detection window SW is then set on the preprocessed input image and the magnitude of the pixel value variation within the detection window SW is evaluated; when the variation is less than the threshold Th, no face determination is performed for that detection window SW and a new detection window SW is set at another position on the input image. That is, locations on the input image where there is no need to determine the presence or absence of an object are excluded from the determination. Therefore, the total amount of processing that repeats the setting of the detection window SW and the determination of the presence or absence of an object in the input image can be greatly reduced without lowering the object detection accuracy, and as a result the object detection processing becomes much faster.

３．変形例：
本実施形態において、色域定義情報１４ｂが定義する肌色域は１つに限られない。色域定義情報１４ｂは、広さが異なる複数種類の肌色域を予め定義した情報であってもよい。そして変換部２１は、入力画像の特性に応じて、Ｓ３２０（図３）の判定に用いる肌色域を変更するとしてもよい。この場合、変換部２１は入力画像の解析を所定のタイミングで行なう。例えば変換部２１は、Ｓ２００とＳ３００（図２）との間のタイミングにおいて、画像データＤＲのＲ，Ｇ，Ｂ毎の度数分布（ヒストグラム）を生成する。そして、Ｒ，Ｇ，Ｂそれぞれのヒストグラムにおいて特徴量、例えば平均値（メジアンや最大分布値であってもよい。）Ｒave，Ｇave，Ｂaveを算出し、これら平均値Ｒave，Ｇave，Ｂave間の相対的な位置関係に基づいて、入力画像がいわゆる色かぶり画像であるか否かを判定する。 3. Variations:
In the present embodiment, the skin color gamut defined by the color gamut definition information 14b is not limited to one. The color gamut definition information 14b may be information that defines in advance a plurality of types of skin color gamuts of different extents. The conversion unit 21 may then change the skin color gamut used for the determination in S320 (FIG. 3) according to the characteristics of the input image. In this case, the conversion unit 21 analyzes the input image at a predetermined timing. For example, the conversion unit 21 generates a frequency distribution (histogram) for each of R, G, and B of the image data DR at a timing between S200 and S300 (FIG. 2). It then calculates a feature value from each of the R, G, and B histograms, for example the average values Rave, Gave, and Bave (the median or the maximum distribution value may be used instead), and determines whether or not the input image is a so-called color cast image based on the relative positional relationship among the average values Rave, Gave, and Bave.

図１２Ａ、図１２Ｂ、図１２Ｃはそれぞれ、画像データＤＲにおけるＲ，Ｇ，Ｂ毎のヒストグラムの例を示している。例えば、変換部２１は、平均値Ｒave，Ｇave，Ｂave間の差分｜Ｒave−Ｇave｜、｜Ｒave−Ｂave｜、｜Ｂave−Ｇave｜のうち、｜Ｒave−Ｇave｜および｜Ｒave−Ｂave｜が｜Ｂave−Ｇave｜よりもある値以上の差をもって大きく、かつＲave＞Ｇave、かつＲave＞Ｂaveであれば、入力画像はいわゆる赤かぶりあるいはオレンジかぶり（色かぶりの一種）の状態であると判定できる。また色変換部２１は、入力画像がいわゆる逆光画像であるか否かも入力画像を解析して判定する。変換部２１は、例えば、画像データＤＲの輝度のヒストグラムを生成し、輝度のヒストグラムの形状を解析することにより、逆光画像か否かの判定を行なう。例えば、輝度のヒストグラムにおいて、低輝度側の所定の輝度範囲と高輝度側の所定の輝度範囲とに分かれて２つの分布の山が存在し、２つの山をそれぞれ構成する画素数が所定の基準数を超えている場合に逆光画像であると判定する。あるいは、逆光画像においては画像の中央部が暗くなっていることが多いため、変換部２１は、画像データＤＲ上の所定の中央領域から画素をサンプリングし、このサンプリングした画素の平均輝度がある基準値より低い場合に、入力画像は逆光画像であると判定してもよい。入力画像が、色かぶり画像であるか否かの判定手法および、逆光画像であるか否かの判定手法は、上述した手法に限られない。 12A, 12B, and 12C show examples of histograms for R, G, and B in the image data DR, respectively. For example, the conversion unit 21 may calculate | Rave-Gave | and | Rave-Bave | out of the differences | Rave-Gave |, | Rave-Bave |, | Bave-Gave | between the average values Rave, Gave, and Bave. If Bave−Gave | is greater than a certain value by a certain difference, and Rave> Gave and Rave> Bave, it can be determined that the input image is in a so-called red or orange fog (a type of color fog). The color conversion unit 21 also determines whether the input image is a so-called backlight image by analyzing the input image. For example, the converter 21 determines whether or not the image is a backlight image by generating a luminance histogram of the image data DR and analyzing the shape of the luminance histogram. For example, in a luminance histogram, there are two distribution peaks divided into a predetermined luminance range on the low luminance side and a predetermined luminance range on the high luminance side, and the number of pixels constituting each of the two peaks is a predetermined reference. When the number is exceeded, it is determined that the image is a backlight image. 
Alternatively, because the central portion of a backlight image is often dark, the conversion unit 21 may sample pixels from a predetermined central area of the image data DR and determine that the input image is a backlight image when the average luminance of the sampled pixels is lower than a certain reference value. The methods for determining whether the input image is a color cast image and whether it is a backlight image are not limited to those described above.
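The color cast and backlight determinations described above can be sketched as simple predicates. The parameter names (`margin`, `low_max`, `high_min`, `min_count`, `reference`) and the function names are hypothetical, and the concrete values for the "certain value", the luminance ranges, and the reference counts are not given in the text, so callers must supply them.

```python
def is_red_cast(r_ave, g_ave, b_ave, margin):
    """Red or orange cast: |R-G| and |R-B| both exceed |B-G| by at least margin,
    and R is the largest of the three averages."""
    rg = abs(r_ave - g_ave)
    rb = abs(r_ave - b_ave)
    bg = abs(b_ave - g_ave)
    return (rg - bg >= margin and rb - bg >= margin
            and r_ave > g_ave and r_ave > b_ave)


def is_backlit_histogram(luma, low_max, high_min, min_count):
    """Backlight: both the dark range and the bright range hold more than min_count pixels."""
    low = sum(1 for y in luma if y <= low_max)
    high = sum(1 for y in luma if y >= high_min)
    return low > min_count and high > min_count


def is_backlit_center(center_luma, reference):
    """Backlight: average luminance sampled from the central area is below reference."""
    return sum(center_luma) / len(center_luma) < reference
```

When any of these predicates holds, the wider skin color gamut would be selected for the determination in S320.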

図１３は、色域定義情報１４ｂがＬａｂ表色系のａｂ平面において定義する２つの肌色域Ａ１Ｓ（第一の色域）および肌色域Ａ１Ｌ（第二の色域）を例示している。肌色域Ａ１Ｓの範囲は、図４に示した肌色域Ａ１と同じである。一方、肌色域Ａ１Ｌの範囲は、肌色域Ａ１Ｓの全体を含み、Ｌａｂ表色系における明度軸（グレー軸）を含み、さらに肌色域Ａ１Ｓよりもａ軸の＋側（赤方向）およびｂ軸の＋側（黄色方向）にそれぞれ拡大された範囲である。変換部２１は、入力画像が色かぶり画像または逆光画像であると判定した場合は、Ｓ３２０において、色域定義情報１４ｂが定義する肌色域Ａ１Ｓ，Ａ１Ｌのうち肌色域Ａ１Ｌを選択し、肌色域Ａ１Ｌに色が属する画素を肌色画素であると判定する。このように入力画像が色かぶり画像であったり逆光画像である場合に自動的に肌色域を大きな範囲に変更することにより、Ｓ３３０において代表値へ置き換えられる画素数が減り、Ｓ５００において顔判定が行なわれる検出窓ＳＷの数が増える（Ｓ５３０において“Ｙｅｓ”と判定される検出窓ＳＷの数が増える）。 FIG. 13 illustrates two skin color gamuts A1S (first color gamut) and skin color gamut A1L (second color gamut) defined by the color gamut definition information 14b in the ab plane of the Lab color system. The range of the skin color gamut A1S is the same as the skin color gamut A1 shown in FIG. On the other hand, the range of the skin color gamut A1L includes the entire skin color gamut A1S, includes the lightness axis (gray axis) in the Lab color system, and further, the + side (red direction) of the a axis and the b axis of the skin color gamut A1S. It is the range expanded to the + side (yellow direction). If the conversion unit 21 determines that the input image is a color cast image or a backlight image, in S320, the conversion unit 21 selects the skin color gamut A1L from the skin color gamuts A1S and A1L defined by the color gamut definition information 14b, and the skin color gamut A1L. The pixel to which the color belongs is determined to be a skin color pixel. As described above, when the input image is a color cast image or a backlight image, the skin color gamut is automatically changed to a large range, thereby reducing the number of pixels replaced with the representative value in S330 and performing face determination in S500. The number of detection windows SW that are detected increases (the number of detection windows SW that are determined as “Yes” in S530 increases).

As a result, a face image is detected accurately even when the face appears in the input image in a color more orange than a typical skin color (the face has a red or orange cast) or in a color of lower saturation than a typical skin color (the face is backlit). The printer 10 may decide which of the skin color gamuts A1S and A1L to use in S320 according to an instruction that the user inputs by operating the operation unit 14. When the user has set the printer 10 to a mode that gives priority to speed (high-speed printing mode), the printer 10 may select the skin color gamut A1S. The skin color gamuts defined by the color gamut definition information 14b are not limited to the two types described above. For example, a skin color gamut to be selected when the input image is a color cast image and a skin color gamut to be selected when the input image is a backlight image may be defined separately.

A face determination method that the detection execution unit 24 can execute in S540 other than the method using the neural network NN will now be described.
FIG. 14 schematically illustrates an example of such a face determination method performed by the detection execution unit 24. The example shown in FIG. 14 uses a determination means in which a plurality of determiners J, J, ... are connected in a multi-stage cascade. The determination means composed of the determiners J may be a physical device, or it may be a program having the determination functions described below that correspond to the determiners J. Each determiner J receives one or more feature amounts CA, CA, CA, ... of a different type (for example, obtained with a different filter) from the window image data XD subject to face determination, and outputs a positive or negative determination. Each determiner J has its own determination algorithm, such as a magnitude comparison between feature amounts CA or a threshold test, and independently determines whether the window image data XD is face-like (positive) or not face-like (negative). Each determiner J in a given stage is connected to the positive output of the determiner J in the preceding stage, and executes its determination only when the output of the preceding stage was positive. As soon as any stage outputs a negative result, the face determination ends and a determination that no face image exists is output ("No" in S540). When the determiners J in all stages output positive results, the face determination ends and a determination that a face image exists is output ("Yes" in S540).
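The early-exit behavior of the cascade can be sketched as follows. The threshold-type determiners, their indices and thresholds, and the function names are assumptions chosen for illustration; the specification does not fix the determination algorithm of each determiner J.

```python
def make_threshold_stage(index, threshold):
    """Build a determiner that is positive when features[index] >= threshold."""
    def stage(features):
        return features[index] >= threshold
    return stage

def cascade_decide(stages, features):
    """Return (is_face, stages_evaluated). Evaluation stops at the first negative stage."""
    for evaluated, stage in enumerate(stages, start=1):
        if not stage(features):
            return False, evaluated
    return True, len(stages)

# Three assumed stages, each testing a different feature amount.
stages = [
    make_threshold_stage(0, 0.5),
    make_threshold_stage(1, 0.2),
    make_threshold_stage(2, 0.8),
]
```

Because later stages run only after every earlier stage has output positive, most non-face windows are rejected after only one or two inexpensive tests.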

FIG. 15 shows the determination characteristics of the determination means. The figure shows the feature amount space defined by axes for the feature amounts CA, CA, CA, ... used by the determiners J, J, ... described above, and plots the coordinates in that space given by the combinations of feature amounts CA, CA, CA, ... obtained from window image data XD that is finally determined to contain a face image. Because window image data XD determined to contain a face image has certain characteristics, its coordinates can be expected to be distributed in a certain region of the feature amount space. Each determiner J generates a boundary plane in the feature amount space and outputs a positive result when the coordinates of the feature amounts CA, CA, CA, ... under determination lie in the space, of the spaces separated by the boundary plane, to which the distribution belongs. Connecting the determiners J, J, ... in a cascade therefore progressively narrows the space for which a positive result is output. With a plurality of boundary planes, the determination can be made accurately even for a distribution with a complicated shape.
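The narrowing of the positive region by successive boundary planes can be illustrated with a minimal sketch. It assumes linear boundary planes in a two-dimensional feature amount space; the plane coefficients and function names are hypothetical.

```python
def plane_side(normal, offset, features):
    """Positive when the point lies on the side where normal . x + offset >= 0."""
    return sum(n * x for n, x in zip(normal, features)) + offset >= 0

def inside_all(planes, features):
    """True only when the point is on the positive side of every boundary plane."""
    return all(plane_side(normal, offset, features) for normal, offset in planes)

# Four assumed planes enclosing the square 0 <= x <= 1, 0 <= y <= 1.
planes = [((1, 0), 0), ((-1, 0), 1), ((0, 1), 0), ((0, -1), 1)]
```

Each plane alone accepts a half-space; only the intersection of all four, here a square, yields a positive result, which corresponds to the region in which face samples are assumed to be distributed.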

The above description has shown an example in which the object detection apparatus and the object detection method of the present invention are embodied by the printer 10. However, the object detection apparatus and the object detection method may also be realized in a computer or in imaging equipment such as a digital still camera or a scanner. The present invention is not limited to devices that output image processing results on printing paper, such as printers; it can also be realized in devices that output image processing results on a display, such as photo viewers. The present invention can further be applied to an ATM (Automated Teller Machine) or similar device that performs person authentication. In addition, the face determination executed by the detection execution unit 24 may use any of various discrimination methods that operate in the feature amount space of the feature amounts described above. For example, a support vector machine may be used.

FIG. 1 is a block diagram showing a schematic configuration of the printer.
FIG. 2 is a flowchart showing processing executed by the printer.
FIG. 3 is a flowchart showing details of the process of replacement with a representative value.
FIG. 4 is a diagram showing the skin color gamut and other gamuts defined by the color gamut definition information.
FIG. 5 is a flowchart showing details of the object detection process.
FIG. 6 is a diagram showing how a detection window is set.
FIG. 7 is a diagram showing a partial region of the image data after preprocessing.
FIG. 8 is a flowchart showing face determination using a neural network.
FIG. 9 is a diagram showing how feature amounts are calculated from window image data.
FIG. 10 is a diagram showing an example of the structure of the neural network.
FIG. 11 is a diagram schematically showing how the neural network is trained.
FIG. 12 is a diagram showing histograms for each of R, G, and B.
FIG. 13 is a diagram showing a plurality of skin color gamuts defined by the color gamut definition information in a modification.
FIG. 14 is a diagram schematically showing a face determination process according to a modification.
FIG. 15 is a diagram showing determination characteristics of the face determination process according to the modification.

Explanation of symbols

10: printer, 11: CPU, 12: internal memory, 14b: color gamut definition information, 16: printer engine, 17: card I/F, 20: object detection unit, 21: conversion unit, 22: detection window setting unit, 23: necessity determination unit, 24: detection execution unit, 30: image correction unit, 31: correction information determination unit, 32: correction execution unit, 50: print processing unit, 172: card slot

Claims

1. An object detection apparatus for detecting a predetermined object from an input image, comprising:
a conversion unit that replaces, with a predetermined representative value, the pixel values of image areas of the input image other than an image area belonging to a color gamut corresponding to the object; and
a determination unit that sets a detection window on the input image on which the replacement has been performed, obtains a variation of the pixel values within the detection window, and, when the variation is equal to or greater than a predetermined value, determines the presence or absence of the object in the image within the detection window.

2. The object detection apparatus according to claim 1, wherein the conversion unit replaces the pixel values of the image areas other than the image area belonging to the color gamut corresponding to the object, for each image area having a different hue, with a representative value corresponding to a representative color of that image area.

3. The object detection apparatus according to claim 1, wherein the conversion unit analyzes the input image and changes the color gamut corresponding to the object based on the analysis result.

4. The object detection apparatus according to claim 3, wherein the conversion unit generates a histogram of pixel values of the input image and, when it determines from an analysis of the histogram that the input image is a color cast image or a backlight image, selects, as the color gamut corresponding to the object, the second color gamut from among a first color gamut and a second color gamut wider than the first color gamut.

5. The object detection apparatus according to claim 1, wherein the determination unit repeatedly sets the detection window on the input image while changing at least one of the position and the size of the detection window in the input image, and, when the variation obtained for a set detection window is less than the predetermined value, sets subsequent detection windows while avoiding the area of that detection window.

6. The object detection apparatus according to any one of claims 1 to 5, wherein the determination unit determines the presence or absence of the object by using a neural network that receives information on the image within the detection window and outputs information indicating the presence or absence of the object.

7. An object detection method for detecting a predetermined object from an input image, comprising:
a conversion step of replacing, with a predetermined representative value, the pixel values of image areas of the input image other than an image area belonging to a color gamut corresponding to the object; and
a determination step of setting a detection window on the input image on which the replacement has been performed, obtaining a variation of the pixel values within the detection window, and, when the variation is equal to or greater than a predetermined value, determining the presence or absence of the object in the image within the detection window.

8. An object detection program for causing a computer to execute processing for detecting a predetermined object from an input image, the program causing the computer to execute:
a conversion function of replacing, with a predetermined representative value, the pixel values of image areas of the input image other than an image area belonging to a color gamut corresponding to the object; and
a determination function of setting a detection window on the input image on which the replacement has been performed, obtaining a variation of the pixel values within the detection window, and, when the variation is equal to or greater than a predetermined value, determining the presence or absence of the object in the image within the detection window.

9. A printing apparatus that detects a predetermined object from an input image and executes printing based on the input image, comprising:
a conversion unit that replaces, with a predetermined representative value, the pixel values of image areas of the input image other than an image area belonging to a color gamut corresponding to the object;
a determination unit that sets a detection window on the input image on which the replacement has been performed, obtains a variation of the pixel values within the detection window, and, when the variation is equal to or greater than a predetermined value, determines the presence or absence of the object in the image within the detection window; and
a print control unit that corrects at least a part of the input image according to correction information determined based on the image within the detection window in which the determination unit has determined that the object is present, and performs printing based on the corrected input image.