JP7581558B1 - Information processing system, information processing method, and program

Info

Publication number: JP7581558B1
Application number: JP2024104043A
Authority: JP
Inventors: 貴博橋本; 総一朗沖; 賢史中島; 慶子碇石; 辰真松木; 祐貴前田; 牧子平田; 裕也今野; 有希木村; 裕也寺田
Original assignee: セーフィー株式会社
Filing date: 2024-06-27
Publication date: 2024-11-12
Anticipated expiration: 2044-06-27

Abstract

[Problem] As the number of locations where cameras are installed increases, there is a growing need to be able to search for specific people from images captured by the cameras.
[Solution] According to one aspect of the present disclosure, an information processing system is provided that displays video captured by a camera on an information terminal. The system searches the videos for video containing a facial image, displays a plurality of videos as the search result, selects a video from the plurality of videos in response to a user's instruction, displays the facial image contained in the selected video together with that video, and registers the facial image in response to the user's instruction.
[Selected figure] Figure 5

Description

本発明は、情報処理システム、情報処理方法及びプログラムに関する。 The present invention relates to an information processing system, an information processing method, and a program.

特許文献１には表示装置に複数のカメラアイコンを表示させる処理と、表示装置に表示された複数のカメラアイコンの中から、少なくとも二つ以上のカメラアイコンを囲う入力を受け付ける処理と、少なくとも二つ以上のカメラアイコンを囲う入力を受け付けたことに応じて、少なくとも二つ以上のカメラアイコンのそれぞれに対応するカメラが撮影した映像に基づいて、人物の検索を行う処理と、を一以上のコンピュータに実行させるプログラムが開示されている。 Patent document 1 discloses a program that causes one or more computers to execute the following processes: displaying multiple camera icons on a display device; receiving an input to surround at least two or more camera icons from among the multiple camera icons displayed on the display device; and searching for a person based on images captured by cameras corresponding to each of the at least two or more camera icons in response to receiving the input to surround at least two or more camera icons.

特開２０２３―１２９４２９号公報JP 2023-129429 A

カメラの設置場所の拡大に伴い、カメラで撮影された映像から特定の人物を探したいというニーズが高まっている。 As the number of locations where cameras are installed increases, there is a growing need to search for specific people using footage captured by the cameras.

カメラで撮影された映像を情報端末で表示させる情報処理システムが提供される。情報処理システムは、映像から顔画像を含む映像を検索し、検索の結果として複数の映像を表示し、ユーザーの指示に応じて複数の映像から映像を選択し、選択された映像と共に映像に含まれる顔画像を表示し、ユーザーの指示に応じて顔画像を登録する。 An information processing system is provided that displays video captured by a camera on an information terminal. The information processing system searches for video containing a facial image from among the videos, displays multiple videos as a result of the search, selects a video from the multiple videos in response to a user's instruction, displays the facial image contained in the video together with the selected video, and registers the facial image in response to the user's instruction.

図１は、情報処理システムのシステム構成の一例を示す図である。 FIG. 1 is a diagram illustrating an example of a system configuration of an information processing system.
図２は、サーバー装置のハードウェア構成の一例を示す図である。 FIG. 2 is a diagram illustrating an example of a hardware configuration of the server device.
図３は、クライアント装置のハードウェア構成の一例を示す図である。 FIG. 3 is a diagram illustrating an example of a hardware configuration of the client device.
図４は、監視カメラのハードウェア構成の一例を示す図である。 FIG. 4 is a diagram illustrating an example of a hardware configuration of a surveillance camera.
図５は、情報処理システムにおける顔画像の登録に係る情報処理の一例を示すフローチャートである。 FIG. 5 is a flowchart showing an example of information processing related to registration of a face image in the information processing system.
図６は、一覧表示画面の一例を示す図である。 FIG. 6 is a diagram showing an example of the list display screen.
図７は、カレンダー画面を重畳表示させた一例を示す図である。 FIG. 7 is a diagram showing an example of a calendar screen being superimposed and displayed.
図８は、映像詳細画面の一例を示す図（その１）である。 FIG. 8 is a diagram showing an example of the video detail screen (part 1).
図９は、映像詳細画面の一例を示す図（その２）である。 FIG. 9 is a diagram (part 2) showing one example of the video detail screen.
図１０は、人物登録画面の一例を示す図である。 FIG. 10 is a diagram showing an example of the person registration screen.
図１１は、編集画面の一例を示す図である。 FIG. 11 is a diagram showing an example of the editing screen.
図１２は、ムービークリップの作成画面の一例を示す図である。 FIG. 12 is a diagram showing an example of a movie clip creation screen.
図１３は、情報処理システムにおける顔画像検索に係る情報処理の一例を示すフローチャートである。 FIG. 13 is a flowchart showing an example of information processing related to a face image search in the information processing system.
図１４は、人物検索画面の一例を示す図である。 FIG. 14 is a diagram showing an example of the person search screen.
図１５は、一覧表示画面の一例を示す図である。 FIG. 15 is a diagram showing an example of the list display screen.
図１６は、映像詳細画面の一例を示す図である。 FIG. 16 is a diagram showing an example of the video details screen.
図１７は、通知設定画面の一例を示す図である。 FIG. 17 is a diagram showing an example of the notification setting screen.
図１８は、通知設定画面の一例を示す図である。 FIG. 18 is a diagram showing an example of the notification setting screen.

以下、図面を用いて本発明の実施形態について説明する。以下に示す実施形態中で示した各種特徴事項は、互いに組み合わせることができる。 The following describes embodiments of the present invention with reference to the drawings. The various features shown in the following embodiments can be combined with each other.

＜実施形態１＞
１．システム構成図
図１は、情報処理システム１０００のシステム構成の一例を示す図である。図１に示されるように、情報処理システム１０００は、システム構成として、サーバー装置１００と、クライアント装置１１０と、クライアント装置１２０と、複数の監視カメラ１６０と、を含む。サーバー装置１００と、クライアント装置１１０と、クライアント装置１２０と、監視カメラ１６０とは、ネットワーク１５０を介して通信可能に接続されている。ネットワーク１５０は、ＷＡＮ（ＷｉｄｅＡｒｅａＮｅｔｗｏｒｋ）及びインターネットの何れか又は双方を含んでもよい。ネットワーク１５０はネットワーク１５０に接続される複数の装置同士を有線及び無線を介して通信可能に構成されている。 <Embodiment 1>
1. System Configuration Diagram
FIG. 1 is a diagram showing an example of a system configuration of an information processing system 1000. As shown in FIG. 1, the information processing system 1000 includes, as its system configuration, a server device 100, a client device 110, a client device 120, and a plurality of surveillance cameras 160. The server device 100, the client device 110, the client device 120, and the surveillance cameras 160 are communicatively connected via a network 150. The network 150 may include either or both of a WAN (Wide Area Network) and the Internet. The network 150 is configured to enable communication between a plurality of devices connected to the network 150 via wired and wireless communication.

情報処理システム１０００は、映像検索のサービスを提供するシステムである。 The information processing system 1000 is a system that provides video search services.

サーバー装置１００は、情報処理システム１０００の主な機能を提供する装置であり、以下に示す実施形態の主な処理を実行する。サーバー装置１００は、複数の監視カメラ１６０で撮影された映像をクライアント装置１１０又はクライアント装置１２０で出力（表示）させる装置である。 The server device 100 is a device that provides the main functions of the information processing system 1000, and executes the main processing of the embodiment described below. The server device 100 is a device that outputs (displays) images captured by multiple surveillance cameras 160 on the client device 110 or the client device 120.

監視カメラ１６０は、監視及び／又は記録を目的として設置されるカメラである。明細書では所定の店舗の複数個所それぞれに監視カメラ１６０が設置されているものとして説明を行う。複数個所としては例えば、店舗の複数の出入り口（例えば、入店口東口、入店口西口、出口専用口等）である。なお、これらは例であって監視カメラ１６０が設置される場所を限定するものではない。図１では簡略化のため監視カメラ１６０を３台しか示していないが、２台であってもよいし、４台以上であってもよい。複数の監視カメラ１６０が情報処理システム１０００に含まれていればよい。また明細書では１つの店舗に複数のカメラが設置されているものとして説明を行うが、複数の店舗それぞれに複数のカメラが設置されていてもよい。 The surveillance cameras 160 are cameras installed for the purpose of monitoring and/or recording. In the specification, the surveillance cameras 160 are installed in multiple locations in a specific store. For example, multiple locations may be multiple entrances to the store (e.g., an east entrance, a west entrance, a dedicated exit, etc.). Note that these are examples and do not limit the locations where the surveillance cameras 160 are installed. For simplicity, only three surveillance cameras 160 are shown in FIG. 1, but there may be two, four or more. It is sufficient that multiple surveillance cameras 160 are included in the information processing system 1000. In the specification, the surveillance cameras 160 are installed in one store, but multiple stores may each have multiple cameras.

クライアント装置１１０は、監視カメラ１６０の所有者又は管理者（以下、単に所有者という）が操作する端末装置である。クライアント装置１１０には、後述する図１７及び図１８に示されるような画面が表示される。 The client device 110 is a terminal device operated by the owner or manager (hereinafter simply referred to as the owner) of the surveillance camera 160. The client device 110 displays screens such as those shown in Figs. 17 and 18, which will be described later.

クライアント装置１２０は、監視カメラ１６０が設置される店舗の管理者等が操作する端末装置である。クライアント装置１２０には、後述する図６～図１２、図１４～図１６に示されるような画面が表示される。 The client device 120 is a terminal device operated by a manager or the like of the store in which the surveillance camera 160 is installed. The client device 120 displays screens such as those shown in Figures 6 to 12 and Figures 14 to 16, which will be described later.

ここで、特許請求の範囲に記載の情報処理システムは、複数の装置で構成されてもよいし、一つの装置で構成されてもよい。特許請求の範囲に記載の情報処理システムが一つの装置で構成される場合、その装置の一例はサーバー装置１００である。特許請求の範囲に記載の情報処理システムが複数の装置で構成される場合、複数の装置の例は、サーバー装置１００及び複数の監視カメラ１６０のうち少なくとも１つ以上の監視カメラ１６０及びクライアント装置１１０又はクライアント装置１２０、又はサーバー装置１００の機能を提供する複数のサーバー装置で構成されたクラウドサーバー等である。 Here, the information processing system described in the claims may be composed of multiple devices, or may be composed of one device. When the information processing system described in the claims is composed of one device, an example of that device is the server device 100. When the information processing system described in the claims is composed of multiple devices, an example of the multiple devices is the server device 100 and at least one of the multiple surveillance cameras 160 and the client device 110 or 120, or a cloud server composed of multiple server devices that provide the functions of the server device 100.

２．ハードウェア構成
（１）サーバー装置１００のハードウェア構成
図２は、サーバー装置１００のハードウェア構成の一例を示す図である。
図２に示されるように、サーバー装置１００は、ハードウェア構成として、制御部２１０と、記憶部２２０と、通信部２３０と、内部バス２４０と、を含む。制御部２１０と、記憶部２２０と、通信部２３０と、は内部バス２４０を介して電気的に接続されている。
2. Hardware Configuration
(1) Hardware Configuration of Server Device 100
FIG. 2 is a diagram showing an example of the hardware configuration of the server device 100.
As shown in FIG. 2, the server device 100 includes, as its hardware configuration, a control unit 210, a storage unit 220, a communication unit 230, and an internal bus 240. The control unit 210, the storage unit 220, and the communication unit 230 are electrically connected via the internal bus 240.

制御部２１０は、ＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）等であって、サーバー装置１００の全体を制御する。 The control unit 210 is a CPU (Central Processing Unit) or the like, and controls the entire server device 100.

記憶部２２０は、ＨＤＤ（ＨａｒｄＤｉｓｋＤｒｉｖｅ）、ＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）、ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）、ＳＳＤ（ＳｏｌｉｄＳａｔｅＤｒｉｖｅ）等の何れか、又はこれらの任意の組み合わせであって、プログラム、制御部２１０がプログラムに基づき処理を実行する際に利用するデータ等を記憶する。記憶部２２０は、記憶媒体の一例である。制御部２１０がプログラムに基づき処理を実行する際に利用するデータとしては、例えば、監視カメラ１６０から送られてきた映像データ、検知対象の顔画像等、検知対象に関するデータ、後述する図１７及び図１８等で設定される送信先に関するデータ等がある。 The storage unit 220 is any one of a hard disk drive (HDD), a read only memory (ROM), a random access memory (RAM), a solid state drive (SSD), etc., or any combination of these, and stores programs, data used by the control unit 210 when executing processing based on the programs, etc. The storage unit 220 is an example of a storage medium. Examples of data used by the control unit 210 when executing processing based on the programs include video data sent from the surveillance camera 160, facial images of the detection target, data related to the detection target, data related to the transmission destination set in Figures 17 and 18, etc., described below, etc.

明細書では制御部２１０がプログラムに基づき処理を実行する際に利用するデータは記憶部２２０に記憶されるものとして説明するが、サーバー装置１００と通信可能な他の装置の記憶部等に記憶されていてもよい。データは、制御部２１０が参照又は取得可能であればどの装置の記憶部に記憶されていてもよい。制御部２１０が、記憶部２２０に記憶されているプログラムに基づき、処理を実行することによって、サーバー装置１００の機能及び後述する図５及び図１３に示されるフローチャートの処理等が実現される。なお、この処理は主にサーバー装置１００が実行するものとして説明するが、その代わりにクライアント装置１１０又はクライアント装置１２０、複数の監視カメラ１６０の何れかが実行することとしてもよい。 In the specification, the data used by the control unit 210 when executing processing based on the program is described as being stored in the storage unit 220, but it may also be stored in a storage unit of another device that can communicate with the server device 100. The data may be stored in the storage unit of any device as long as the control unit 210 can refer to or obtain the data. The control unit 210 executes processing based on the program stored in the storage unit 220, thereby realizing the functions of the server device 100 and the processing of the flowcharts shown in Figures 5 and 13 described below. Note that, although this processing is described as being mainly executed by the server device 100, it may instead be executed by either the client device 110 or client device 120, or one of the multiple surveillance cameras 160.

通信部２３０は、サーバー装置１００をネットワーク１５０に接続し、他の装置との通信を司る。 The communication unit 230 connects the server device 100 to the network 150 and manages communication with other devices.

なお、制御部２１０、記憶部２２０、通信部２３０の各ハードウェア構成は１つに限られない。例えば、複数の制御部がサーバー装置１００に含まれてもよい。以下に示すクライアント装置１１０及びクライアント装置１２０も同様である。 Note that the hardware configuration of each of the control unit 210, the storage unit 220, and the communication unit 230 is not limited to one. For example, multiple control units may be included in the server device 100. The same applies to the client device 110 and the client device 120 described below.

（２）クライアント装置１１０／１２０のハードウェア構成
図３は、クライアント装置１１０／１２０のハードウェア構成の一例を示す図である。
図３に示されるように、クライアント装置１１０は、ハードウェア構成として、制御部３１０と、記憶部３２０と、入力部３３０と、出力部３４０と、通信部３５０と、内部バス３６０と、を含む。制御部３１０と、記憶部３２０と、入力部３３０と、出力部３４０と、通信部３５０と、は内部バス３６０を介して電気的に接続されている。
(2) Hardware Configuration of Client Device 110/120
FIG. 3 is a diagram showing an example of the hardware configuration of the client device 110/120.
As shown in FIG. 3, the client device 110 includes, as its hardware configuration, a control unit 310, a storage unit 320, an input unit 330, an output unit 340, a communication unit 350, and an internal bus 360. The control unit 310, the storage unit 320, the input unit 330, the output unit 340, and the communication unit 350 are electrically connected via the internal bus 360.

制御部３１０は、ＣＰＵ等であって、クライアント装置１１０の全体を制御する。 The control unit 310 is a CPU or the like, and controls the entire client device 110.

記憶部３２０は、ＨＤＤ、ＲＯＭ、ＲＡＭ、ＳＳＤ等の何れか、又はこれらの任意の組み合わせであって、プログラム、制御部３１０がプログラムに基づき処理を実行する際に利用するデータ等を記憶する。記憶部３２０は、記憶媒体の一例である。 The storage unit 320 is an HDD, ROM, RAM, SSD, etc., or any combination of these, and stores programs, data used by the control unit 310 when executing processing based on the programs, etc. The storage unit 320 is an example of a storage medium.

明細書では制御部３１０がプログラムに基づき処理を実行する際に利用するデータは記憶部３２０に記憶されるものとして説明するが、クライアント装置１１０と通信可能な他の装置の記憶部等に記憶されていてもよい。データは、制御部３１０が参照又は取得可能であればどの装置の記憶部に記憶されていてもよい。制御部３１０が、記憶部３２０に記憶されているプログラムに基づき、処理を実行することによって、クライアント装置１１０の機能等が実現される。 In the specification, the data used by the control unit 310 when executing processing based on a program is described as being stored in the memory unit 320, but the data may also be stored in a memory unit of another device that can communicate with the client device 110. The data may be stored in the memory unit of any device as long as the control unit 310 can refer to or obtain the data. The functions of the client device 110 are realized by the control unit 310 executing processing based on the program stored in the memory unit 320.

入力部３３０は、操作者の操作に応じて情報をクライアント装置１１０に入力する装置である。入力部３３０は、ユーザーによってなされた操作入力を受け付ける。操作入力は、命令信号として内部バス３６０を介して制御部３１０に転送される。制御部３１０は、必要に応じて、転送された命令信号に基づいて所定の制御及び／又は演算を実行し得る。入力部３３０は、クライアント装置１１０の筐体に含まれるものであってもよいし、外付けされるものであってもよい。例えば、入力部３３０は、出力部３４０と一体となってタッチパネルとして実施されてもよい。入力部３３０がタッチパネルとして実施される場合、ユーザーは、入力部３３０に対してタップ操作、スワイプ操作等を入力することができる。入力部３３０としては、タッチパネルに代えて、スイッチボタン、マウス、トラックパッド、キーボード等が採用することができる。 The input unit 330 is a device that inputs information to the client device 110 in response to an operation by an operator. The input unit 330 accepts an operation input made by a user. The operation input is transferred as a command signal to the control unit 310 via the internal bus 360. The control unit 310 may execute a predetermined control and/or calculation based on the transferred command signal as necessary. The input unit 330 may be included in the housing of the client device 110 or may be externally attached. For example, the input unit 330 may be implemented as a touch panel integrated with the output unit 340. When the input unit 330 is implemented as a touch panel, the user can input a tap operation, a swipe operation, or the like to the input unit 330. Instead of a touch panel, a switch button, a mouse, a track pad, a keyboard, or the like can be adopted as the input unit 330.

出力部３４０は、例えば、ディスプレイに代表される表示部であって、ユーザーが操作可能なグラフィカルユーザインターフェース（ＧｒａｐｈｉｃａｌＵｓｅｒＩｎｔｅｒｆａｃｅ：ＧＵＩ）の画面として情報を出力（表示）する装置である。出力部３４０は、クライアント装置１１０の筐体に含まれるものであってもよいし、外付けされるものであってもよい。より具体的には、出力部３４０は、液晶ディスプレイ、有機ＥＬ（Ｅｌｅｃｔｒｏｎ－Ｌｕｍｉｎｅｓｃｅｎｃｅ）ディスプレイ、又はプラズマディスプレイ等の表示デバイスとして実施され得る。これらの表示デバイスは、クライアント装置１１０の種類に応じて使い分けて実施されることが好ましい。 The output unit 340 is, for example, a display unit such as a display, and is a device that outputs (displays) information as a screen of a graphical user interface (GUI) that can be operated by a user. The output unit 340 may be included in the housing of the client device 110, or may be attached externally. More specifically, the output unit 340 may be implemented as a display device such as a liquid crystal display, an organic EL (Electron-Luminescence) display, or a plasma display. It is preferable that these display devices are implemented by using different devices depending on the type of client device 110.

通信部３５０は、クライアント装置１１０をネットワーク１５０に接続し、他の装置との通信を司る。 The communication unit 350 connects the client device 110 to the network 150 and manages communication with other devices.

クライアント装置１２０のハードウェア構成もクライアント装置１１０のハードウェア構成と同様である。 The hardware configuration of client device 120 is similar to that of client device 110.

なお、明細書ではクライアント装置１１０及びクライアント装置１２０の例としてＰＣ（ＰｅｒｓｏｎａｌＣｏｍｐｕｔｅｒ）を用いて説明する。しかし、クライアント装置１１０及びクライアント装置１２０は、スマートフォン、タブレット型コンピュータ等であってもよい。クライアント装置は、後述するような画面を表示し、画面等を介したユーザー操作を受け付け、サーバー装置１００に情報を送信することができればどのような装置であってもよい。 In the specification, the client device 110 and the client device 120 are described using a PC (Personal Computer) as an example. However, the client device 110 and the client device 120 may be a smartphone, a tablet computer, or the like. The client device may be any device that can display a screen as described below, accept user operations via the screen, or the like, and transmit information to the server device 100.

（３）監視カメラ１６０のハードウェア構成
図４は、監視カメラ１６０のハードウェア構成の一例を示す図である。
図４に示されるように、監視カメラ１６０は、ハードウェア構成として、制御部４１０と、記憶部４２０と、撮影部４３０と、通信部４４０と、内部バス４５０と、を含む。制御部４１０と、記憶部４２０と、撮影部４３０と、通信部４４０と、は内部バス４５０を介して電気的に接続されている。
(3) Hardware Configuration of Surveillance Camera 160
FIG. 4 is a diagram showing an example of the hardware configuration of the surveillance camera 160.
As shown in FIG. 4, the surveillance camera 160 includes, as its hardware configuration, a control unit 410, a storage unit 420, an image capturing unit 430, a communication unit 440, and an internal bus 450. The control unit 410, the storage unit 420, the image capturing unit 430, and the communication unit 440 are electrically connected via the internal bus 450.

制御部４１０は、ＣＰＵ等であって、監視カメラ１６０の全体を制御する。 The control unit 410 is a CPU or the like, and controls the entire surveillance camera 160.

記憶部４２０は、ＨＤＤ、ＲＯＭ、ＲＡＭ、ＳＳＤ等の何れか、又はこれらの任意の組み合わせであって、プログラム、制御部４１０がプログラムに基づき処理を実行する際に利用するデータ等を記憶する。記憶部４２０は、記憶媒体の一例である。 The storage unit 420 is an HDD, ROM, RAM, SSD, or any combination of these, and stores programs, data used by the control unit 410 when executing processing based on the programs, and the like. The storage unit 420 is an example of a storage medium.

明細書では制御部４１０がプログラムに基づき処理を実行する際に利用するデータは記憶部４２０に記憶されるものとして説明するが、監視カメラ１６０と通信可能な他の装置の記憶部等に記憶されていてもよい。データは、制御部３１０が参照又は取得可能であればどの装置の記憶部に記憶されていてもよい。制御部４１０が、記憶部４２０に記憶されているプログラムに基づき、処理を実行することによって、監視カメラ１６０の機能等が実現される。 In the specification, the data used by the control unit 410 when executing processing based on a program is described as being stored in the memory unit 420, but the data may also be stored in a memory unit of another device that can communicate with the surveillance camera 160. The data may be stored in the memory unit of any device as long as the control unit 310 can refer to or obtain the data. The functions of the surveillance camera 160 are realized by the control unit 410 executing processing based on the program stored in the memory unit 420.

撮影部４３０は、被写体を撮影するカメラである。カメラには例えばイメージセンサー、レンズ及びＩＲカットフィルター等が含まれる。 The photographing unit 430 is a camera that photographs a subject. The camera includes, for example, an image sensor, a lens, and an IR cut filter.

通信部４４０は、監視カメラ１６０をネットワーク１５０に接続し、他の装置との通信を司る。 The communication unit 440 connects the surveillance camera 160 to the network 150 and handles communication with other devices.

３．情報処理
以下、実施形態１の情報処理を説明する。
3. Information Processing
The information processing of Embodiment 1 is described below.

（１）処理の概要
（１－１）顔画像登録
制御部２１０は、情報処理システム１０００に含まれる監視カメラ１６０で撮影されている映像から顔画像が含まれる映像を検索し、顔画像が含まれる複数の映像を検索結果として出力（表示）する。制御部２１０は、検索結果に含まれる複数の映像からユーザーの選択操作に応じて映像を選択し、選択された映像と共に映像に含まれる人物の顔画像を出力（表示）する。制御部２１０は、ユーザーの指示に応じて顔画像を登録する。
(1) Overview of Processing
(1-1) Facial Image Registration
The control unit 210 searches the video being captured by the surveillance cameras 160 included in the information processing system 1000 for video containing facial images, and outputs (displays) multiple videos containing facial images as the search result. The control unit 210 selects a video from the multiple videos in the search result in response to a user's selection operation, and outputs (displays) the facial image of a person contained in that video together with the selected video. The control unit 210 registers the facial image in response to a user's instruction.

このような処理を実行することによって、簡単に映像から人物を探し出し、気になる人物の顔画像を情報処理システム１０００に登録することができる。 By performing this type of processing, you can easily find people from the video and register facial images of people of interest in the information processing system 1000.

（１－２）顔画像検索
制御部２１０は、登録された複数の顔画像を出力（表示）する。制御部２１０は、ユーザーの選択操作に応じて複数の顔画像から顔画像を選択し、選択された顔画像を含む映像を検索し、選択された顔画像が含まれる複数の映像を検索結果として出力（表示）する。制御部２１０は、検索結果に含まれる複数の映像からユーザーの選択操作に応じて映像を選択し、選択された映像を出力（表示）する。
(1-2) Facial Image Search
The control unit 210 outputs (displays) the multiple registered facial images. The control unit 210 selects a facial image from the multiple facial images in response to a user's selection operation, searches for video containing the selected facial image, and outputs (displays) the multiple videos containing the selected facial image as the search result. The control unit 210 selects a video from the multiple videos in the search result in response to the user's selection operation, and outputs (displays) the selected video.

このような処理を実行することによって、登録された顔画像から選択した顔画像の人物の映像を検出し、出力することができる。 By performing this type of processing, it is possible to detect and output an image of the person whose face image is selected from the registered face images.

（２）処理の詳細
（２－０）前処理
まず、事前の設定として、顔の正面角度と判定する角度、正面角度と十分差があると判定する角度差、信頼のおける画像pixel数、ベストショット取得指定時間、等を設定しておく。これらの設定は、例えばクライアント装置１２０を介したユーザーの指示に応じて設定され、各監視カメラ１６０（又はサーバー装置１００）に記憶される。
(2) Processing Details
(2-0) Pre-processing
First, the following are configured in advance: the angle at which a face is judged to be at the front angle, the angle difference judged to be sufficiently different from the front angle, the number of image pixels considered reliable, the designated time for obtaining best shots, and so on. These settings are made, for example, in response to a user's instruction via the client device 120, and are stored in each surveillance camera 160 (or in the server device 100).
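The advance settings listed above can be pictured as a small configuration record. The following is a minimal sketch only: the class name `BestShotSettings`, the helper `settings_from_request`, the field names, and every default value are hypothetical and do not come from the specification, which names the four kinds of setting but gives no concrete values or data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BestShotSettings:
    """Advance settings for best shot extraction (all defaults are hypothetical)."""

    frontal_max_angle_deg: float = 15.0  # angle within which a face counts as frontal
    min_angle_diff_deg: float = 30.0     # difference judged "sufficiently different"
    reliable_pixel_count: int = 80 * 80  # pixel count above which an image is trusted
    acquisition_timeout_s: float = 10.0  # designated time for obtaining best shots

    def __post_init__(self) -> None:
        # Reject values that could never select a meaningful best shot.
        if not 0.0 < self.frontal_max_angle_deg < 90.0:
            raise ValueError("frontal_max_angle_deg must be between 0 and 90")
        if self.min_angle_diff_deg <= 0.0:
            raise ValueError("min_angle_diff_deg must be positive")
        if self.reliable_pixel_count <= 0:
            raise ValueError("reliable_pixel_count must be positive")
        if self.acquisition_timeout_s <= 0.0:
            raise ValueError("acquisition_timeout_s must be positive")


def settings_from_request(request: dict) -> BestShotSettings:
    """Build settings from a user request, ignoring keys that are not settings."""
    allowed = BestShotSettings.__dataclass_fields__
    return BestShotSettings(**{k: v for k, v in request.items() if k in allowed})
```

Validating at construction time means a camera or server that stores such a record never holds an unusable combination of values.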

そして、本システムでは基本的に各監視カメラ１６０が映像を撮影してサーバー装置１００に送信して記憶する処理を常時実行するが、その際にベストショットの抽出処理を併せて行うものとする。ここで、ベストショットとは、同一人物を異なる所定の角度から撮影した顔画像（又は全身画像）として最適なものであり、後述する図８及び／又は図１０の顔画像（７４０、７５０、７６０、７７０、９１０、９２０、９３０、９４０）に対応する。なお、この処理は主に各監視カメラ１６０側で実行するが、サーバー装置１００側で実行してもよい。カメラ側で処理を実行するメリットは、その分、サーバー側の処理負荷及び利用料金が軽減されることにあるが、その反面カメラ側で該当する処理を行うためのスペックが要求されることになる。 In this system, each surveillance camera 160 basically captures video at all times and transmits it to the server device 100 for storage, and a best shot extraction process is performed alongside this. Here, a best shot is the optimal face image (or full-body image) of the same person captured from one of several different predetermined angles, and corresponds to the face images (740, 750, 760, 770, 910, 920, 930, 940) in FIG. 8 and/or FIG. 10 described below. This process is mainly performed on the surveillance camera 160 side, but may instead be performed on the server device 100 side. The advantage of performing the process on the camera side is that the processing load and usage fees on the server side are reduced accordingly; on the other hand, the camera must have specifications sufficient to perform the process.

ベストショットの抽出処理の詳細は、次の通りである。まず、監視カメラ１６０は、カメラの映像から、人の全身像をトラッキングする。次に、全身像のデータから顔を検出し、追跡する。そして、追跡した顔から角度を検出し、顔の正面角度範囲内に収まっている画像を、ベストショットの1stとし、顔の角度を記録する。ベストショットの1stは、例えば顔の正面画像を検出するAIモデルを用いて、そのスコアが最も高くなったものを採用してもよい。さらに、追跡した顔から角度を検出し続け、ベストショット1stの顔の角度に対して正面角度との角度差以上の顔があったら、ベストショット2ndとして保存する。ベストショットの2ndは、例えばカメラに対する顔の左向き（又は右向き）画像を検出するAIモデルを用いて、そのスコアが最も高くなったものを採用してもよい。さらに、ベストショット1st,2nd両方と角度差がある画像をベストショット3rdとして保存する。ベストショットの3rdは、例えばカメラに対する顔の右向き（又は左向き）画像を検出するAIモデルを用いて、そのスコアが最も高くなったものを採用してもよい。さらに、ベストショット1st,2nd,3rdのうち少なくとも何れか１つ以上に対応する人物について、その人物の全身画像（又はバストアップ画像等）をベストショット4thとして保存してもよい。バストアップ画像とは、人物の上半身が写った画像のことである。ベストショットの4thは、例えば全身画像（又はバストアップ画像等）を検出するAIモデルを用いて、そのスコアが最も高くなったものを採用してもよい。なお、追跡した顔を検出し続け、正面角度範囲内に収まっている画像で、信頼のおける画像pixel数を上回る画像があった場合、ベストショット1stを入れ替えてもよい。同様に、2nd,3rd,4thも入れ替えてもよい。また、ベストショット取得の指定時間を超えた場合、ベストショット抽出を停止してもよい。また、ベストショット1st,2nd,3rd,4thは夫々の観点で最もスコアが高かったものを採用するとしたが、これに加え夫々において次点以降のスコアのものも含めた複数採用することとしてもよい。そして、トラッキングが終了した時点で、ベストショット1st,2nd,3rd,4thを、夫々の画像を撮影したカメラ及び日時を示すメタデータとともに、サーバー装置１００に送信する。監視カメラ１６０は、ベストショット1st,2nd,3rd,4thと、夫々の画像を撮影したカメラ及び日時を示すメタデータとを受信したうえで、それら画像同士／各画像とメタデータとを対応付けて記憶する。なお、この段階で、ベストショット1st,2nd,3rdに基づいて顔画像の平均処理を行い、顔の特徴量を計算したうえで、画像及びメタデータと対応付けて記憶しておいてもよい。以下、これら対応付けられた情報のセット（ベストショット1st,2nd,3rd,4th／夫々の画像を撮影したカメラ及び日時を示すメタデータ／顔の特徴量）を、ベストショット情報と呼ぶ。ベストショット情報は、人物ごと／シーンごとに、互いに異なる複数のベストショット情報として記憶される。 The details of the best shot extraction process are as follows. First, the surveillance camera 160 tracks the whole body image of a person from the camera image. Next, a face is detected from the data of the whole body image and tracked. Then, an angle is detected from the tracked face, and an image that falls within the range of the front angle of the face is set as the 1st best shot, and the angle of the face is recorded. The 1st best shot may be the one with the highest score, for example, using an AI model that detects a front image of a face. 
Furthermore, angle detection continues on the tracked face, and if a face is found whose angle differs from that of the 1st best shot by at least the configured angle difference from the front angle, it is saved as the 2nd best shot. The 2nd best shot may be, for example, the image with the highest score from an AI model that detects images of a face turned left (or right) relative to the camera. An image whose angle differs from both the 1st and 2nd best shots is then saved as the 3rd best shot. The 3rd best shot may be, for example, the image with the highest score from an AI model that detects images of a face turned right (or left) relative to the camera. In addition, for a person corresponding to at least one of the 1st, 2nd, and 3rd best shots, a full-body image (or a bust-up image, etc.) of that person may be saved as the 4th best shot. A bust-up image is an image showing the upper half of a person's body. The 4th best shot may be, for example, the image with the highest score from an AI model that detects full-body images (or bust-up images, etc.). If, while the tracked face continues to be detected, an image is found that falls within the front angle range and whose pixel count exceeds the reliable image pixel count, the 1st best shot may be replaced with it. The 2nd, 3rd, and 4th best shots may be replaced in the same way. If the designated time for obtaining best shots is exceeded, best shot extraction may be stopped. Although the 1st, 2nd, 3rd, and 4th best shots are described as the images with the highest score from each perspective, multiple images per perspective, including those with the next-highest scores, may also be adopted. When tracking ends, the 1st, 2nd, 3rd, and 4th best shots are sent to the server device 100 together with metadata indicating the camera that captured each image and the date and time of capture.
After receiving the 1st, 2nd, 3rd, and 4th best shots and the metadata indicating the camera and the date and time of capture for each image, the surveillance camera 160 stores them with the images associated with one another and each image associated with its metadata. At this stage, the face images may be averaged based on the 1st, 2nd, and 3rd best shots, a facial feature value may be calculated, and that value may be stored in association with the images and metadata. Hereinafter, this set of associated information (the 1st, 2nd, 3rd, and 4th best shots; the metadata indicating the camera and the date and time of capture for each image; and the facial feature value) is referred to as best shot information. Best shot information is stored as multiple, mutually distinct pieces of best shot information, one for each person and each scene.
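The angle-based selection of the 1st, 2nd, and 3rd best shots can be sketched roughly as follows. This is an illustration under stated assumptions, not the specification's implementation: the `FaceObservation` record, the use of a single signed yaw angle, the function name `select_best_shots`, and all threshold values are hypothetical. The AI-model scoring and the 4th (full-body) best shot are omitted, and replacement of the 2nd and 3rd best shots is not modelled.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class FaceObservation:
    """One tracked face sample (hypothetical structure)."""

    yaw_deg: float      # signed face angle relative to the camera; 0 means frontal
    pixel_count: int    # number of pixels in the face crop
    timestamp_s: float  # seconds since tracking of this person started


def select_best_shots(
    observations: Iterable[FaceObservation],
    frontal_max_deg: float = 15.0,   # hypothetical front angle range
    min_diff_deg: float = 30.0,      # hypothetical "sufficient" angle difference
    reliable_pixels: int = 6400,     # hypothetical reliable pixel count
    timeout_s: float = 10.0,         # hypothetical designated acquisition time
) -> Tuple[Optional[FaceObservation], Optional[FaceObservation], Optional[FaceObservation]]:
    """Return (1st, 2nd, 3rd) best shots chosen by face angle."""
    first = second = third = None
    for obs in observations:
        if obs.timestamp_s > timeout_s:
            break  # designated time exceeded: stop extracting
        if abs(obs.yaw_deg) <= frontal_max_deg:
            # Frontal sample: adopt it as the 1st best shot, or replace the
            # current one when this sample is reliable and larger.
            if first is None or (
                obs.pixel_count > reliable_pixels
                and obs.pixel_count > first.pixel_count
            ):
                first = obs
            continue
        if first is None:
            continue  # angle differences are measured against the 1st best shot
        if abs(obs.yaw_deg - first.yaw_deg) < min_diff_deg:
            continue  # not different enough from the current 1st best shot
        if second is None:
            second = obs
        elif third is None and abs(obs.yaw_deg - second.yaw_deg) >= min_diff_deg:
            third = obs  # differs from both the 1st and the 2nd
    return first, second, third
```

In this sketch the angle checks use whichever 1st best shot is current when a sample arrives. A fuller version would need to decide whether a later replacement of the 1st best shot should trigger re-checking of the 2nd and 3rd.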

（２－１）顔画像登録
図５は、情報処理システム１０００における顔画像の登録に係る情報処理の一例を示すフローチャートである。なお、この処理は主にサーバー装置１００が実行するものとして説明するが、その代わりにクライアント装置１１０又はクライアント装置１２０、複数の監視カメラ１６０の何れかが実行することとしてもよい。
(2-1) Facial Image Registration
FIG. 5 is a flowchart showing an example of information processing related to registration of a face image in the information processing system 1000. Note that, although this processing is described as being executed mainly by the server device 100, it may instead be executed by the client device 110, the client device 120, or any of the multiple surveillance cameras 160.

ステップＳ５１０において、制御部２１０は、情報処理システム１０００に含まれる監視カメラ１６０で撮影された映像から顔画像が含まれる映像を検索する。具体的には、まず、前述した前処理にて記憶された複数のベストショット情報を読み出す。そして、夫々のベストショット情報について、ベストショット1st(又は4th）を撮影したカメラ及び日時を示すメタデータに基づいて、映像を特定する。 In step S510, the control unit 210 searches the video captured by the surveillance cameras 160 included in the information processing system 1000 for video containing a facial image. Specifically, the control unit 210 first reads out the multiple pieces of best shot information stored in the pre-processing described above. Then, for each piece of best shot information, it identifies the video based on the metadata indicating the camera that captured the 1st (or 4th) best shot and the date and time of capture.
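Identifying a video from best shot metadata amounts to a lookup by camera and capture time. The sketch below assumes, hypothetically, that stored video is indexed as segments with a camera identifier and a start and end time. The names `VideoSegment`, `BestShotMeta`, and `find_video` are illustrative and do not appear in the specification.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Sequence


@dataclass(frozen=True)
class VideoSegment:
    """A stored stretch of video from one camera (hypothetical index entry)."""

    camera_id: str
    start: datetime
    end: datetime
    path: str


@dataclass(frozen=True)
class BestShotMeta:
    """Metadata attached to a best shot: capturing camera and capture time."""

    camera_id: str
    captured_at: datetime


def find_video(meta: BestShotMeta, segments: Sequence[VideoSegment]) -> Optional[VideoSegment]:
    """Return the segment from the same camera whose time span covers the shot."""
    for seg in segments:
        if seg.camera_id == meta.camera_id and seg.start <= meta.captured_at < seg.end:
            return seg
    return None  # no stored video matches this best shot
```

A linear scan keeps the example short. With many segments, an index keyed by camera and sorted by start time would be the more natural choice.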

Note that step S510 may also be implemented as follows. Video captured by the multiple surveillance cameras 160 included in the information processing system 1000 is transmitted to the server device 100 via the network 150 and stored in a predetermined storage area such as the storage unit 220. The video is stored in a manner that makes it possible to identify which surveillance camera 160 captured it. For example, a separate folder (directory) may be created in the storage unit 220 or the like for each surveillance camera and the video stored in that folder, identification information identifying the surveillance camera may be written in the file name of the video, or such identification information may be written in the metadata of the video file. The video is also stored in a manner that makes it possible to identify when it was captured. For example, a separate folder (directory) may be created in the storage unit 220 or the like for each date and the video stored in that folder, the date and time of capture may be written in the file name of the video, or the date and time of capture may be written in the metadata of the video file. The control unit 210 searches the video stored in the predetermined storage area, such as the storage unit 220, for video that contains a facial image, and obtains the search results.
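As a non-limiting illustration, the folder-and-file-name variant of this storage scheme might be sketched as follows. The path layout `<camera_id>/<YYYY-MM-DD>/<HHMMSS>.mp4` and the names `StoredVideo` and `parse_stored_video` are assumptions made for this example and do not appear in the disclosure.

```python
from dataclasses import dataclass
from datetime import datetime
from pathlib import PurePosixPath


@dataclass(frozen=True)
class StoredVideo:
    camera_id: str         # identifies the surveillance camera that captured the video
    captured_at: datetime  # date and time of capture
    path: str


def parse_stored_video(path: str) -> StoredVideo:
    """Recover the camera and capture time from a hypothetical path layout."""
    parts = PurePosixPath(path).parts
    if len(parts) < 3:
        raise ValueError(f"unexpected path layout: {path}")
    # Assumed layout: .../<camera_id>/<YYYY-MM-DD>/<HHMMSS>.mp4
    camera_id, date_part, file_name = parts[-3], parts[-2], parts[-1]
    time_part = PurePosixPath(file_name).stem
    captured_at = datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H%M%S")
    return StoredVideo(camera_id, captured_at, path)
```

Writing the same two values into the metadata of the video file instead, as the description also permits, would leave the rest of the search unchanged.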

For example, the control unit 210 inputs a video into a trained model. This trained model has been trained with video as the input data and, as the output data, whether or not the video contains a facial image and, if it does, time information from when the person in the facial image appears until that person disappears. Furthermore, for a video that contains multiple facial images, the trained model has been trained with, as output data, time information from appearance to disappearance for the person in each of the facial images.
The control unit 210 obtains from the trained model, as the search result, information on whether or not the video contains a facial image and, if it does, time information from when the person in the facial image appears until that person disappears.
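The shape of such a search result might be sketched as follows. The trained model itself is not specified in the description, so it is represented here by a caller-supplied function `detect`; that function and the names `Appearance`, `SearchHit`, and `search_videos` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Appearance:
    person_index: int   # distinguishes multiple people in the same video
    appear_s: float     # seconds from the start of the video when the person appears
    disappear_s: float  # seconds from the start of the video when the person disappears


@dataclass(frozen=True)
class SearchHit:
    video_id: str
    appearances: List[Appearance]


def search_videos(
    video_ids: Iterable[str],
    detect: Callable[[str], List[Appearance]],
) -> List[SearchHit]:
    """Keep only the videos for which the (assumed) model reports a face."""
    hits = []
    for video_id in video_ids:
        appearances = detect(video_id)  # an empty list means no facial image
        if appearances:
            hits.append(SearchHit(video_id, list(appearances)))
    return hits
```

Returning one `Appearance` per person is one way to carry the per-person time information that the description attributes to videos containing multiple facial images.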

In step S520, based on the search results, the control unit 210 generates a list display screen that includes, for each video containing a facial image, a reduced image (hereinafter also simply called a thumbnail) taken at the moment the facial image appears. The control unit 210 transmits the generated list display screen to the client device 120 and controls it so that it is displayed on the output unit of the client device 120.

FIG. 6 is a diagram showing an example of a list display screen 600. The list display screen 600 shown in FIG. 6 includes a person video tag 610 and a person video area 670. The person video tag 610 is included in a side menu of the screen. For a screen that includes a side menu, the term "screen" may refer either to the whole including the side menu or to the portion excluding the side menu. The same applies to the screens described below. When the person video tag 610 is selected, the processes of steps S510 and S520 described above are executed, and thumbnails of the search results are displayed in the person video area 670. The example of FIG. 6 shows the list display screen 600 with the person video tag 610 selected.

The device section 620 of the person video area 670 displays a list of thumbnails of people detected in the video captured by the surveillance camera 160 at the east entrance. Each thumbnail may display the person's tracking start time (not shown). FIG. 6 shows three thumbnails in the list of people detected in the video captured by the surveillance camera 160, but the number is not limited to three. The number of thumbnails displayed may be varied according to the size of the screen. In addition, when the "View more" button is selected, the control unit 210 performs control so that further thumbnails of people detected in the video captured by the surveillance camera 160 are displayed.

The device section 630 of the person video area 670 displays a list of thumbnails of people detected in the video captured by the surveillance camera 160 at the west entrance.
The device section 640 of the person video area 670 displays a list of thumbnails of people detected in the video captured by the surveillance camera 160 at the exit-only door.
The device sections on the screen are those of devices (surveillance cameras 160) that have thumbnails, and they are displayed from the top of the list display screen 600 in descending order of the tracking start time shown in the thumbnails, newest first. The same applies to the list display screen 1300 in FIG. 15 described later.
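One possible reading of this ordering rule is sketched below. Two points are assumptions of the example rather than statements in the description: a device section is ranked by its most recent tracking start time, and devices without thumbnails are omitted rather than listed last. The names `DeviceSection` and `order_sections` are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class DeviceSection:
    device_name: str
    # Tracking start time of each thumbnail in this section.
    tracking_starts: List[datetime] = field(default_factory=list)


def order_sections(sections: List[DeviceSection]) -> List[DeviceSection]:
    """Sections with thumbnails, newest tracking start time first."""
    with_thumbnails = [s for s in sections if s.tracking_starts]
    return sorted(with_thumbnails, key=lambda s: max(s.tracking_starts), reverse=True)
```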

On the list display screen 600, the multiple videos output as search results are output (displayed) as reduced images, grouped by the imaging device that captured each video.
The process of step S520 is an example of a process of outputting (displaying), as search results, a reduced image of each of multiple videos that contain a facial image.

The list display screen 600 includes a keyword input area 650 and a period input area 660. The keyword input area 650 is an area in which a keyword can be entered. When a keyword is entered in the keyword input area 650, the control unit 210 uses the keyword to search the device names of the surveillance cameras 160, the serial numbers that identify the surveillance cameras 160, the device tags set for the surveillance cameras 160, and the like. The control unit 210 then searches the video of each matching surveillance camera 160 for facial images and displays the search results in the person video area 670. This process is an example of a process of searching for video containing a facial image based on search conditions entered via the screen.
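The keyword lookup over device name, serial number, and device tags might be sketched as follows. The description does not state the matching rule, so case-insensitive substring matching, and treating an empty keyword as matching every camera, are assumptions of this example; `Camera` and `match_cameras` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Camera:
    device_name: str
    serial_number: str
    device_tags: Tuple[str, ...] = ()


def match_cameras(cameras: List[Camera], keyword: str) -> List[Camera]:
    """Return cameras whose name, serial number, or any tag contains the keyword."""
    needle = keyword.strip().casefold()
    if not needle:
        return list(cameras)  # assumption: no keyword means no narrowing

    def matches(camera: Camera) -> bool:
        fields = (camera.device_name, camera.serial_number, *camera.device_tags)
        return any(needle in value.casefold() for value in fields)

    return [camera for camera in cameras if matches(camera)]
```

Only the video of the cameras returned here would then be searched for facial images.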

The period input area 660 is an area in which a period can be entered. When the period input area 660 is selected, the control unit 210 displays a calendar screen superimposed on the list display screen 600.
FIG. 7 is a diagram showing an example in which a calendar screen 1510 is displayed in a superimposed manner.
The calendar screen 1510 is a screen that displays dates and days of the week in tabular form, and is configured so that the start time and end time of the video to be searched can be selected. When a period (a start time and an end time) is entered on the calendar screen 1510, the control unit 210 performs the search on the video captured during the entered period and displays the search results in the person video area 670. This process is an example of a process of searching for video containing a facial image based on search conditions entered via the screen. In the example of FIG. 7, the period from March 29, 2024 to April 16, 2024 is selected. Note that the period input area 660 may be configured so that a period such as "March 29, 2024 or later" can be specified, that is, a period from a certain point in time onward. Similarly, the period input area 660 may be configured so that a period such as "April 16, 2024 or earlier" can be specified, that is, a period up to a certain point in time.
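The three ways of specifying a period (both ends, start only, end only) can be captured with optional bounds, as in the following sketch. Treating both bounds as inclusive is an assumption of the example, and `in_period` is a hypothetical name.

```python
from datetime import datetime
from typing import Optional


def in_period(
    captured_at: datetime,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> bool:
    """True if the capture time lies within the period; None leaves that side open."""
    if start is not None and captured_at < start:
        return False
    if end is not None and captured_at > end:
        return False
    return True
```

With `start=datetime(2024, 3, 29)` and no `end`, this corresponds to the "March 29, 2024 or later" case mentioned above.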

As described above, the list display screen 600 is configured so that at least one of a period and a keyword can be entered as a search condition. The list display screen 600 may also be configured so that the imaging device that captured the video can be entered as a search condition.

When one of the thumbnails in the person video area 670 of FIG. 6 is selected, thumbnail selection information including identification information that identifies the selected thumbnail is transmitted from the client device 120 to the server device 100.

In step S530 of FIG. 5, the control unit 210 determines whether or not thumbnail selection information has been received. If thumbnail selection information has been received, the control unit 210 advances the process to step S540; if not, it returns the process to step S520.

In step S540, the control unit 210 selects the video corresponding to the thumbnail identified by the thumbnail selection information.
In step S550, the control unit 210 generates a video detail screen that includes the facial images of the person appearing in the selected video together with a timeline of that video. More specifically, for the video selected in S540, the control unit 210 identifies the corresponding best shots 1st, 2nd, 3rd, and 4th based on the best shot information, and generates a video detail screen for displaying them together with the timeline of the video. The control unit 210 transmits the generated video detail screen to the client device 120 and controls it so that it is displayed on the output unit of the client device 120.

FIG. 8 is a diagram (part 1) showing an example of a video detail screen 700. The video detail screen 700 includes a display area 790. The display area 790 displays the video captured by the corresponding surveillance camera 160. The video includes a person 735, who is the person appearing in the video selected via the thumbnail. The video also includes a background 736, which is omitted from FIG. 8 for simplicity. The background 736 is, for example, the streetscape, room interior, or natural scenery captured by the corresponding surveillance camera 160. Because the display area 790 includes both the person 735 and the background 736, the operator can grasp what the person is doing and in what circumstances.

The video detail screen 700 includes a timeline 710. The timeline 710 indicates the time of the video displayed in the display area 790. More specifically, the timeline displays the time axis from the start of the video to its end as a strip. The timeline of this embodiment is graduated with the elapsed time from the start point. However, the timeline may instead be graduated with the time of day or the date and time. The playback position display object 720 is an object (GUI component) that indicates the capture date and time of the video displayed in the display area 790. When the playback position display object 720 is at the far left, playback is at the start position of the video; when it is at the far right, playback is at the end position. The playback position display object 720 thus indicates the position on the timeline 710 of the video being played in the display area 790. The operator of the client device 120 (for example, the manager of the store where the surveillance camera 160 is installed) can also operate the playback position display object 720 to specify the capture date and time of the video displayed in the display area 790. Flag 730 is a flag, displayed on the timeline 710, that indicates the period during which a person appears in the video. The leftmost position of flag 730 indicates the time at which the person appeared, and its rightmost position indicates the time at which the person disappeared.
The control unit 210 outputs a timeline for the selected video. The timeline is output in a manner that distinguishes the portions in which the person of the registered facial image appears from the portions in which that person does not appear.
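Placing a flag such as flag 730 on the timeline amounts to mapping the appearance and disappearance times to positions along the strip. A minimal sketch follows, assuming times are given in seconds from the start of the video and positions are expressed as fractions of the timeline width; `flag_span` is a hypothetical name.

```python
from typing import Tuple


def flag_span(appear_s: float, disappear_s: float, duration_s: float) -> Tuple[float, float]:
    """Left and right edges of the flag as fractions (0.0 to 1.0) of the timeline."""
    if duration_s <= 0:
        raise ValueError("video duration must be positive")
    # Clamp so that a flag never extends beyond the ends of the timeline.
    left = min(max(appear_s / duration_s, 0.0), 1.0)
    right = min(max(disappear_s / duration_s, 0.0), 1.0)
    return left, right
```

For a 120-second video in which the person appears at 30 seconds and disappears at 90 seconds, the flag would occupy the middle half of the timeline.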

The facial image display area 795 displays facial images of the person appearing in the video displayed in the display area 790. In the example of FIG. 8, the facial image display area 795 includes facial image 740, facial image 750, facial image 760, and facial image 770.
Facial image 740 is a facial image facing the front with respect to the surveillance camera 160. It is best shot 1st for the displayed video. A facial image in which both eyes are visible may be regarded as a facial image facing the front with respect to the surveillance camera 160. A facial image in which the face is oriented toward the surveillance camera 160 may also be regarded as facing the front, as may a facial image in which the line of sight is directed toward the surveillance camera 160.
Facial image 750 is a facial image facing left with respect to the surveillance camera 160. It is best shot 2nd for the displayed video. A facial image in which only the right eye, rather than both eyes, is visible may be regarded as a facial image facing left with respect to the surveillance camera 160. An image in which the right side of the face is captured and the left side is not may also be regarded as a facial image facing left with respect to the surveillance camera 160.
Facial image 760 is a facial image facing right with respect to the surveillance camera 160. It is best shot 3rd for the displayed video. A facial image in which only the left eye, rather than both eyes, is visible may be regarded as a facial image facing right with respect to the surveillance camera 160. An image in which the left side of the face is captured and the right side is not may also be regarded as a facial image facing right with respect to the surveillance camera 160.
Note that face orientations other than front, left, and right, such as diagonally left, diagonally right, or backward, may be used additionally or alternatively.
Facial image 770 is a facial image that includes the person's whole body. It is best shot 4th for the displayed video. "Whole body" means that the image includes not only the face but also the person's torso, hands, legs, and so on.
Note that, besides the whole body, a bust shot (upper body), for example, may be used additionally or alternatively.

The multiple facial images in the facial image display area 795 include at least one of a facial image facing the front, a facial image facing right, a facial image facing left, and a facial image that includes the whole body.
It can also be said that the facial image display area 795 contains multiple facial images of the same person captured from mutually different angles.

Of the images of the person included in the selected video, the control unit 210 outputs (displays), together with the video, those images in which the person's face can be identified.

The control unit 210 may display the facial images in the facial image display area 795 at the time corresponding to the leftmost position of flag 730, and may hide (or delete) the facial images in the facial image display area 795 at the time corresponding to the rightmost position of flag 730. In other words, the control unit 210 may display a person's facial images in the facial image display area 795 for as long as that person appears in the video displayed in the display area 790. By executing such processing, the control unit 210 can display the person's facial images in a manner that makes it clear that they belong to the person included in the selected video.
Examples of a manner that makes it clear that the facial images belong to the person included in the selected video include displaying the display area 790 (or the person 735) and the facial image display area 795 on the same screen, and displaying the facial image display area 795 superimposed on the display area 790.

The process of step S550 is an example of a process of outputting (displaying) a screen that includes the video selected in the person video area 670 and the timeline of that video.
The process of step S550 is also an example of a process of outputting (displaying), together with the video of the thumbnail selected in the person video area 670, multiple facial images of the same person as the person included in that video.

The facial image display area 795 further includes a "Register Person" button 775. If the person appearing in the video displayed in the display area 790 is not yet registered in the information processing system 1000, the control unit 210 outputs the facial image display area 795 with the "Register Person" button 775 included. Put another way, in that case the control unit 210 activates the "Register Person" button 775 and displays it in the facial image display area 795. Activating a button means making the button selectable.
If the facial image of the person included in the video has not yet been registered, the control unit 210 outputs (displays) a registration button for registering the facial image in a manner that the user can select.

If the person appearing in the video displayed in the display area 790 is already registered in the information processing system 1000, the control unit 210 outputs the facial image display area 795 with a "Registered" button 810 included.
FIG. 9 is a diagram (part 2) showing an example of the video detail screen 700. The "Registered" button 810 is a button indicating that the person whose facial images are included in the facial image display area 795 has already been registered in the information processing system 1000 as a person to be searched for. Note that this indication is not limited to a button; any GUI component, image, or the like may be used as long as it can show that the person whose facial images are included in the facial image display area 795 is already registered in the information processing system 1000.
The facial image of the person included in the video is output (displayed) in a manner that makes it possible to identify whether or not it has already been registered.
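The display rules for the facial image display area 795 described so far (show the facial images while the person is on screen, and offer either registration or a "registered" indication) might be combined as in the following sketch. The function name, the dictionary keys, and the string values are all hypothetical; the description does not prescribe a data representation.

```python
from typing import Dict, Union


def face_panel_state(
    playback_s: float,
    appear_s: float,
    disappear_s: float,
    is_registered: bool,
) -> Dict[str, Union[bool, str]]:
    """Decide what the facial image display area shows at a playback position."""
    return {
        # Facial images are shown only between the left and right edges of the flag.
        "show_faces": appear_s <= playback_s <= disappear_s,
        # An unregistered person gets a selectable registration button;
        # a registered person gets an indication of that fact instead.
        "button": "registered" if is_registered else "register_person",
    }
```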

When the "Register Person" button 775 in FIG. 8 is selected, the control unit 210 generates a person registration screen. The control unit 210 transmits the generated person registration screen to the client device 120 and controls it so that it is displayed on the output unit of the client device 120.

FIG. 10 is a diagram showing an example of a person registration screen 900. As shown in FIG. 10, the person registration screen 900 includes facial images 910, 920, 930, and 940, which are the same facial images as those displayed in the facial image display area 795. The control unit 210 can swap the positions of the facial images (facial image 910, facial image 920, facial image 930, facial image 940, etc.) based on a drag operation by the operator of the client device 120. The image located at the position of facial image 910 is used as the cover photo. The cover photo is used on screens and the like as the facial image of a saved person, for example as shown in FIG. 14 and FIG. 15 described later.
In addition, the control unit 210 can delete a facial image in response to the operator of the client device 120 clicking the "×" button at the upper right of that facial image (facial image 910, facial image 920, facial image 930, facial image 940, etc.). This prevents a decrease in the accuracy of the search in step S1140, described later, when a facial image of someone other than the intended person has been included among the facial images by mistake. Note that, to improve search accuracy, the facial image need not simply be deleted; instead, for example, the next candidate facial image with the same orientation may be set automatically, or several next-candidate facial images with the same orientation may be listed and the one selected by the user may be set.
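The list operations behind the person registration screen 900 (swapping positions, deleting an image, and treating the first position as the cover photo) might be sketched as follows. The function names are hypothetical, and facial images are represented by plain strings for brevity.

```python
from typing import List, Optional


def swap_faces(faces: List[str], i: int, j: int) -> List[str]:
    """Return a copy of the list with the images at positions i and j exchanged."""
    swapped = list(faces)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return swapped


def delete_face(faces: List[str], index: int) -> List[str]:
    """Return a copy of the list without the image at the given position."""
    return [face for position, face in enumerate(faces) if position != index]


def cover_photo(faces: List[str]) -> Optional[str]:
    """The image in the first position (that of facial image 910) is the cover photo."""
    return faces[0] if faces else None
```

Returning new lists rather than modifying the input keeps the original candidates available, which would be convenient if a deleted image is to be replaced by the next candidate with the same orientation.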

The person registration screen 900 also includes a GUI component 950 showing a plus sign. When the GUI component 950 is selected, the control unit 210 generates a screen for adding a facial image. The control unit 210 transmits the generated screen to the client device 120 and controls it so that it is displayed on the client device 120. On the screen for adding a facial image, the control unit 210 displays, in a selectable manner, other facial images of the same person as the one shown in facial image 910, facial image 920, facial image 930, and facial image 940. Via this screen, the operator of the client device 120 can select another facial image of the same person and add it. Note that, as another way of adding a facial image, it may be made possible, for example, to upload a facial image prepared separately on the client device 120.

Furthermore, when any one of facial image 910, facial image 920, facial image 930, and facial image 940 is selected and a predetermined operation is performed, the control unit 210 generates a screen for changing the facial image. The control unit 210 transmits the generated screen to the client device 120 and controls it so that it is displayed on the client device 120. On the screen for changing the facial image, the control unit 210 displays, in a selectable manner, other facial images of the same person as the selected facial image. Via this screen, the operator of the client device 120 can change the facial image to another facial image of the same person.
These processes are examples of processes of changing a registered facial image, or adding or deleting a facial image, based on a predetermined operation by the user.

On the person registration screen 900 as shown in FIG. 10, the operator of the client device 120 enters, for example, a person name, a person tag, and a memo, and performs an operation to register the person's facial images as a person to be searched for. When the "Save with this content" button is selected on the person registration screen 900, the registration information contained in the person registration screen 900, including the facial images, is transmitted from the client device 120 to the server device 100.

In step S560 of FIG. 5, the control unit 210 determines whether or not registration information has been received. If registration information has been received, the control unit 210 advances the process to step S570; if registration information has not been received, it repeats the process of step S550.

In step S570, the control unit 210 registers the facial images included in the registration information as a person to be searched for (also called a saved person). Specifically, among the multiple pieces of best shot information, the piece corresponding to the registration information received in S560 is made identifiable, for example by setting a flag.

When the operator of the client device 120 performs a predetermined operation on the person registration screen 900 (for example, right-clicking on a facial image and selecting an edit button from the menu that appears), the control unit 210 generates an editing screen on which the facial image can be edited. The control unit 210 transmits the generated editing screen to the client device 120 and controls it so that it is displayed on the output unit of the client device 120.

FIG. 11 is a diagram showing an example of an editing screen 1010. The editing screen 1010 includes the facial image selected on the person registration screen 900. The editing screen 1010 is configured to allow the facial image to be cropped, its rotation and tilt to be corrected, and its brightness and contrast to be adjusted, among other operations.

The video detail screen 700 in FIG. 8 includes a "Create Movie Clip" button 780. When the "Create Movie Clip" button 780 is selected, the control unit 210 generates a movie clip creation screen 1700. The control unit 210 then transmits the movie clip creation screen 1700 to the client device 120 and controls it so that it is displayed on the output unit of the client device 120.

FIG. 12 is a diagram showing an example of the movie clip creation screen 1700. The movie clip creation screen 1700 is configured so that a movie clip can be created. The movie clip function cuts out a portion of a video by specifying the time range to be kept, and saves that portion in the storage unit 220 or the like. Note that a movie clip may include both a video for saving, which is the specified portion of the video, and a time lapse, which is a fast-forward version of the specified portion.

The movie clip creation screen 1700 displays a timeline 1710 of the video. By operating anchor 1740 and anchor 1750 on the timeline 1710, the operator can specify the start time and end time of the portion of video to be cut out. Anchor 1740 is a GUI component that specifies the start time of the portion. Anchor 1750 is a GUI component that specifies the end time of the portion. A still image of the video at the time specified by anchor 1740 is displayed in display area 1720. A still image of the video at the time specified by anchor 1750 is displayed in display area 1730.
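Turning the two anchor positions into a clip range might look like the following sketch. Clamping the anchors to the video length and rejecting an empty or reversed range are assumptions of the example, since the description does not say how such input is handled; `clip_range` is a hypothetical name.

```python
from typing import Tuple


def clip_range(start_anchor_s: float, end_anchor_s: float, duration_s: float) -> Tuple[float, float]:
    """Validated (start, end) of the portion to cut out, in seconds from the video start."""
    # Keep both anchors inside the video.
    start = min(max(start_anchor_s, 0.0), duration_s)
    end = min(max(end_anchor_s, 0.0), duration_s)
    if start >= end:
        raise ValueError("the start anchor must come before the end anchor")
    return start, end
```

The two returned times are also the times whose still images would be shown in display areas 1720 and 1730.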

Based on operations by the operator of the client device 120, the control unit 210 can create a movie clip of the video and save it in a predetermined storage area such as the storage unit 220.

The control unit 210 outputs a timeline for the selected video and, based on a predetermined operation performed via the screen including the timeline, outputs a screen for creating a movie clip of the video.

(2-2) Face Image Search
FIG. 13 is a flowchart showing an example of information processing related to face image search in the information processing system 1000. Although this processing is described as being executed mainly by the server device 100, it may instead be executed by the client device 110, the client device 120, or any of the plurality of surveillance cameras 160.
In step S1110, the control unit 210 generates a screen including a plurality of registered face images (a saved person display screen) in response to a request from the client device 120. Specifically, the control unit 210 extracts, from the plurality of pieces of best shot information, those flagged in step S570, identifies the best shot 1st for each of them, and generates a saved person display screen for displaying them. The control unit 210 transmits the generated saved person display screen to the client device 120.
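The extraction in step S1110 can be illustrated with a short sketch. The record layout below is an assumption made for illustration: `BestShotInfo`, its fields, and `saved_person_entries` are hypothetical names, and the shots are assumed to be stored in rank order so that the first element corresponds to the best shot 1st.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BestShotInfo:
    person_id: str
    registered: bool   # flag assumed to have been set in step S570
    shots: List[str]   # image identifiers, assumed to be ordered best first


def saved_person_entries(infos: List[BestShotInfo]) -> List[Tuple[str, str]]:
    """Return (person_id, top-ranked shot) for every flagged record that has a shot."""
    return [(info.person_id, info.shots[0])
            for info in infos
            if info.registered and info.shots]
```

Records without any stored shot are skipped here; how the actual system treats such records is not stated in the description.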

FIG. 14 is a diagram showing an example of a person search screen 1200. As shown in FIG. 14, the person search screen 1200 includes a saved person tab 1210 and a saved person display area 1250. When the saved person tab 1210 is selected, a display request for the saved person display screen is transmitted from the client device 120 to the server device 100. In response to the display request, a list of the face images of the persons whose face images were registered in step S570 is displayed in the saved person display area 1250. FIG. 14 shows a state in which the saved person tab 1210 is selected. In the saved person display area 1250 of FIG. 14, a face image 1220, a face image 1230, and a face image 1240 are displayed as an example of the list of saved persons.

The operator of the client device 120 (for example, the owner of the surveillance camera 160) selects the face image of the person to be searched for from the face images displayed in the saved person display area 1250. When the operator selects a face image, selection information, which includes identification information identifying the selected face image, is transmitted from the client device 120 to the server device 100.

In step S1120, the control unit 210 determines whether selection information has been received from the client device 120. If selection information has been received, the control unit 210 proceeds to step S1130; if not, it repeats step S1120.

In step S1130, the control unit 210 selects a face image based on the identification information included in the selection information.
In step S1140, the control unit 210 searches for videos captured by the surveillance cameras 160 that include the selected face image. Specifically, the control unit 210 first refers to the best shot information of the face image selected in step S1130 and, using the corresponding face feature amount, identifies the videos in which that person appears by means of an AI model or the like. If the face feature amount was not calculated in the pre-processing described above, it may be calculated in the same manner at this point.
Step S1140 may also be realized as follows. For example, the control unit 210 inputs the face image and a plurality of videos stored in the storage unit 220 or the like into a trained model as input data. This trained model has been trained with a face image and a plurality of videos as input data and, as output data, whether each video includes the same person as the person in the face image and, if it does, time information from when that person appears until that person disappears. Furthermore, the trained model has been trained so that, when the same video includes a plurality of face images, it outputs time information from appearance to disappearance for each of those face images.

The control unit 210 obtains from the trained model, as search results, information on whether each video includes the same person as the person in the face image and, if it does, time information from when that person appears until that person disappears.
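The shape of these search results can be sketched without a trained model. The description does not specify the AI model, so the code below swaps in a simple stand-in: cosine similarity between face feature vectors, followed by merging matching timestamps into appearance intervals. The threshold, the gap tolerance, and all function names are assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

# One detection: (timestamp in seconds, face feature vector).
Detection = Tuple[float, Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def appearance_intervals(query: Sequence[float], detections: List[Detection],
                         threshold: float = 0.8,
                         max_gap_s: float = 2.0) -> List[Tuple[float, float]]:
    """Return (appear, disappear) pairs for detections that match the query."""
    times = sorted(t for t, feat in detections
                   if cosine_similarity(query, feat) >= threshold)
    intervals: List[List[float]] = []
    for t in times:
        if intervals and t - intervals[-1][1] <= max_gap_s:
            intervals[-1][1] = t        # same appearance continues
        else:
            intervals.append([t, t])    # a new appearance starts
    return [(start, end) for start, end in intervals]


def search_videos(query: Sequence[float],
                  videos: Dict[str, List[Detection]],
                  threshold: float = 0.8) -> Dict[str, List[Tuple[float, float]]]:
    """Map each video that contains the person to that person's appearance intervals."""
    results = {}
    for video_id, detections in videos.items():
        intervals = appearance_intervals(query, detections, threshold)
        if intervals:
            results[video_id] = intervals
    return results
```

A video absent from the returned mapping corresponds to the "same person not included" result, and each interval corresponds to the time information from appearance to disappearance.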

In step S1150, the control unit 210 generates, based on the search results, a list display screen containing, for each video that includes the same person as the person in the selected face image, a thumbnail taken at the time the face image appears. The control unit 210 transmits the generated list display screen to the client device 120 and controls it to be displayed on the output unit of the client device 120.

FIG. 15 is a diagram showing an example of a list display screen 1300. The list display screen 1300 shown in FIG. 15 includes a saved person display area 1320. The saved person display area 1320 displays the face image 1310 selected on the person search screen 1200 and a list of thumbnails of the videos that include the same person as the person in the face image 1310. The thumbnails are listed per device. For simplicity, FIG. 15 shows, as a device section 1340, only the thumbnails of videos captured by the surveillance camera 160 at the east store entrance, but the display is not limited to this. As with the list display screen 600 in FIG. 6, a list of thumbnails is displayed for each device section.
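The per-device layout of the thumbnail list amounts to grouping the search results by the camera that captured each video. The following is a minimal sketch of that grouping; the dictionary keys (`device`, `video_id`, `appeared_at`) and the newest-first ordering inside each section are assumptions, not details given in the description.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_device(results: Iterable[dict]) -> Dict[str, List[dict]]:
    """Group search results into device sections.

    Each result is a dict with the hypothetical keys 'device', 'video_id'
    and 'appeared_at' (the time the face image appears in the video).
    """
    sections: Dict[str, List[dict]] = defaultdict(list)
    for result in results:
        sections[result["device"]].append(result)
    for items in sections.values():
        # Newest appearance first; this ordering is an assumption.
        items.sort(key=lambda r: r["appeared_at"], reverse=True)
    return dict(sections)
```

Each key of the returned mapping would correspond to one device section such as the device section 1340.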

When one of the thumbnails is selected on the list display screen 1300 in FIG. 15, thumbnail selection information, which includes identification information identifying the selected thumbnail, is transmitted from the client device 120 to the server device 100.

In step S1160 of FIG. 13, the control unit 210 determines whether thumbnail selection information has been received. If thumbnail selection information has been received, the control unit 210 proceeds to step S1170; if not, it repeats steps S1150 and S1160.

In step S1170, the control unit 210 selects the video of the thumbnail identified by the thumbnail selection information.
In step S1180, the control unit 210 generates a video detail screen that includes the timeline of the selected video together with the face images of the persons included in the video. The control unit 210 transmits the generated video detail screen to the client device 120 and controls it to be displayed on the output unit of the client device 120.

FIG. 16 is a diagram showing an example of a video detail screen 1400. The video detail screen 1400 is substantially the same as the video detail screen 700 shown in FIG. 8 and FIG. 9. However, the face image display area 1410 of the video detail screen 1400 includes neither the "Register Person" button 775 of the video detail screen 700 in FIG. 8 nor the "Already Registered" button 810 in FIG. 9.

(2-3) Destination Setting
In response to a request from the client device 110, the control unit 210 generates a notification setting (also called destination setting) screen and transmits it to the client device 110.

FIG. 17 is a diagram showing an example of a notification setting screen 1800. As shown in FIG. 17, the notification setting screen 1800 includes a notification setting tag 1810 and a notification setting area 1820. When the notification setting tag 1810 is selected, a request to generate the notification setting screen is transmitted from the client device 110 to the server device 100. In response to the generation request, a GUI component 1830 with which a destination can be set for each device is displayed in the notification setting area 1820. The example of FIG. 17 shows a state in which the notification setting tag 1810 is selected.
This process is an example of a process of outputting the notification setting screen 1800 for setting the destination to which, when the person of a registered face image is detected in a video, information indicating that the person has been detected is sent.

The GUI component 1830 is configured so that email destinations and Webhook integration destinations can be set for each device name. The device name may uniquely identify a device, or may identify a tag attached to devices (a device tag). A device tag can be attached to a group of surveillance cameras 160. The notification setting screen 1800 is thus configured so that a destination can be set for each of the plurality of cameras or for each group of cameras.
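One way to picture per-camera and per-group settings is a lookup that merges the destinations set for a device with those set for the device tags it carries. The description does not say how the two levels combine, so the merging rule below is an assumption, and `Destinations`, `resolve_destinations`, and the example addresses are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Destinations:
    emails: List[str] = field(default_factory=list)
    webhooks: List[str] = field(default_factory=list)


def resolve_destinations(device: str,
                         device_tags: Dict[str, List[str]],
                         per_device: Dict[str, Destinations],
                         per_tag: Dict[str, Destinations]) -> Destinations:
    """Merge the destinations set for a device and for each tag attached to it."""
    merged = Destinations()
    sources = [per_device.get(device)]
    sources += [per_tag.get(tag) for tag in device_tags.get(device, [])]
    for source in sources:
        if source is None:
            continue
        for email in source.emails:
            if email not in merged.emails:      # keep each destination once
                merged.emails.append(email)
        for url in source.webhooks:
            if url not in merged.webhooks:
                merged.webhooks.append(url)
    return merged
```

For example, a camera named `cam-east` tagged `entrances` would receive both its own email destination and any Webhook destination registered for the `entrances` group.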

When a device name (for example, "East Store Entrance" 1820) is selected in the GUI component 1830, the control unit 210 generates a notification setting screen 1900 and controls it to be displayed on the output unit 340 of the client device 110.

FIG. 18 is a diagram showing an example of the notification setting screen 1900. In an item 1910, the destination of the notification email can be set. In an item 1920, the destination of the Webhook integration can be set. A plurality of contacts can be set as Webhook integration destinations.
In FIG. 18, the notification setting screen 1900 can be said to be configured so that a user can be set as a destination.
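The information sent to a configured destination when a registered person is detected could, for instance, be serialized as a small JSON document. The description does not define a message format, so every field name below, as well as the function name `build_detection_payload`, is a hypothetical choice made for illustration.

```python
import json
from datetime import datetime, timezone


def build_detection_payload(person_id: str, device_name: str, detected_at: datetime) -> str:
    """Serialize one detection event as JSON (all field names are hypothetical)."""
    return json.dumps(
        {
            "event": "registered_face_detected",
            "person_id": person_id,
            "device": device_name,
            # Normalize to UTC so every destination reads the same instant.
            "detected_at": detected_at.astimezone(timezone.utc).isoformat(),
        },
        sort_keys=True,
    )
```

The same payload could be placed in the body of a notification email or posted to each Webhook destination; the delivery mechanism itself is outside this sketch.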

As described above, according to Embodiment 1, a person can easily be found in a video, and the face image of a person of interest can be registered in the information processing system 1000. Furthermore, according to Embodiment 1, videos of the person whose face image is selected from the registered face images can be detected and output.

<Additional Notes>
The present invention may be provided in each of the aspects described below.
(Appendix 1)
An information processing system that displays video captured by a camera on an information terminal, wherein the system:
searches the video for videos containing a face image;
displays a plurality of videos as a result of the search;
selects a video from the plurality of videos in response to a user's instruction;
displays, together with the selected video, a face image included in the video; and
registers the face image in response to a user's instruction.
(Appendix 2)
An information processing system that displays video captured by a camera on an information terminal, wherein the system:
displays a plurality of registered face images;
selects a face image from the plurality of face images in response to a user's instruction;
searches for videos including the person corresponding to the selected face image;
displays a plurality of videos as a result of the search;
selects a video from the plurality of videos in response to a user's instruction; and
displays the selected video.
(Appendix 3)
The information processing system according to Appendix 1, wherein
the displayed face image comprises a plurality of face images of the same person.
(Appendix 4)
The information processing system according to Appendix 3, wherein
the plurality of face images include face images of the same person captured from mutually different angles.
(Appendix 5)
The information processing system according to Appendix 4, wherein
the plurality of face images include at least one of a face image facing forward, a face image facing right, a face image facing left, and a face image including the entire body.
(Appendix 6)
The information processing system according to Appendix 1, wherein
the face image is displayed in a manner that makes it identifiable whether or not the face image has been registered.
(Appendix 7)
The information processing system according to Appendix 1, wherein
a button for registering the face image is displayed together with the face image.
(Appendix 8)
The information processing system according to Appendix 1 or Appendix 2, wherein
a timeline for the video is displayed together with the video.
(Appendix 9)
The information processing system according to Appendix 8, wherein
the timeline displays, in an identifiable manner, the portion in which a registered face image appears.
(Appendix 10)
The information processing system according to Appendix 8, wherein
a screen for creating a movie clip of the video is displayed based on a predetermined operation performed via a screen including the timeline.
(Appendix 11)
The information processing system according to Appendix 1 or Appendix 2, wherein
the search for videos containing the face image is executed based on search conditions input by a user, and
the search conditions include at least one of a period and a keyword.
(Appendix 12)
The information processing system according to Appendix 1 or Appendix 2, wherein
the plurality of videos are displayed as thumbnail images, grouped by the camera that captured each of the videos.
(Appendix 13)
The information processing system according to Appendix 1 or Appendix 2, wherein
the user is notified when the registered face image is detected in a video.
(Appendix 14)
The information processing system according to Appendix 13, wherein
the notification destination used when the registered face image is detected in a video is configurable for each of a plurality of cameras or for each group of cameras.
(Appendix 15)
The information processing system according to Appendix 1 or Appendix 2, wherein
the registered face image can be changed, or a face image can be added, based on a user's instruction.
(Appendix 16)
The information processing system according to Appendix 1 or Appendix 2, wherein
the face image is displayed in a manner that shows it belongs to a person included in the selected video.
(Appendix 17)
The information processing system according to Appendix 1 or Appendix 2, wherein
an image of each person whose face is identifiable, among the persons included in the selected video, is displayed together with the video.
(Appendix 18)
An information processing method executed by an information processing system that displays video captured by a camera on an information terminal, the method comprising:
searching the video for videos containing a face image;
displaying a plurality of videos as a result of the search;
selecting a video from the plurality of videos in response to a user's instruction;
displaying, together with the selected video, a face image included in the video; and
registering the face image in response to a user's instruction.
(Appendix 19)
An information processing method executed by an information processing system that displays video captured by a camera on an information terminal, the method comprising:
displaying a plurality of registered face images;
selecting a face image from the plurality of face images in response to a user's instruction;
searching for videos including the person corresponding to the selected face image;
displaying a plurality of videos as a result of the search;
selecting a video from the plurality of videos in response to a user's instruction; and
displaying the selected video.
(Appendix 20)
A program for causing a computer that displays video captured by a camera on an information terminal to execute processing comprising:
searching the video for videos containing a face image;
displaying a plurality of videos as a result of the search;
selecting a video from the plurality of videos in response to a user's instruction;
displaying, together with the selected video, a face image included in the video; and
registering the face image in response to a user's instruction.
(Appendix 21)
A program for causing a computer that displays video captured by a camera on an information terminal to execute processing comprising:
displaying a plurality of registered face images;
selecting a face image from the plurality of face images in response to a user's instruction;
searching for videos including the person corresponding to the selected face image;
displaying a plurality of videos as a result of the search;
selecting a video from the plurality of videos in response to a user's instruction; and
displaying the selected video.

Although various embodiments of the present invention have been described, they are presented as examples and are not intended to limit the scope of the invention. The novel embodiments can be implemented in various other forms, and various omissions, substitutions, and changes can be made without departing from the gist of the invention. The embodiments and their modifications are included in the scope and gist of the invention, and are also included in the invention described in the claims and its equivalents.

For example, part of the processing of the server device 100 described above may be executed by the client device 110 and/or the client device 120. Part of the processing of the server device 100 described above may also be executed by the surveillance camera 160.

100: Server device
110: Client device
150: Network
160: Surveillance camera
210: Control unit
220: Storage unit
230: Communication unit
1000: Information processing system

Claims

1. An information processing system that displays video captured by a camera on an information terminal, wherein the system:
searches the video for videos containing a face image;
displays a plurality of videos as a result of the search;
selects a video from the plurality of videos in response to a user's instruction;
displays, together with the selected video, a face image included in the video; and
registers the face image in response to a user's instruction.

2. An information processing system that displays video captured by a camera on an information terminal, wherein the system:
displays a plurality of registered face images;
selects a face image from the plurality of face images in response to a user's instruction;
searches for videos including the person corresponding to the selected face image;
displays a plurality of videos as a result of the search;
selects a video from the plurality of videos in response to a user's instruction; and
displays the selected video.

3. The information processing system according to claim 1, wherein
the displayed face image comprises a plurality of face images of the same person.

4. The information processing system according to claim 3, wherein
the plurality of face images include face images of the same person captured from mutually different angles.

5. The information processing system according to claim 4, wherein
the plurality of face images include at least one of a face image facing forward, a face image facing right, a face image facing left, and a face image including the entire body.

6. The information processing system according to claim 1, wherein
the face image is displayed in a manner that makes it identifiable whether or not the face image has been registered.

7. The information processing system according to claim 1, wherein
a button for registering the face image is displayed together with the face image.

8. The information processing system according to claim 1 or claim 2, wherein
a timeline for the video is displayed together with the video.

9. The information processing system according to claim 8, wherein
the timeline displays, in an identifiable manner, the portion in which a registered face image appears.

10. The information processing system according to claim 8, wherein
a screen for creating a movie clip of the video is displayed based on a predetermined operation performed via a screen including the timeline.

11. The information processing system according to claim 1 or claim 2, wherein
the search for videos containing the face image is executed based on search conditions input by a user, and
the search conditions include at least one of a period and a keyword.

12. The information processing system according to claim 1 or claim 2, wherein
the plurality of videos are displayed as thumbnail images, grouped by the camera that captured each of the videos.

13. The information processing system according to claim 1 or claim 2, wherein
the user is notified when the registered face image is detected in a video.

14. The information processing system according to claim 13, wherein
the notification destination used when the registered face image is detected in a video is configurable for each of a plurality of cameras or for each group of cameras.

15. The information processing system according to claim 1 or claim 2, wherein
the registered face image can be changed, or a face image can be added, based on a user's instruction.

16. The information processing system according to claim 1 or claim 2, wherein
the face image is displayed in a manner that shows it belongs to a person included in the selected video.

17. The information processing system according to claim 1 or claim 2, wherein
an image of each person whose face is identifiable, among the persons included in the selected video, is displayed together with the video.

18. An information processing method executed by an information processing system that displays video captured by a camera on an information terminal, the method comprising:
searching the video for videos containing a face image;
displaying a plurality of videos as a result of the search;
selecting a video from the plurality of videos in response to a user's instruction;
displaying, together with the selected video, a face image included in the video; and
registering the face image in response to a user's instruction.

19. An information processing method executed by an information processing system that displays video captured by a camera on an information terminal, the method comprising:
displaying a plurality of registered face images;
selecting a face image from the plurality of face images in response to a user's instruction;
searching for videos including the person corresponding to the selected face image;
displaying a plurality of videos as a result of the search;
selecting a video from the plurality of videos in response to a user's instruction; and
displaying the selected video.

20. A program for causing a computer that displays video captured by a camera on an information terminal to execute processing comprising:
searching the video for videos containing a face image;
displaying a plurality of videos as a result of the search;
selecting a video from the plurality of videos in response to a user's instruction;
displaying, together with the selected video, a face image included in the video; and
registering the face image in response to a user's instruction.

21. A program for causing a computer that displays video captured by a camera on an information terminal to execute processing comprising:
displaying a plurality of registered face images;
selecting a face image from the plurality of face images in response to a user's instruction;
searching for videos including the person corresponding to the selected face image;
displaying a plurality of videos as a result of the search;
selecting a video from the plurality of videos in response to a user's instruction; and
displaying the selected video.