WO2024252507A1

WO2024252507A1 - Information processing device, information processing system, location identification method, and non-transitory computer readable medium

Info

Publication number: WO2024252507A1
Application number: PCT/JP2023/020925
Authority: WO
Inventors: 聡辻; 茂央鈴木; 裕介國井
Original assignee: 日本電気株式会社
Priority date: 2023-06-06
Filing date: 2023-06-06
Publication date: 2024-12-12

Abstract

An object of the present invention is to provide an information processing device (10) that suppresses degradation of estimation accuracy. The information processing device (10) according to the present disclosure includes: an estimation unit (11) that estimates at least one candidate location for the location at which a first captured image was captured from a mobile object, using the distance between an object of interest included in the first captured image and the location at which the first captured image was captured, together with the location of the object of interest in reference data generated in advance as data indicating the locations of objects in a predetermined space; and an identification unit (12) that identifies the location at which the first captured image was captured by using the first captured image and an estimated image of the predetermined space as it would be captured from the at least one candidate location.

Description

Information processing device, information processing system, location identification method, and non-transitory computer-readable medium

The present disclosure relates to an information processing device, an information processing system, a location identification method, and a non-transitory computer-readable medium.

Workers in facilities such as substations need to know their own position within the facility accurately in order to work safely and efficiently.

Patent Document 1 discloses the configuration of a vehicle position detection system that determines the position of a vehicle equipped with an on-board camera by matching an actual image captured by the on-board camera against reference data that includes images of the scenery captured in advance from a vehicle. Based on vehicle behavior, the vehicle position detection system of Patent Document 1 determines which of the captured images included in the reference data are to be excluded from matching against the actual captured image.

Patent Document 1: JP 2011-215055 A

Substations and similar facilities are often located deep in the mountains or in other places where the surrounding scenery contains no distinctive objects. Furthermore, a substation facility contains many objects of the same shape. If the position estimation of Patent Document 1 is performed in such a situation, the estimation accuracy may deteriorate because the scenery in and around the substation contains no distinctive objects.

In view of the above problem, an object of the present disclosure is to provide an information processing device, an information processing system, a location identification method, and a non-transitory computer-readable medium that suppress deterioration of estimation accuracy.

An information processing device according to a first aspect of the present disclosure includes: an estimation unit that estimates at least one candidate position for the position at which a first captured image was captured from a moving body, using the distance between a target object included in the first captured image and the position at which the first captured image was captured and the position of the target object in reference data generated in advance as data indicating the positions of objects in a predetermined space; and an identification unit that identifies the position at which the first captured image was captured, using the first captured image and an estimated image of the predetermined space as captured at the at least one candidate position.

An information processing system according to a second aspect of the present disclosure includes: an imaging device that moves together with a moving body and captures the surroundings of the moving body to generate a first captured image; a first information processing device that generates reference data in advance as data indicating the positions of objects in a predetermined space; and a second information processing device that estimates at least one candidate position for the position at which the first captured image was captured, using the distance between a target object included in the first captured image and the position at which the first captured image was captured and the position of the target object in the reference data, and that identifies the position at which the first captured image was captured, using the first captured image and an estimated image of the predetermined space as captured at the at least one candidate position.

A location identification method according to a third aspect of the present disclosure includes: estimating at least one candidate position for the position at which a first captured image was captured from a moving body, using the distance between a target object included in the first captured image and the position at which the first captured image was captured and the position of the target object in reference data generated in advance as data indicating the positions of objects in a predetermined space; and identifying the position at which the first captured image was captured, using the first captured image and an estimated image of the predetermined space as captured at the at least one candidate position.

A non-transitory computer-readable medium according to a fourth aspect of the present disclosure stores a program that causes a computer to: estimate at least one candidate position for the position at which a first captured image was captured from a moving body, using the distance between a target object included in the first captured image and the position at which the first captured image was captured and the position of the target object in reference data generated in advance as data indicating the positions of objects in a predetermined space; and identify the position at which the first captured image was captured, using the first captured image and an estimated image of the predetermined space as captured at the at least one candidate position.

The present disclosure makes it possible to provide an information processing device, an information processing system, a location identification method, and a non-transitory computer-readable medium that suppress deterioration of estimation accuracy.

FIG. 1 is a diagram illustrating an example of a configuration of an information processing device according to the present disclosure.
FIG. 2 is a diagram illustrating an example of a flow of a location identification process executed in an information processing device according to the present disclosure.
FIG. 3 is a diagram illustrating an example of a flow of a location identification process executed in an information processing device according to the present disclosure.
FIG. 4 is a diagram illustrating an example of a configuration of an information processing device according to the present disclosure.
FIG. 5 is a diagram illustrating an example of a flow of a location identification process executed in an information processing device according to the present disclosure.
FIG. 6 is a diagram illustrating an example of a flow of a location identification process executed in an information processing device according to the present disclosure.
FIG. 7 is a diagram illustrating an example of a configuration of an information processing system according to the present disclosure.
FIG. 8 is a diagram illustrating an example of a configuration of an information processing device according to the present disclosure.

(Embodiment 1)
Hereinafter, embodiments of the present disclosure will be described with reference to the drawings. A configuration example of an information processing device 10 will be described with reference to FIG. 1. The information processing device 10 may be a computer device that operates by a processor executing a program stored in a memory. The information processing device 10 has an estimation unit 11 and an identification unit 12. The estimation unit 11 and the identification unit 12 may be software or modules whose processing is performed by a processor executing a program stored in a memory. Alternatively, the estimation unit 11 and the identification unit 12 may be hardware such as a circuit or a chip.

The estimation unit 11 estimates at least one candidate position for the position at which the moving body captured a captured image. The estimation unit 11 estimates the candidate position using the distance between a target object included in the captured image and the position at which the captured image was captured, and the position of the target object in reference data generated in advance as data indicating the positions of objects in a predetermined space.

The moving body has an imaging device capable of capturing the surrounding environment while moving. The moving body may be a vehicle equipped with the imaging device. The vehicle may be capable of traveling autonomously, may travel under remote control, or may travel under the direct operation of a driver. Alternatively, the moving body may be a person who moves while holding the imaging device.

The statement that the moving body captures an image may mean that the imaging device mounted on the moving body captures the image, or that the imaging device held by a person captures the image. The candidate position and positions in the predetermined space may be positions in a so-called world coordinate system. Alternatively, they may be positions in an arbitrarily defined, independent coordinate system.

The target object may be an object that has a distinctive shape compared with other objects. Alternatively, the target object may be an object that is present in smaller numbers than other objects in the predetermined space. The target object may be designated by a user who operates or manages the information processing device 10. "Designated" may be rephrased as "selected", "determined", and so on. Alternatively, the target object may be designated by a user operating a different computer device that is connected to the information processing device 10 via a network. Alternatively, the target object may be an object that the information processing device 10 identifies, by analyzing the captured image, as satisfying the criteria for a target object. The criteria for a target object may be, for example, whether the object has a distinctive shape or whether the object is present in smaller numbers than other objects.

The predetermined space may be the space in which the moving body moves. The predetermined space may be, for example, a facility such as a substation. The reference data may be three-dimensional data indicating the positions or arrangement of objects in the predetermined space. The reference data may be generated using captured images taken within the predetermined space. Alternatively, the reference data may be three-dimensional data generated using design data or the like for the predetermined space. The reference data may be, for example, CAD (Computer Aided Design) data.

The candidate position may be at least one position identified as being separated from the target object in the predetermined space by the distance between the target object included in the captured image and the position at which the captured image was captured. The candidate positions may be separated from one another by at least a predetermined distance.
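As a non-limiting illustration of the candidate positions described above, the following Python sketch samples positions on a circle whose radius is the measured distance and whose center is the position of the target object in the reference data. The function name candidate_positions, the use of two-dimensional planar coordinates, and the rule for choosing the number of samples from the minimum spacing are assumptions introduced here for illustration and are not taken from the present disclosure.

```python
import math


def candidate_positions(target_xy, distance, min_spacing):
    """Sample candidate capture positions around a target object.

    target_xy   -- (x, y) of the target object in the reference data (assumed 2D)
    distance    -- measured distance between the target object and the camera
    min_spacing -- minimum separation required between neighbouring candidates
    """
    if distance <= 0 or min_spacing <= 0:
        raise ValueError("distance and min_spacing must be positive")

    if min_spacing > 2 * distance:
        # The circle is too small to hold two candidates that far apart.
        n = 1
    else:
        # Adjacent samples are separated by a chord of 2 * r * sin(pi / n).
        # Take the largest n for which that chord is still >= min_spacing.
        n = max(1, int(math.pi / math.asin(min_spacing / (2 * distance))))

    tx, ty = target_xy
    return [
        (tx + distance * math.cos(2 * math.pi * k / n),
         ty + distance * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]
```

In this sketch, a larger min_spacing yields fewer candidate positions and therefore fewer estimated images to compare in the subsequent identification step.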

The identification unit 12 identifies the position at which the captured image was captured, using the captured image and an estimated image of the predetermined space as captured at the at least one candidate position. The estimated image may be, for example, an image estimated as the image that would be generated by capturing in a predetermined direction, from a predetermined height and with a predetermined angle of view, at the candidate position. The angle of view and the height can be changed. Alternatively, the estimated image may be an image captured in advance at or near the candidate position.

The statement that the identification unit 12 uses the estimated image and the captured image may mean that it compares the estimated image with the captured image, or that it analyzes the estimated image and the captured image. The identification unit 12 may, for example, identify the candidate position associated with an estimated image similar to the captured image as the position at which the captured image was captured. The statement that the identification unit 12 identifies a position may mean that the identification unit 12 selects one candidate position from among a plurality of candidate positions, or that it narrows the plurality of candidate positions down to one candidate position.

Next, the location identification method executed in the information processing device 10 will be described with reference to FIG. 2. First, the estimation unit 11 estimates at least one candidate position for the position at which the moving body captured the captured image (S11). The estimation unit 11 estimates the candidate position using the distance between the target object included in the captured image and the position at which the captured image was captured, and the position of the target object in reference data generated in advance as data indicating the positions of objects in the predetermined space.

Next, the identification unit 12 identifies the position at which the captured image was captured, using the captured image and an estimated image of the predetermined space as captured at the at least one candidate position (S12).

As described above, the information processing device 10 estimates the candidate positions using the distance between the target object and the position at which the captured image was captured, and the position of the target object. The information processing device 10 then narrows down the candidate positions using the estimated image and the captured image. This allows the information processing device 10 to estimate the position at which the captured image was captured with high accuracy.

(Embodiment 2)
The processing or operation of the identification unit 12 according to Embodiment 2 will be described. The processing or operation of the estimation unit 11 is substantially the same as in Embodiment 1, and a detailed description is therefore omitted. The identification unit 12 identifies the position at which the captured image was captured based on the result of matching the estimated image against the captured image.

Matching the estimated image against the captured image may mean associating similar feature points with each other by comparing the feature points included in the estimated image with the feature points included in the captured image. "Associating" similar feature points may be rephrased as "mapping" or "linking" them. Feature points may be regarded as similar when the values of their feature amounts are identical or when the difference between their feature amounts is equal to or less than a predetermined threshold. The feature amount may be, for example, information indicating a color or information indicating the shape of an object. The feature amount may be expressed as a vector. The feature points may be extracted or detected using, for example, SIFT or SURF.
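The threshold-based association of feature points described above may be sketched, purely by way of example, as follows. In this sketch each feature amount is assumed to be a numeric vector, the difference between feature amounts is assumed to be the Euclidean distance, and each feature point of the estimated image is used at most once. The function name match_features and these choices are assumptions made for illustration; the present disclosure does not prescribe a particular matching procedure.

```python
import math


def match_features(captured, estimated, max_diff):
    """Associate similar feature points of two images.

    captured, estimated -- lists of feature vectors (equal-length tuples of floats)
    max_diff            -- predetermined threshold on the feature difference

    Returns a list of (captured_index, estimated_index) pairs.
    """
    matches = []
    used = set()  # estimated-image feature points that are already associated
    for i, c in enumerate(captured):
        best_j, best_d = None, max_diff
        for j, e in enumerate(estimated):
            if j in used:
                continue
            d = math.dist(c, e)  # Euclidean difference between feature vectors
            if d <= best_d:
                best_j, best_d = j, d
        if best_j is not None:
            used.add(best_j)
            matches.append((i, best_j))
    return matches
```

The number of pairs returned, len(match_features(...)), can serve as the number of associated feature points for one estimated image.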

The matching result may be the number of associated feature points. The identification unit 12 may regard an estimated image as more similar to the captured image the larger the number of its feature points that are associated with feature points included in the captured image.

The identification unit 12 may identify, as the position at which the captured image was captured, the candidate position associated with the estimated image having the largest number of feature points associated with feature points included in the captured image. Alternatively, the identification unit 12 may identify, as the position at which the captured image was captured, the candidate position associated with each of at least one estimated image for which the number of feature points associated with feature points included in the captured image exceeds a predetermined threshold.

By selecting or extracting, from among a plurality of estimated images, an estimated image that is more similar to the captured image, the identification unit 12 can identify, from among a plurality of candidate positions, a more accurate position as the position at which the captured image was captured.
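The two selection rules described above (the largest number of associated feature points, or a number exceeding a predetermined threshold) may be sketched as follows. The function name identify_positions and the representation of the matching results as a dictionary from candidate position to match count are assumptions made for illustration only.

```python
def identify_positions(match_counts, threshold=None):
    """Select candidate positions from per-candidate matching results.

    match_counts -- dict mapping a candidate position to the number of
                    feature points of its estimated image that were
                    associated with feature points of the captured image
    threshold    -- if None, keep the candidate(s) with the largest count;
                    otherwise keep every candidate whose count exceeds it
    """
    if not match_counts:
        return []
    if threshold is None:
        best = max(match_counts.values())
        return [pos for pos, n in match_counts.items() if n == best]
    return [pos for pos, n in match_counts.items() if n > threshold]
```

With the threshold rule, more than one candidate position may remain, which corresponds to the case in which a plurality of positions is identified for one captured image.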

The identification unit 12 may also identify the position at which the captured image was captured based on the result of matching the target object included in the estimated image against the target object included in the captured image. In other words, the identification unit 12 may regard an estimated image as more similar to the captured image the larger the number of feature points indicating the target object in the estimated image that are associated with feature points indicating the target object in the captured image.

The identification unit 12 may identify, as the position at which the captured image was captured, the candidate position associated with the estimated image having the largest number of feature points associated with feature points indicating the target object included in the captured image. Alternatively, the identification unit 12 may identify, as the position at which the captured image was captured, the candidate position associated with each of at least one estimated image for which the number of feature points associated with feature points indicating the target object included in the captured image exceeds a predetermined threshold.

By selecting or extracting, from among a plurality of estimated images, an estimated image that is more similar to the captured image, the identification unit 12 can identify, from among a plurality of candidate positions, a more accurate position as the position at which the captured image was captured, even in an environment containing many objects of the same shape or the like.

Next, a case in which the identification unit 12 identifies a plurality of positions as the position at which the captured image was captured will be described with reference to FIG. 3. The captured image in this case is referred to as a first captured image.

Step S21 is the same as step S11 in FIG. 2, and a detailed description is therefore omitted. Next, the identification unit 12 identifies a plurality of positions as the position at which the first captured image was captured (S22).

Next, the estimation unit 11 estimates at least one candidate position for the position at which the moving body captured a second captured image (S23). The second captured image is an image captured at a timing different from that of the first captured image. For example, the second captured image may be an image captured after the first captured image. The candidate positions for the position at which the second captured image was captured are estimated by the same procedure as in step S21.

The identification unit 12 identifies the position at which the second captured image was captured, using the second captured image and an estimated image of the predetermined space as captured at the at least one candidate position (S24).

In step S23, from among the at least one candidate position estimated using the second captured image, the estimation unit 11 may select the candidate positions whose distance from a first position, identified as a position at which the first captured image was captured, is less than a predetermined value. In other words, the estimation unit 11 may discard every candidate position other than those whose distance from the first position is less than the predetermined value.

Because the estimation unit 11 narrows the candidate positions down to those near the first position, which was identified as the position at which an earlier captured image was captured, the processing load on the identification unit 12 for identifying the position at which the second captured image was captured is reduced. Narrowing down the candidate positions in the estimation unit 11 also narrows down the positions that the identification unit 12 identifies as the position at which a captured image was captured. As a result, by repeating the processing with captured images taken at different timings, the positions identified by the identification unit 12 are narrowed down to one.
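The narrowing of candidate positions by proximity to previously identified positions may be sketched as follows. The function name filter_by_previous, the use of the Euclidean distance, and the handling of several previously identified positions (a candidate is kept if it is near any of them) are assumptions made for illustration only.

```python
import math


def filter_by_previous(candidates, previous_positions, max_distance):
    """Keep only the candidates that lie near a previously identified position.

    candidates         -- candidate positions estimated from the second image
    previous_positions -- positions identified for the first captured image
    max_distance       -- predetermined value; a candidate is kept when its
                          distance to some previous position is below it
    """
    return [
        c for c in candidates
        if any(math.dist(c, p) < max_distance for p in previous_positions)
    ]
```

For example, with candidates at (0, 0), (10, 0), and (20, 0), previously identified positions at (1, 0) and (19, 0), and a predetermined value of 2, the candidate at (10, 0) is discarded and the other two are kept.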

(Embodiment 3)
Next, a configuration example of an information processing device 20 will be described with reference to FIG. 4. The information processing device 20 may be a computer device that operates by a processor executing a program stored in a memory. The information processing device 20 has a candidate position estimation unit 21, a position identification unit 22, a reference data holding unit 23, a surrounding image data acquisition unit 24, a position information acquisition unit 25, and a setting unit 26. The components of the information processing device 20 may be software or modules whose processing is performed by a processor executing a program stored in a memory. Alternatively, the components of the information processing device 20 may be hardware such as a circuit or a chip.

The reference data holding unit 23 holds reference data generated in advance as data indicating the positions or arrangement of objects in the predetermined space. The reference data may be generated in the information processing device 20 or in another computer device. The reference data holding unit 23 may acquire reference data generated in another computer device via a network. Alternatively, the reference data holding unit 23 may acquire the reference data offline.

The computer device that generates the reference data may generate it using a plurality of captured images taken within the predetermined space. For example, the computer device may generate reference data in the form of three-dimensional data by executing SfM (Structure from Motion) on the plurality of captured images. SfM computes the feature points of a series of previously acquired two-dimensional images (or frames) and estimates matching feature points across temporally adjacent captured images. SfM then estimates, with high accuracy, the three-dimensional position or orientation of the camera that captured each frame, based on the differences in the two-dimensional positions at which each feature point appears in those frames. The reference data holding unit 23 may hold information indicating the position and orientation at which each captured image was captured in association with that captured image.

Alternatively, the computer device may generate reference data in the form of three-dimensional data by inputting design data for the predetermined space into software that generates three-dimensional data from design data.

Because the reference data is three-dimensional data, the information processing device 20 can determine the distance from any point in the reference data to any object.
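One way of determining such a distance may be sketched as follows, assuming that an object in the reference data is represented as a set of three-dimensional points and that the distance to the object is taken as the distance to its nearest point. The function name distance_to_object and this representation are assumptions made for illustration; the present disclosure does not specify how the distance is computed.

```python
import math


def distance_to_object(point, object_points):
    """Distance from a 3D point to an object given as a set of 3D points.

    point         -- (x, y, z) of an arbitrary point in the reference data
    object_points -- iterable of (x, y, z) points belonging to the object
    """
    object_points = list(object_points)
    if not object_points:
        raise ValueError("object_points must not be empty")
    # Assumed convention: distance to the nearest point of the object.
    return min(math.dist(point, q) for q in object_points)
```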

The surrounding image data acquisition unit 24 acquires surrounding image data. The surrounding image data is a captured image taken by an imaging device capable of capturing the surrounding environment while moving. The captured image may comprise a camera image captured with a camera and point cloud data measured or generated with a sensor or the like. The point cloud data may be data that indicates the distance from the position of the sensor to an object and that has three-dimensional coordinates in the predetermined space. The point cloud data may be measured with a sensor device capable of measuring the distance to a target object. The sensor device may be, for example, a LiDAR (Light Detection And Ranging) device. The sensor device and the imaging device are located at substantially the same position. In other words, the distance between the sensor device and an object included in the camera image is substantially equal to the distance between the imaging device and that object.
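As a hedged illustration of how the distance between the imaging device and an object might be obtained from such point cloud data, the following sketch takes the median range of the points belonging to the object. It assumes that the points are expressed in a coordinate frame whose origin is the sensor, and that the points belonging to the object have already been selected by some other means. The function name target_distance and the use of the median are assumptions made for illustration only.

```python
import math
from statistics import median


def target_distance(target_points):
    """Estimate the distance from the sensor to a target object.

    target_points -- (x, y, z) points in the sensor frame (sensor at the
                     origin) that are assumed to belong to the target object
    """
    target_points = list(target_points)
    if not target_points:
        raise ValueError("target_points must not be empty")
    # Range of each point from the sensor origin; the median is used so that
    # a few stray returns do not dominate the estimate.
    return median(math.sqrt(x * x + y * y + z * z) for x, y, z in target_points)
```

Because the sensor device and the imaging device are at substantially the same position, the value obtained in this way can be treated as the distance between the imaging device and the object.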

The surrounding image data acquisition unit 24 may acquire the surrounding image data from the imaging device via a network. Alternatively, when the information processing device 20 is mounted on the mobile object, the surrounding image data acquisition unit 24 may itself be an imaging device capable of capturing the surrounding environment.

The position information acquisition unit 25 acquires position information of the imaging device that captured the captured image serving as the surrounding image data. The position information may be, for example, position information measured by the imaging device using GNSS (Global Navigation Satellite System). The position information acquisition unit 25 may acquire the position information from the imaging device via a network. Alternatively, when the surrounding image data acquisition unit 24 is an imaging device, the position information acquisition unit 25 may acquire the position information from the surrounding image data acquisition unit 24.

The setting unit 26 holds information indicating the target object. The information indicating the target object may be rephrased as information identifying the target object, a parameter indicating the target object, or the like. Alternatively, the information indicating the target object may be information indicating the name of the target object.

The candidate position estimation unit 21 and the position identification unit 22 correspond to the estimation unit 11 and the identification unit 12 in FIG. 1, respectively.

Next, the flow of the position identification process executed in the information processing device 20 will be described with reference to FIG. 5. It is assumed that the information processing device 20 has generated the reference data in advance, before starting the process shown in FIG. 5. That is, the reference data storage unit 23 is assumed to hold the reference data.

First, the candidate position estimation unit 21 generates a position candidate group master as an empty set (S31). The position candidate group master is a recording area that records information indicating the position identified by the position identification unit 22 as the position at which the captured image was captured. "Recording" may be rephrased as "storing," "saving," or the like. The position candidate group master may be a database. At the time of step S31, the position identification unit 22 has not yet identified any position at which the captured image was captured, so the position candidate group master is an empty set. The time at which the candidate position estimation unit 21 generates the position candidate group master in step S31 is taken as the initial time t.

Next, the surrounding image data acquisition unit 24 acquires a surrounding image captured by the imaging device (S32). The surrounding image may include a camera image captured by the imaging device and point cloud data.

Next, the surrounding image data acquisition unit 24 recognizes the objects included in the surrounding image (S33). Recognizing an object means determining, for each of the smallest elements constituting the image data, which object that element is part of, that is, identifying the name or category of the object for each element. The smallest element is a pixel for two-dimensional image data, a point for three-dimensional point cloud data, and a voxel for three-dimensional voxel data; these are hereinafter referred to as "elements." A technique generally called semantic segmentation is available as an object recognition method.

For example, the surrounding image data acquisition unit 24 may perform semantic segmentation on the camera image, and thereby recognizes what objects are included in the camera image. Specifically, the surrounding image data acquisition unit 24 identifies which object each pixel of the camera image is part of. The surrounding image data acquisition unit 24 may, for example, assign to each pixel a label indicating the name of the object, and may determine the category into which each pixel is classified. The surrounding image data acquisition unit 24 may likewise perform semantic segmentation on each point constituting the point cloud data, thereby identifying which object each point is part of. When performing semantic segmentation, the surrounding image data acquisition unit 24 may use, for example, a learning model that identifies or classifies the objects included in an image.

Next, the surrounding image data acquisition unit 24 identifies the distance between the object associated with each pixel constituting the camera image and the imaging device that captured the camera image (S34). The surrounding image data acquisition unit 24 uses the point cloud data to identify this distance for the object associated with each pixel. The imaging device that captures the camera image and the sensor device that acquires the point cloud data are located at substantially the same position. Each pixel of the camera image is therefore assumed to be associated with a point included in the point cloud data. That is, the distance to the object associated with each pixel is taken to be the distance indicated by the point cloud data associated with that pixel.
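Step S34 might be sketched, by way of example only, as follows. The sketch assumes that the semantic labels from step S33 and the point cloud distances are already available as two pixel-aligned two-dimensional lists, that None marks a pixel with no sensor return, and that one distance per object is obtained as the median of its pixel distances. The use of the median and the name object_distances are assumptions of this sketch, not features stated in the present disclosure.

```python
from statistics import median

def object_distances(labels, depths):
    """Return {label: median distance} from pixel-aligned label and depth maps."""
    per_object = {}
    for label_row, depth_row in zip(labels, depths):
        for label, depth in zip(label_row, depth_row):
            if depth is not None:  # skip pixels with no sensor return
                per_object.setdefault(label, []).append(depth)
    return {label: median(values) for label, values in per_object.items()}

# Hypothetical 2x3 camera image: one label and one distance (metres) per pixel.
labels = [["tower", "tower", "sky"],
          ["ground", "tower", "sky"]]
depths = [[12.0, 12.4, None],
          [3.0, 12.2, None]]

distances = object_distances(labels, depths)
print(distances)
```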

Next, the candidate position estimation unit 21 determines whether the surrounding image includes the target object (S35). The target object may be designated, for example, by information input by an operator of the information processing device 20. Alternatively, the candidate position estimation unit 21 may acquire the information indicating the target object from the setting unit 26. Alternatively, the candidate position estimation unit 21 may identify the objects included in the reference data by analyzing the reference data generated in advance, and may then select the target object from among the identified objects. The analysis of the reference data and the selection of the target object may be performed using a learning model generated by machine learning.
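The determination in step S35 might, for instance, be reduced to counting the pixels labeled as the target object in the output of step S33. The min_pixels parameter, which guards against isolated mislabeled pixels, and the function name contains_target are assumptions introduced for this sketch.

```python
def contains_target(labels, target, min_pixels=1):
    """Return True if at least `min_pixels` pixels carry the `target` label."""
    count = sum(row.count(target) for row in labels)
    return count >= min_pixels

# Hypothetical per-pixel labels produced by semantic segmentation.
labels = [["tower", "tower", "sky"],
          ["ground", "tower", "sky"]]

print(contains_target(labels, "tower", min_pixels=3))
print(contains_target(labels, "transformer"))
```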

When the candidate position estimation unit 21 determines that the surrounding image includes the target object, it sets a position candidate group (t+1) based on the distance to the target object (S36). Specifically, the candidate position estimation unit 21 sets the position candidate group (t+1) based on the distance information indicated by the point cloud data constituting the target object. The candidate position estimation unit 21 includes, in the position candidate group (t+1), at least one area or position in the reference data that is separated from the target object in the reference data by the distance between the imaging device and the target object. Including at least one area or position in the position candidate group (t+1) may be rephrased as setting at least one area or position in the position candidate group (t+1). Such an area or position may be a position on the circumference of a circle centered on the target object whose radius is the distance between the imaging device and the target object.
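The circle-based candidate setting of step S36 might be sketched as follows. The sketch assumes a two-dimensional ground plane and samples the circumference at a fixed number of equally spaced angles; the sampling count and the name circle_candidates are assumptions, since the present disclosure does not specify how positions on the circumference are discretized.

```python
import math

def circle_candidates(center, radius, count=8):
    """Sample `count` equally spaced positions on a circle around `center`."""
    cx, cy = center
    return [(cx + radius * math.cos(2 * math.pi * k / count),
             cy + radius * math.sin(2 * math.pi * k / count))
            for k in range(count)]

# Target object at (10, 5) in the reference data, measured 4 m from the imaging device.
candidates = circle_candidates((10.0, 5.0), 4.0, count=8)
print(len(candidates))
```

A finer sampling yields more candidates to render and compare in the later steps, so the count trades position resolution against processing load.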

The position candidate group (t+1) is a recording area in which position information associated with time t+1 is recorded. The position candidate group (t+1) may be a database. Each of the at least one area or position included in the position candidate group (t+1) is a candidate for the position at which the captured image included in the surrounding image acquired by the surrounding image data acquisition unit 24 was captured.

When the candidate position estimation unit 21 determines that the surrounding image does not include the target object, it sets the position candidate group (t+1) based on the distance to an object other than the target object (S37). The candidate position estimation unit 21 includes, in the position candidate group (t+1), at least one area in the reference data that is separated from an object other than the target object in the reference data by the distance between the imaging device and that object.

Next, of the position candidates included in the position candidate group (t+1), the candidate position estimation unit 21 keeps only those in the vicinity of a position stored in the position candidate group master (S38). The vicinity of a position stored in the position candidate group master may be, for example, within a predetermined distance from that position. When the position candidate group master is an empty set, that is, when no information indicating a position is recorded in it, the candidate position estimation unit 21 keeps all the position candidates included in the position candidate group (t+1). Keeping a position candidate may be rephrased as maintaining it, not deleting it, or the like. Position candidates that are not kept are deleted.
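The filtering of step S38 might be sketched as follows, assuming for simplicity that the position candidate group master holds positions only (the direction stored in step S41 is omitted) and that "vicinity" means a Euclidean distance no greater than a hypothetical max_distance parameter.

```python
import math

def keep_near_master(candidates, master, max_distance):
    """Keep candidates within `max_distance` of any master position.

    An empty master keeps every candidate, as in the first iteration.
    """
    if not master:
        return list(candidates)
    return [c for c in candidates
            if any(math.dist(c, m) <= max_distance for m in master)]

candidates = [(0.0, 0.0), (5.0, 5.0)]
print(keep_near_master(candidates, [], 1.0))
print(keep_near_master(candidates, [(0.5, 0.0)], 1.0))
```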

Next, the position identification unit 22 generates a viewpoint image as seen from a viewpoint at each position candidate included in the position candidate group (t+1) (S39). The viewpoint image corresponds to the estimated image described in the other embodiments. A viewpoint is defined by the position of the position candidate and a viewing direction. The position identification unit 22 identifies the position of the position candidate within the reference data and generates a viewpoint image of the reference data as seen from that position in the defined direction. The viewpoint image may be an image equivalent to a captured image taken from the identified position in the defined direction. The position identification unit 22 may, for example, determine in advance the angle of view used when capturing in the defined direction from the identified position, and may generate the viewpoint image using that predetermined angle of view.

Alternatively, the position identification unit 22 may use, as the viewpoint image, a two-dimensional image that was used when the reference data was generated as three-dimensional data by executing SfM. Specifically, the position identification unit 22 may use, as the viewpoint image, a two-dimensional image used in generating the reference data whose capture position is the position of the position candidate or lies within a predetermined range of that position.
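The selection of stored two-dimensional images as viewpoint images might be sketched as follows. The sketch assumes that the capture position of each image used for SfM is held as a mapping from a hypothetical image name to three-dimensional coordinates, consistent with the information the reference data storage unit 23 may store, and that the predetermined range is a Euclidean radius.

```python
import math

def select_reference_images(candidate, capture_positions, max_distance):
    """Return names of stored images captured within `max_distance` of `candidate`."""
    return sorted(name for name, position in capture_positions.items()
                  if math.dist(candidate, position) <= max_distance)

# Hypothetical capture positions recorded when the reference data was generated.
capture_positions = {
    "img_001": (0.0, 0.0, 1.5),
    "img_002": (0.5, 0.0, 1.5),
    "img_003": (8.0, 3.0, 1.5),
}

selected = select_reference_images((0.0, 0.0, 1.5), capture_positions, 1.0)
print(selected)
```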

Next, the position identification unit 22 compares the viewpoint image with the surrounding image and calculates a similarity (S40). For example, the position identification unit 22 extracts feature points from the viewpoint image and from the camera image included in the surrounding image, and compares the extracted feature points with each other. The result of comparing the feature points may be expressed as a score. For example, a higher score may indicate a higher similarity between the viewpoint image and the surrounding image. The comparison between the viewpoint image and the surrounding image may instead be performed pixel by pixel. In that case, for example, a score may be calculated for each pixel and the per-pixel scores may be statistically processed to generate a score indicating the result of comparing the viewpoint image with the surrounding image.
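The pixel-by-pixel variant of step S40 might be sketched as follows. The sketch assumes two grayscale images of equal size with pixel values from 0 to 255, scores each pixel as one minus the normalized absolute difference, and uses the mean as the statistical processing. These choices, and the name pixel_similarity, are assumptions made for illustration; the present disclosure does not fix a particular per-pixel score or statistic.

```python
from statistics import mean

def pixel_similarity(image_a, image_b):
    """Return a similarity in [0, 1] for two equally sized grayscale images."""
    scores = [1.0 - abs(a - b) / 255.0
              for row_a, row_b in zip(image_a, image_b)
              for a, b in zip(row_a, row_b)]
    return mean(scores)  # statistical processing of the per-pixel scores

viewpoint_image = [[0, 255], [10, 20]]
camera_image = [[0, 255], [10, 20]]
print(pixel_similarity(viewpoint_image, camera_image))
```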

Next, the position identification unit 22 stores the position and direction of each viewpoint image with a high similarity in the position candidate group master (S41). A viewpoint image with a high similarity may be, for example, a viewpoint image associated with a score higher than a predetermined score. That is, the position identification unit 22 may store in the position candidate group master the position and direction of each viewpoint image whose similarity exceeds a threshold.

Next, the position identification unit 22 determines whether the position candidate group master contains exactly one element (S42). An element is information indicating a position and direction contained in the position candidate group master. When the position candidate group master contains exactly one element, the position identification unit 22 identifies the position and direction it contains as the position and direction at which the surrounding image was captured (S43). When the position candidate group master contains a plurality of elements, the position identification unit 22 advances the time by one unit, setting t = t + 1 (S44). After step S44, the processing from step S32 onward is repeated.
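The overall flow from step S31 to step S44 might be sketched as the following loop. The callbacks estimate_candidates and score are hypothetical stand-ins for steps S33 to S38 and steps S39 to S40, respectively. The sketch further assumes that the position candidate group master is replaced in each iteration by the poses that pass the threshold, and that an iteration leaving zero elements simply proceeds to the next frame; the present disclosure does not state either point explicitly.

```python
def localize(frames, estimate_candidates, score, threshold=0.8):
    """Return the single identified pose, or None if the frames run out first."""
    master = []                                          # S31: empty master set
    for frame in frames:                                 # S32, repeated after S44
        candidates = estimate_candidates(frame, master)  # S33 to S38
        master = [pose for pose in candidates
                  if score(pose, frame) > threshold]     # S39 to S41
        if len(master) == 1:                             # S42
            return master[0]                             # S43
    return None

# Toy stand-ins: two poses look alike in frame "f0", only one matches frame "f1".
def estimate_candidates(frame, master):
    return [(0, 0), (4, 0)] if not master else list(master)

def score(pose, frame):
    if frame == "f0":
        return 0.9
    return 0.9 if pose == (4, 0) else 0.1

result = localize(["f0", "f1"], estimate_candidates, score)
print(result)
```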

By narrowing down the position candidates kept in the position candidate group (t+1) in step S38, the information processing device 20 also narrows down the position information stored in the position candidate group master in step S41. As a result, by repeating the processes shown in FIGS. 5 and 6, the information processing device 20 can narrow the position at which the captured image was captured down to a single position.

The candidate position estimation unit 21 may also narrow down the candidate positions in steps S36 and S37 based on the position information of the imaging device acquired by the position information acquisition unit 25. For example, at a substation or a transmission tower, steel structures overhead may prevent GNSS from providing accurate position information. That is, the position information acquired by the position information acquisition unit 25 may not accurately indicate the position of the imaging device. It does, however, indicate the approximate position of the imaging device. The candidate position estimation unit 21 may therefore include in the position candidate group (t+1) only those candidate positions in the vicinity of the approximate position indicated by the position information. Because the candidate position estimation unit 21 narrows down the candidate positions included in the position candidate group (t+1) in this way, the position identification unit 22 can identify the position at which the captured image was captured at an earlier stage.

The distance information used in setting the position candidate group (t+1) in steps S36 and S37 may also take an arbitrarily set error into account. That is, in step S36 for example, the candidate position estimation unit 21 may include in the position candidate group (t+1) at least one position in the reference data that is separated from the target object in the reference data by the distance between the imaging device and the target object ±A, where A is an arbitrary error. Furthermore, the value A may be set smaller each time the time advances. Decreasing A as time advances reduces the number of positions included in the position candidate group (t+1), so the position identification unit 22 can identify the position at which the captured image was captured at an earlier stage.
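The error margin A and its reduction over time might be sketched as follows. The geometric decay, the lower bound, and the names tolerance_at and in_distance_band are assumptions of this sketch; the present disclosure states only that A may be set smaller as time advances.

```python
import math

def tolerance_at(step, initial=2.0, decay=0.5, floor=0.1):
    """Return the error margin A for a given time step, halving per step down to `floor`."""
    return max(floor, initial * decay ** step)

def in_distance_band(position, target, measured, tolerance):
    """Return True if `position` lies at `measured` +/- `tolerance` from `target`."""
    return abs(math.dist(position, target) - measured) <= tolerance

# A position 5.0 m from the target, with a measured distance of 5.5 m.
print(in_distance_band((3.0, 4.0), (0.0, 0.0), 5.5, tolerance_at(1)))
print(in_distance_band((3.0, 4.0), (0.0, 0.0), 5.5, 0.25))
```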

In step S41, the position identification unit 22 may also increase, each time the time advances, the threshold used when extracting or selecting the positions and directions of the viewpoint images to be stored in the position candidate group master. Increasing the threshold as time advances reduces the number of viewpoint image positions and directions stored in the position candidate group master, so the position identification unit 22 can identify the position at which the captured image was captured at an earlier stage.
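A threshold that grows with time might be sketched as follows. The linear increment and the ceiling, which keeps the threshold from exceeding the maximum attainable similarity, are assumptions of this sketch, as is the name threshold_at.

```python
def threshold_at(step, initial=0.5, increment=0.1, ceiling=0.95):
    """Return the similarity threshold for a given time step, capped at `ceiling`."""
    return min(ceiling, initial + increment * step)

# Threshold values for the first few time steps.
print([threshold_at(step) for step in range(4)])
```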

A case will also be described in which either the surrounding image acquired by the surrounding image data acquisition unit 24 or the viewpoint image consists of point cloud data only. In this case, the position identification unit 22 calculates edges from the point cloud data. An edge is a region where the orientation of a surface changes. The position identification unit 22 may give edge locations a color different from that of other regions. The position identification unit 22 calculates the similarity using the edges in the point cloud data and the edge positions in the surrounding image acquired by the surrounding image data acquisition unit 24 or in the viewpoint image.

(Embodiment 4)
Next, a configuration example of an information processing system will be described with reference to FIG. 7. The information processing system in FIG. 7 includes an imaging device 40, an information processing device 50, and an information processing device 60. The imaging device 40 moves together with a mobile object 30 and captures the surroundings of the mobile object 30 to generate a captured image.

The mobile object 30 may be a vehicle on which the imaging device 40 is mounted. The vehicle may be capable of traveling autonomously, may travel under remote control, or may travel under the direct operation of a driver. Alternatively, the mobile object 30 may be a person who moves while holding the imaging device 40.

The information processing device 50 generates reference data in advance as data indicating the positions of objects in the predetermined space. The information processing device 50 may generate the reference data using captured images taken within the predetermined space, or using design data or the like for the predetermined space. The reference data may be three-dimensional data.

The information processing device 60 estimates at least one candidate position for the position at which the captured image was captured, using the distance between the target object included in the captured image and the position at which the captured image was captured, together with the position of the target object in the reference data. The information processing device 60 then identifies the position at which the captured image was captured, using the captured image and an estimated image of the predetermined space as captured at the at least one candidate position.

As described above, the reference data may be generated in the information processing device 50, which is separate from the information processing device 60 that estimates and identifies the position at which the captured image was captured. Separating the information processing devices in this way distributes the load, compared with executing both the generation of the reference data and the estimation and identification of the capture position in a single information processing device.

Various arrangements of functions other than the one shown in FIG. 7 are also possible. For example, the estimation unit 11 and the identification unit 12 included in the information processing device 10 of FIG. 1 may each be executed in a different information processing device. When functions are distributed across a plurality of information processing devices in this way, the information processing devices may be connected to one another via a network.

FIG. 8 is a block diagram showing a configuration example of the information processing devices 10, 20, 50, and 60 described in the above embodiments (hereinafter referred to as the information processing device 10, etc.). Referring to FIG. 8, the information processing device 10, etc. includes a network interface 1201, a processor 1202, and a memory 1203. The network interface 1201 may be used to communicate with network nodes. The network interface 1201 may include, for example, a network interface card (NIC) compliant with the IEEE 802.3 series. IEEE stands for Institute of Electrical and Electronics Engineers.

The processor 1202 reads software (a computer program) from the memory 1203 and executes it, thereby performing the processing of the information processing device 10, etc. described with reference to the flowcharts in the above embodiments. The processor 1202 may be, for example, a microprocessor, an MPU, or a CPU. The processor 1202 may include a plurality of processors.

The memory 1203 is composed of a combination of volatile memory and non-volatile memory. The memory 1203 may include storage located away from the processor 1202. In this case, the processor 1202 may access the memory 1203 via an I/O (Input/Output) interface (not shown).

In the example of FIG. 8, the memory 1203 is used to store a group of software modules. The processor 1202 can perform the processing of the information processing device 10, etc. described in the above embodiments by reading these software modules from the memory 1203 and executing them.

As described with reference to FIG. 8, each of the processors of the information processing device 10, etc. in the above embodiments executes one or more programs containing a set of instructions for causing a computer to perform the algorithms described with reference to the drawings.

In the above examples, the program includes a set of instructions (or software code) that, when loaded into a computer, causes the computer to perform one or more of the functions described in the embodiments. The program may be stored on a non-transitory computer-readable medium or a tangible storage medium. By way of example and not limitation, computer-readable media or tangible storage media include random-access memory (RAM), read-only memory (ROM), flash memory, a solid-state drive (SSD) or other memory technology, a CD-ROM, a digital versatile disc (DVD), a Blu-ray (registered trademark) disc or other optical disc storage, and a magnetic cassette, magnetic tape, magnetic disk storage, or other magnetic storage device. The program may be transmitted on a transitory computer-readable medium or a communication medium. By way of example and not limitation, transitory computer-readable media or communication media include electrical, optical, acoustic, or other forms of propagated signals.

The present disclosure has been described above with reference to the embodiments, but it is not limited to those embodiments. Various modifications that a person skilled in the art can understand may be made to the configuration and details of the present disclosure within its scope. Each embodiment may also be combined with other embodiments as appropriate.

Some or all of the above embodiments may also be described as in the following appendices, but are not limited to them.

　（付記１）
　移動体から撮影された第１の撮影画像に含まれる注目物体と前記第１の撮影画像が撮影された位置との間の距離と、所定の空間における物体の位置を示すデータとして予め生成された参照データにおける前記注目物体の位置と、を用いて、前記第１の撮影画像が撮影された位置の少なくとも１つの候補位置を推定する推定部と、
　前記少なくとも１つの候補位置において前記所定の空間を撮影した場合の推定画像と、前記第１の撮影画像とを用いて、前記第１の撮影画像が撮影された位置を特定する特定部と、を備える情報処理装置。
　（付記２）
　前記特定部は、
　前記推定画像と前記第１の撮影画像とのマッチング結果に基づいて、前記第１の撮影画像が撮影された位置を特定する、付記１に記載の情報処理装置。
　（付記３）
　前記特定部は、
　前記推定画像に含まれる前記注目物体と前記第１の撮影画像に含まれる前記注目物体とのマッチング結果に基づいて前記第１の撮影画像が撮影された位置を特定する、付記１または２に記載の情報処理装置。
　（付記４）
　前記特定部は、
　前記第１の撮影画像が撮影された位置として複数の第１の位置を特定した場合、前記第１の撮影画像が撮影されたタイミングとは異なるタイミングに撮影された第２の撮影画像を用いて推定された前記候補位置において前記所定の空間を撮影した場合の推定画像と、前記第２の撮影画像とを用いて、前記第２の撮影画像が撮影された位置を特定する、付記１から３のいずれか１項に記載の情報処理装置。
　（付記５）
　前記推定部は、
　前記第２の撮影画像を用いて推定された少なくとも１つの前記候補位置のうち、前記複数の第１の位置との距離の差が所定の値を下回る候補位置を選択する、付記４に記載の情報処理装置。
　（付記６）
　前記第１の撮影画像は、点群データを含み、前記注目物体と前記第１の撮影画像が撮影された位置との間の距離は、前記第１の撮影画像に含まれる前記注目物体を示す点群データを用いて特定される、付記１から５のいずれか１項に記載の情報処理装置。
　（付記７）
　前記推定部は、
　さらに、GNSSを用いた位置情報に基づいて、前記少なくとも１つの候補位置を推定する、付記１から６のいずれか１項に記載の情報処理装置。
　（付記８）
　前記特定部は、
　前記推定画像として、前記参照データを生成する際に前記候補位置において撮影された参照データ用撮影画像を用いる、付記１から７のいずれか１項に記載の情報処理装置。
　（付記９）
　移動体と共に移動して前記移動体の周囲を撮影して第１の撮影画像を生成する撮影装置と、
　所定の空間における物体の位置を示すデータとして予め参照データを生成する第１の情報処理装置と、
　前記第１の撮影画像に含まれる注目物体と前記第１の撮影画像が撮影された位置との間の距離と、前記参照データにおける前記注目物体の位置と、を用いて、前記第１の撮影画像が撮影された位置の少なくとも１つの候補位置を推定し、前記少なくとも１つの候補位置において前記所定の空間を撮影した場合の推定画像と、前記第１の撮影画像とを用いて、前記第１の撮影画像が撮影された位置を特定する、第２の情報処理装置と、を備える情報処理システム。
　（付記１０）
　前記第２の情報処理装置は、
　前記推定画像と前記第１の撮影画像とのマッチング結果に基づいて、前記第１の撮影画像が撮影された位置を特定する、付記９に記載の情報処理システム。
　（付記１１）
　移動体から撮影された第１の撮影画像に含まれる注目物体と前記第１の撮影画像が撮影された位置との間の距離と、所定の空間における物体の位置を示すデータとして予め生成された参照データにおける前記注目物体の位置と、を用いて、前記第１の撮影画像が撮影された位置の少なくとも１つの候補位置を推定し、
　前記少なくとも１つの候補位置において前記所定の空間を撮影した場合の推定画像と、前記第１の撮影画像とを用いて、前記第１の撮影画像が撮影された位置を特定する、位置特定方法。
　（付記１２）
　前記撮影画像が撮影された位置を特定する際に、
　前記推定画像と前記第１の撮影画像とのマッチング結果に基づいて、前記第１の撮影画像が撮影された位置を特定する、付記１１に記載の位置特定方法。
　（付記１３）
　前記撮影画像が撮影された位置を特定する際に、
　前記推定画像に含まれる前記注目物体と前記第１の撮影画像に含まれる前記注目物体とのマッチング結果に基づいて前記第１の撮影画像が撮影された位置を特定する、付記１１または１２に記載の位置特定方法。
　（付記１４）
　前記撮影画像が撮影された位置を特定する際に、
　前記第１の撮影画像が撮影された位置として複数の第１の位置を特定した場合、前記第１の撮影画像が撮影されたタイミングとは異なるタイミングに撮影された第２の撮影画像を用いて推定された前記候補位置において前記所定の空間を撮影した場合の推定画像と、前記第２の撮影画像とを用いて、前記第２の撮影画像が撮影された位置を特定する、付記１１から１３のいずれか１項に記載の位置特定方法。
　（付記１５）
　前記候補位置を推定する際に、
　前記第２の撮影画像を用いて推定された少なくとも１つの前記候補位置のうち、前記複数の第１の位置との距離の差が所定の値を下回る候補位置を選択する、付記１４に記載の位置特定方法。
　（付記１６）
　前記第１の撮影画像は、点群データを含み、前記注目物体と前記第１の撮影画像が撮影された位置との間の距離は、前記第１の撮影画像に含まれる前記注目物体を示す点群データを用いて特定される、付記１１から１５のいずれか１項に記載の位置特定方法。
　（付記１７）
　前記候補位置を推定する際に、
　さらに、GNSSを用いた位置情報に基づいて、前記少なくとも１つの候補位置を推定する、付記１１から１６のいずれか１項に記載の位置特定方法。
　（付記１８）
　前記撮影画像が撮影された位置を特定する際に、
　前記推定画像として、前記参照データを生成する際に前記候補位置において撮影された参照データ用撮影画像を用いる、付記１１から１７のいずれか１項に記載の位置特定方法。
　（付記１９）
　所定の空間における物体の位置を示すデータとして参照データを生成し、
　移動体と共に移動して前記移動体の周囲を撮影して第１の撮影画像を生成し、
　前記第１の撮影画像に含まれる注目物体と前記第１の撮影画像が撮影された位置との間の距離と、前記参照データにおける前記注目物体の位置と、を用いて、前記第１の撮影画像が撮影された位置の少なくとも１つの候補位置を推定し、
　前記少なくとも１つの候補位置において前記所定の空間を撮影した場合の推定画像と、前記第１の撮影画像とを用いて、前記第１の撮影画像が撮影された位置を特定する、位置特定方法。
　（付記２０）
　移動体から撮影された第１の撮影画像に含まれる注目物体と前記第１の撮影画像が撮影された位置との間の距離と、所定の空間における物体の位置を示すデータとして予め生成された参照データにおける前記注目物体の位置と、を用いて、前記第１の撮影画像が撮影された位置の少なくとも１つの候補位置を推定し、
　前記少なくとも１つの候補位置において前記所定の空間を撮影した場合の推定画像と、前記第１の撮影画像とを用いて、前記第１の撮影画像が撮影された位置を特定する、ことをコンピュータに実行させるプログラムが格納された非一時的なコンピュータ可読媒体。

(Appendix 1)
An information processing device comprising:
an estimation unit that estimates at least one candidate position for the position at which a first captured image, captured from a moving body, was captured, using a distance between a target object included in the first captured image and the position at which the first captured image was captured, and a position of the target object in reference data generated in advance as data indicating positions of objects in a predetermined space; and
an identification unit that identifies the position at which the first captured image was captured, using the first captured image and an estimated image of the predetermined space as captured at the at least one candidate position.
(Appendix 2)
The information processing device according to Appendix 1, wherein the identification unit identifies the position at which the first captured image was captured based on a result of matching between the estimated image and the first captured image.
(Appendix 3)
The information processing device according to Appendix 1 or 2, wherein the identification unit identifies the position at which the first captured image was captured based on a result of matching between the target object included in the estimated image and the target object included in the first captured image.
(Appendix 4)
The information processing device according to any one of Appendices 1 to 3, wherein, when a plurality of first positions are identified as the position at which the first captured image was captured, the identification unit identifies the position at which a second captured image was captured, using the second captured image and an estimated image of the predetermined space as captured at the candidate position estimated using the second captured image, the second captured image being captured at a timing different from the timing at which the first captured image was captured.
(Appendix 5)
The information processing device according to Appendix 4, wherein the estimation unit selects, from among the at least one candidate position estimated using the second captured image, a candidate position whose difference in distance from the plurality of first positions is less than a predetermined value.
(Appendix 6)
The information processing device according to any one of Appendices 1 to 5, wherein the first captured image includes point cloud data, and the distance between the target object and the position at which the first captured image was captured is determined using the point cloud data that represents the target object included in the first captured image.
(Appendix 7)
The information processing device according to any one of Appendices 1 to 6, wherein the estimation unit estimates the at least one candidate position further based on position information obtained using GNSS.
(Appendix 8)
The information processing device according to any one of Appendices 1 to 7, wherein the identification unit uses, as the estimated image, a captured image for reference data that was captured at the candidate position when the reference data was generated.
(Appendix 9)
An information processing system comprising:
an image capturing device that moves together with a moving body and captures the surroundings of the moving body to generate a first captured image;
a first information processing device that generates, in advance, reference data as data indicating positions of objects in a predetermined space; and
a second information processing device that estimates at least one candidate position for the position at which the first captured image was captured, using a distance between a target object included in the first captured image and the position at which the first captured image was captured, and a position of the target object in the reference data, and that identifies the position at which the first captured image was captured, using the first captured image and an estimated image of the predetermined space as captured at the at least one candidate position.
(Appendix 10)
The information processing system according to Appendix 9, wherein the second information processing device identifies the position at which the first captured image was captured based on a result of matching between the estimated image and the first captured image.
(Appendix 11)
A position identification method comprising:
estimating at least one candidate position for the position at which a first captured image, captured from a moving body, was captured, using a distance between a target object included in the first captured image and the position at which the first captured image was captured, and a position of the target object in reference data generated in advance as data indicating positions of objects in a predetermined space; and
identifying the position at which the first captured image was captured, using the first captured image and an estimated image of the predetermined space as captured at the at least one candidate position.
(Appendix 12)
The position identification method according to Appendix 11, wherein, when the position at which the captured image was captured is identified, the position at which the first captured image was captured is identified based on a result of matching between the estimated image and the first captured image.
(Appendix 13)
The position identification method according to Appendix 11 or 12, wherein, when the position at which the captured image was captured is identified, the position at which the first captured image was captured is identified based on a result of matching between the target object included in the estimated image and the target object included in the first captured image.
(Appendix 14)
The position identification method according to any one of Appendices 11 to 13, wherein, when the position at which the captured image was captured is identified and a plurality of first positions are identified as the position at which the first captured image was captured, the position at which a second captured image was captured is identified using the second captured image and an estimated image of the predetermined space as captured at the candidate position estimated using the second captured image, the second captured image being captured at a timing different from the timing at which the first captured image was captured.
(Appendix 15)
The position identification method according to Appendix 14, wherein, when the candidate position is estimated, a candidate position whose difference in distance from the plurality of first positions is less than a predetermined value is selected from among the at least one candidate position estimated using the second captured image.
(Appendix 16)
The position identification method according to any one of Appendices 11 to 15, wherein the first captured image includes point cloud data, and the distance between the target object and the position at which the first captured image was captured is determined using the point cloud data that represents the target object included in the first captured image.
(Appendix 17)
The position identification method according to any one of Appendices 11 to 16, wherein, when the candidate position is estimated, the at least one candidate position is estimated further based on position information obtained using GNSS.
(Appendix 18)
The position identification method according to any one of Appendices 11 to 17, wherein, when the position at which the captured image was captured is identified, a captured image for reference data that was captured at the candidate position when the reference data was generated is used as the estimated image.
(Appendix 19)
A position identification method comprising:
generating reference data as data indicating positions of objects in a predetermined space;
generating a first captured image by capturing the surroundings of a moving body while moving together with the moving body;
estimating at least one candidate position for the position at which the first captured image was captured, using a distance between a target object included in the first captured image and the position at which the first captured image was captured, and a position of the target object in the reference data; and
identifying the position at which the first captured image was captured, using the first captured image and an estimated image of the predetermined space as captured at the at least one candidate position.
(Appendix 20)
A non-transitory computer-readable medium storing a program that causes a computer to execute:
estimating at least one candidate position for the position at which a first captured image, captured from a moving body, was captured, using a distance between a target object included in the first captured image and the position at which the first captured image was captured, and a position of the target object in reference data generated in advance as data indicating positions of objects in a predetermined space; and
identifying the position at which the first captured image was captured, using the first captured image and an estimated image of the predetermined space as captured at the at least one candidate position.
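
The flow described in Appendices 1, 2, and 6 can be illustrated with a minimal Python sketch. It is not taken from the disclosure, and every name in it (`distance_to_object`, `candidate_positions`, `identify_position`, `match_score`) is hypothetical. It assumes a 2D map, point cloud coordinates expressed in the sensor frame, candidate positions sampled on a circle around the target object's reference position, and a caller-supplied similarity score standing in for the image matching step.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point2D = Tuple[float, float]
Point3D = Tuple[float, float, float]


def distance_to_object(object_points: Sequence[Point3D]) -> float:
    """Distance from the sensor origin to the centroid of the object's points.

    Assumption: the points are given in the sensor frame, so the capture
    position is the origin (0, 0, 0).
    """
    n = len(object_points)
    cx = sum(p[0] for p in object_points) / n
    cy = sum(p[1] for p in object_points) / n
    cz = sum(p[2] for p in object_points) / n
    return math.sqrt(cx * cx + cy * cy + cz * cz)


def candidate_positions(
    object_xy: Point2D, distance: float, n_samples: int = 36
) -> List[Point2D]:
    """Sample candidate capture positions at `distance` from the object.

    Assumption: with only a distance and the object's position in the
    reference data, every point on the circle around the object is a
    possible capture position, so the circle is sampled uniformly.
    """
    ox, oy = object_xy
    candidates = []
    for k in range(n_samples):
        angle = 2.0 * math.pi * k / n_samples
        candidates.append(
            (ox + distance * math.cos(angle), oy + distance * math.sin(angle))
        )
    return candidates


def identify_position(
    candidates: Sequence[Point2D],
    match_score: Callable[[Point2D], float],
) -> Point2D:
    """Return the candidate whose estimated image matches the capture best.

    `match_score` is a placeholder: it should compare the estimated image
    for a candidate with the first captured image and return a similarity.
    """
    return max(candidates, key=match_score)
```

In use, `match_score` would wrap whatever image comparison is chosen; the sketch only fixes the order of the steps: distance first, then candidates, then matching.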

　１０　情報処理装置
　１１　推定部
　１２　特定部
　２０　情報処理装置
　２１　候補位置推定部
　２２　位置特定部
　２３　参照データ保持部
　２４　周辺画像データ取得部
　２５　位置情報取得部
　２６　設定部
　３０　移動体
　４０　撮影装置
　５０　情報処理装置
　６０　情報処理装置

REFERENCE SIGNS LIST
10 Information processing device
11 Estimation unit
12 Identification unit
20 Information processing device
21 Candidate position estimation unit
22 Position identification unit
23 Reference data storage unit
24 Peripheral image data acquisition unit
25 Position information acquisition unit
26 Setting unit
30 Mobile object
40 Photographing device
50 Information processing device
60 Information processing device

Claims

1. An information processing device comprising:
an estimation unit that estimates at least one candidate position for the position at which a first captured image, captured from a moving body, was captured, using a distance between a target object included in the first captured image and the position at which the first captured image was captured, and a position of the target object in reference data generated in advance as data indicating positions of objects in a predetermined space; and
an identification unit that identifies the position at which the first captured image was captured, using the first captured image and an estimated image of the predetermined space as captured at the at least one candidate position.

2. The information processing device according to claim 1, wherein the identification unit identifies the position at which the first captured image was captured based on a result of matching between the estimated image and the first captured image.

3. The information processing device according to claim 1, wherein the identification unit identifies the position at which the first captured image was captured based on a result of matching between the target object included in the estimated image and the target object included in the first captured image.

4. The information processing device according to claim 1, wherein, when a plurality of first positions are identified as the position at which the first captured image was captured, the identification unit identifies the position at which a second captured image was captured, using the second captured image and an estimated image of the predetermined space as captured at the candidate position estimated using the second captured image, the second captured image being captured at a timing different from the timing at which the first captured image was captured.

5. The information processing device according to claim 4, wherein the estimation unit selects, from among the at least one candidate position estimated using the second captured image, a candidate position whose difference in distance from the plurality of first positions is less than a predetermined value.

6. The information processing device according to any one of claims 1 to 5, wherein the first captured image includes point cloud data, and the distance between the target object and the position at which the first captured image was captured is determined using the point cloud data that represents the target object included in the first captured image.

7. The information processing device according to claim 1, wherein the estimation unit estimates the at least one candidate position further based on position information obtained using GNSS.

8. The information processing device according to claim 1, wherein the identification unit uses, as the estimated image, a captured image for reference data that was captured at the candidate position when the reference data was generated.

9. An information processing system comprising:
an image capturing device that moves together with a moving body and captures the surroundings of the moving body to generate a first captured image;
a first information processing device that generates, in advance, reference data as data indicating positions of objects in a predetermined space; and
a second information processing device that estimates at least one candidate position for the position at which the first captured image was captured, using a distance between a target object included in the first captured image and the position at which the first captured image was captured, and a position of the target object in the reference data, and that identifies the position at which the first captured image was captured, using the first captured image and an estimated image of the predetermined space as captured at the at least one candidate position.

10. The information processing system according to claim 9, wherein the second information processing device identifies the position at which the first captured image was captured based on a result of matching between the estimated image and the first captured image.

11. A position identification method comprising:
estimating at least one candidate position for the position at which a first captured image, captured from a moving body, was captured, using a distance between a target object included in the first captured image and the position at which the first captured image was captured, and a position of the target object in reference data generated in advance as data indicating positions of objects in a predetermined space; and
identifying the position at which the first captured image was captured, using the first captured image and an estimated image of the predetermined space as captured at the at least one candidate position.

12. The position identification method according to claim 11, wherein, when the position at which the captured image was captured is identified, the position at which the first captured image was captured is identified based on a result of matching between the estimated image and the first captured image.

13. The position identification method according to claim 11 or 12, wherein, when the position at which the captured image was captured is identified, the position at which the first captured image was captured is identified based on a result of matching between the target object included in the estimated image and the target object included in the first captured image.

14. The position identification method according to claim 11, wherein, when the position at which the captured image was captured is identified and a plurality of first positions are identified as the position at which the first captured image was captured, the position at which a second captured image was captured is identified using the second captured image and an estimated image of the predetermined space as captured at the candidate position estimated using the second captured image, the second captured image being captured at a timing different from the timing at which the first captured image was captured.

15. The position identification method according to claim 14, wherein, when the candidate position is estimated, a candidate position whose difference in distance from the plurality of first positions is less than a predetermined value is selected from among the at least one candidate position estimated using the second captured image.

16. The position identification method according to any one of claims 11 to 15, wherein the first captured image includes point cloud data, and the distance between the target object and the position at which the first captured image was captured is determined using the point cloud data that represents the target object included in the first captured image.

17. The position identification method according to claim 11, wherein, when the candidate position is estimated, the at least one candidate position is estimated further based on position information obtained using GNSS.

18. The position identification method according to claim 11, wherein, when the position at which the captured image was captured is identified, a captured image for reference data that was captured at the candidate position when the reference data was generated is used as the estimated image.

19. A position identification method comprising:
generating reference data as data indicating positions of objects in a predetermined space;
generating a first captured image by capturing the surroundings of a moving body while moving together with the moving body;
estimating at least one candidate position for the position at which the first captured image was captured, using a distance between a target object included in the first captured image and the position at which the first captured image was captured, and a position of the target object in the reference data; and
identifying the position at which the first captured image was captured, using the first captured image and an estimated image of the predetermined space as captured at the at least one candidate position.

20. A non-transitory computer-readable medium storing a program that causes a computer to execute:
estimating at least one candidate position for the position at which a first captured image, captured from a moving body, was captured, using a distance between a target object included in the first captured image and the position at which the first captured image was captured, and a position of the target object in reference data generated in advance as data indicating positions of objects in a predetermined space; and
identifying the position at which the first captured image was captured, using the first captured image and an estimated image of the predetermined space as captured at the at least one candidate position.
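
The candidate selection recited for the second captured image (selecting a candidate position whose difference in distance from the plurality of first positions is less than a predetermined value) can be sketched as follows. This is one possible reading, not the claimed implementation: the function name `select_consistent_candidates` and the parameter `max_distance` are hypothetical, positions are assumed to be 2D map coordinates, and a candidate is kept when it lies within the threshold of at least one first position.

```python
import math
from typing import List, Sequence, Tuple

Point2D = Tuple[float, float]


def select_consistent_candidates(
    second_candidates: Sequence[Point2D],
    first_positions: Sequence[Point2D],
    max_distance: float,
) -> List[Point2D]:
    """Keep candidates from the second image that are near a first position.

    Assumption: the moving body travels only a short way between the two
    captures, so a plausible second position lies within `max_distance`
    of at least one of the ambiguous first positions.
    """
    return [
        candidate
        for candidate in second_candidates
        if any(
            math.dist(candidate, first) < max_distance
            for first in first_positions
        )
    ]
```

For example, with first positions `(0, 0)` and `(10, 0)` and a threshold of `1.0`, a candidate at `(5, 0)` is discarded because it is far from both, while candidates at `(0.5, 0)` and `(10, 0.8)` are kept for the subsequent image matching.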