JP2017174403A

JP2017174403A - Information processing device, information processing method and program

Info

Publication number: JP2017174403A
Application number: JP2017011253A
Authority: JP
Inventors: 日出来空門; Hideki Sorakado
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2016-03-16
Filing date: 2017-01-25
Publication date: 2017-09-28

Abstract

PROBLEM TO BE SOLVED: To achieve both privacy protection and elimination of the timing gap. SOLUTION: An information processing device includes estimation means for estimating, for each user, the likelihood that the user will use a service; classification means for classifying a plurality of users into groups on the basis of that likelihood and the similarity of the users' context information corresponding to the service; and generation means for generating, for each group, disclosure information to be disclosed to the provider of the service on the basis of the context information of the users included in the group. SELECTED DRAWING: Figure 5

Description

The present invention relates to an information processing apparatus, an information processing method, and a program.

Conventionally, systems estimate a user's context and recommend information and services suited to the situation. Such systems and technologies go by various names, such as Personal Assistant and Context Aware Computing; here they are called Personal Context Assistant (hereinafter, PCA).
For example, a PCA may provide trending news, or news related to a particular prefecture, based on the user's commuting time and current location. It has also been proposed to recommend requesting a taxi when, for example, arrival at an airport is detected from location information.
However, even when a service is recommended and the user acts on the recommendation, some services involve a waiting time. With taxi dispatch, for example, a user who requests a taxi only after seeing the notification must wait at the airport until the taxi arrives.
The waiting time arises because there is a gap between the time at which the user needs the service and the time at which the service provider can deliver it.
To eliminate this timing gap, systems are expected in the future to prepare and execute services ahead of time. For example, Patent Document 1 proposes shortening the time from order to delivery by shipping a product to a location near the customer before it is ordered.
For a system to prepare a service in advance, the service provider must be able to make preparations. A platform that shares context with service providers is expected to emerge for this purpose.
If context is shared unconditionally, a service provider that intends to use it for other than legitimate purposes may accumulate it. The user's behavior history then remains with the service provider, and privacy is not protected. Conversely, if context is withheld until just before the user becomes likely to use the service, the timing gap grows.

Anonymization techniques have conventionally been used to protect privacy (for example, Patent Document 2). Patent Document 2 proposes an anonymization method that, when each data item has its own required level of k-anonymity, satisfies the required levels of all data items while limiting the loss of information value. More specifically, to limit that loss, the technique of Patent Document 2 divides similar data into groups and keeps dividing each group again as long as the required level remains satisfied. Anonymization is then performed on the smallest groups that satisfy the required level. In this way, the data is anonymized so that the level required by each data item is met while the loss of information value is limited.

Patent Document 1: Japanese Translation of PCT International Application Publication No. 2008-524714
Patent Document 2: International Publication No. 13/031997

The more precisely a user's context is provided, the smaller the timing gap can be made. Therefore, when the probability that a user will use a service is high, that user's context should be provided with higher precision.
To retain the context of users with a high usage probability at higher precision, the service usage probability could be used to set the required anonymization level of Patent Document 2. That is, the higher the probability of using the service, the lower the required level of anonymization would be set.
However, Patent Document 2 anonymizes users with a low required level of anonymization together with users with a high required level. As a result, a user who wants a low anonymization level may in some cases be held to the level of a user who requires a high one. The goal of that user, who accepted weaker anonymization so that preparation of the service could proceed further, is then not achieved.
An object of the present invention is to achieve both privacy protection and elimination of the timing gap.

The present invention includes: estimation means for estimating, for each user, the likelihood that the user will use a service; classification means for classifying a plurality of users into groups based on that likelihood and on the similarity of the users' context information corresponding to the service; and generation means for generating, for each group, disclosure information to be disclosed to the provider of the service based on the context information of the users included in the group.

According to the present invention, both privacy protection and elimination of the timing gap can be achieved.

FIG. 1 is a diagram showing an example of the system configuration of the information processing system.
FIG. 2 is a diagram showing an example of the hardware configuration of a computer.
FIG. 3 is a diagram showing an example of the functional configuration of the PCA server.
FIG. 4 is a diagram showing an example of parameters and the like.
FIG. 5 is a flowchart showing an example of disclosure information generation processing.
FIG. 6 is a flowchart showing an example of anonymization group creation processing.
FIG. 7 is a flowchart showing an example of limit error group dissolution processing.
FIG. 8 is a diagram for explaining the processing of FIG. 7.
FIG. 9 is a diagram (part 1) showing an example of context information.
FIG. 10 is a diagram (part 2) showing an example of context information.
FIG. 11 is a diagram (part 3) showing an example of context information.
FIG. 12 is a diagram showing an example of parameters and the like.
FIG. 13 is a diagram (part 4) showing an example of context information.
FIG. 14 is a diagram (part 5) showing an example of context information.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

<Embodiment 1>
The system configuration of the information processing system of this embodiment will be described with reference to FIG. 1.
The PCA client 101 is a client that acquires context information and transmits it to the PCA server 102. The PCA client 101 is, for example, a mobile device such as a smartphone owned by the user. Examples of context information include position information such as latitude and longitude obtained by GPS (Global Positioning System), angular velocity obtained by a gyro sensor, heading obtained by a magnetic sensor, and acceleration obtained by an acceleration sensor. Other examples include vital signs such as the user's heart rate, information such as a degree of concentration estimated from body sway or the like, information on device usage, and information on nearby devices and objects acquired through proximity communication or the like.
The PCA server 102 anonymizes the context information obtained from a plurality of PCA clients 101 and provides it to a plurality of services 103. The PCA server 102 is, for example, a server device. Alternatively, the PCA server 102 may be a virtual server built in a cloud environment.
The PCA server 102 may also process the context information obtained from the PCA clients 101 to generate higher-order context information. The PCA server 102 may then anonymize that higher-order context information and provide it to the services 103.
The generation of higher-order context information is described in the explanation of the context processing unit 302 in FIG. 3.
The service 103 prepares a service tailored to users based on the anonymized context information. A taxi service, for example, receives users' location information in anonymized form and can change the cruising routes of its taxis based on that information.

The configuration of the computers that constitute the server device and the client device of this embodiment will be described with reference to FIG. 2. The server device and the client device may each be realized by a single computer, or their functions may be distributed across a plurality of computers as necessary. When a device is composed of a plurality of computers, the computers are connected by a Local Area Network (LAN) or the like so that they can communicate with one another.
The CPU 201 is a Central Processing Unit (CPU) that controls the entire computer 200. The ROM 202 is a Read Only Memory (ROM) that stores programs and parameters that do not need to be changed. The RAM 203 is a Random Access Memory (RAM) that temporarily stores programs and data supplied from an external device or the like. The external storage device 204 is a hard disk or memory card fixedly installed in the computer 200. Alternatively, the external storage device 204 may include a flexible disk (FD), an optical disk such as a Compact Disk (CD), a magnetic or optical card, an IC card, a memory card, or the like that is removable from the computer 200. The input device interface 205 is an interface with an input device 209, such as a pointing device or keyboard, that receives user operations and inputs data. The output device interface 206 is an interface with a monitor 210 for displaying data held by the computer 200 and data supplied to it. The communication interface 207 is a network interface for connecting to a network line 211 such as the Internet. The system bus 208 connects the units 201 to 207 so that they can communicate with one another.
The PCA client 101, which is the client device of this embodiment, includes sensors and the like. Examples of the sensors are a GPS sensor, a gyro sensor, a magnetic sensor, and an acceleration sensor. A vital sensor that acquires heart rate or the like may take the form of a separate device, in which case the PCA client 101 may be configured to acquire its data through the communication interface or the like. If the PCA client 101 takes the form of a smartphone or the like, it also includes a telephone call unit, an imaging unit, and the like.
The CPU 201 of the server device executes processing based on a program stored in the ROM 202 or the external storage device 204 of the server device, thereby realizing the functions of the server device shown in FIG. 3 and the processing of the flowcharts of FIGS. 5 to 7, both described later.

In this embodiment, the PCA server 102 anonymizes the context information collected from the PCA clients 101 and discloses it to the service 103. The functional configuration of the system that generates the information to be disclosed is described below with reference to FIG. 3. In this embodiment the functions shown in FIG. 3 are assumed to be implemented in the PCA server 102, but some of them may be distributed to the PCA client 101 or elsewhere.
The context receiving unit 301 receives context information from the PCA client 101. More specifically, the context receiving unit 301 receives context information such as position information via the communication interface 207.
The context processing unit 302 processes context information to generate higher-order context information.
For example, when latitude and longitude are obtained from the PCA client 101, the context processing unit 302 may convert them into information such as an address.
The context processing unit 302 may also generate information about the user's relationships with surrounding objects and people. For example, it may infer from the trajectories of position information obtained from a plurality of users that users tracing the same trajectory are acting together, and generate companion information. Alternatively, it may identify an object detected through proximity communication and determine that the user is carrying that object; for example, it may determine that the user is carrying a smartwatch.
Alternatively, the context processing unit 302 may generate information about the user's physical state. For example, it may estimate a movement state, such as whether the user is walking or riding in a car, from position information, an acceleration sensor, and the like, and generate information about the physical state. It may also estimate a degree of fatigue from, for example, how long the walking state has continued.
Alternatively, the context processing unit 302 may generate information about the user's psychological state. For example, it may infer that a user who is in an environment that calls for concentration, such as a workplace, but whose degree of concentration is low wants to concentrate, and generate information about the psychological state. The psychological state may also be a preference. For example, the context processing unit 302 generates information on products the user wants from the past purchase history or the like. It may also generate the products the user wants based on a previously entered "wish list" or the like.
In this embodiment, context information covers, in addition to information describing the situation (when, where, what, and with whom), information describing the relationship between the user and people or objects and information describing the user's mental and physical state.
The functions of the context processing unit 302 may be distributed between the PCA client 101 and the PCA server 102.
The user management unit 303 manages user information and users' context information. More specifically, the user management unit 303 stores the context information acquired by the context receiving unit 301 and the context processing unit 302 in the external storage device 204 and manages it there.
The service management unit 304 manages the services to which user information is disclosed. More specifically, the service management unit 304 stores a list of service information in the external storage device 204 and manages it there.
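As one illustration of how higher-order context might be derived, the sketch below estimates a movement state from timestamped positions. It is only a minimal example under stated assumptions: the function names, the planar metre coordinates, and the speed thresholds are all hypothetical and do not come from this publication, which leaves the estimation method open.

```python
import math

# Hypothetical speed thresholds in metres per second (not from the source).
STATIONARY_MAX = 0.2
WALKING_MAX = 2.5
VEHICLE_MIN = 7.0


def average_speed(samples):
    """Average speed over samples of (t_seconds, x_metres, y_metres), ordered by time."""
    distance = sum(math.dist(a[1:], b[1:]) for a, b in zip(samples, samples[1:]))
    elapsed = samples[-1][0] - samples[0][0]
    return distance / elapsed if elapsed > 0 else 0.0


def movement_state(samples):
    """Map the average speed to a coarse movement state label."""
    speed = average_speed(samples)
    if speed < STATIONARY_MAX:
        return "stationary"
    if speed <= WALKING_MAX:
        return "walking"
    if speed >= VEHICLE_MIN:
        return "vehicle"
    return "unknown"  # between walking and vehicle speeds; left undecided here
```

A real implementation would likely combine accelerometer data with position, as the description suggests; the speed-only rule is used here just to keep the example short.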

The disclosure information generation unit 305 generates the context information to be disclosed to the service provider. Specifically, it first estimates each user's probability of using the service and divides the users into layers by usage probability. The usage probability is one example of the likelihood of use. Next, within each layer, it divides the users into groups according to the information to be anonymized. It then generates the context information to be disclosed by anonymizing each group. The processing is described in detail later with reference to the flowchart of FIG. 5.
The disclosure information generation unit 305 consists of a context acquisition unit 306, a usage probability calculation unit 307, a usage probability classification unit 308, an anonymization classification unit 309, and an anonymized information generation unit 310.
The context acquisition unit 306 acquires users' context information before anonymization. More specifically, the context acquisition unit 306 retrieves users' context information from the user management unit 303.
The usage probability calculation unit 307 calculates the probability of using a service for each user. More specifically, it estimates the usage probability for each service managed by the service management unit 304. For a taxi service, for example, it estimates the probability that the user will request a taxi. To do so, it learns from the user's everyday behavior the situations in which the user takes a taxi, and calculates the probability of reaching such a situation from the current one. Suppose, for example, that the user often calls a taxi when it is raining after shopping at a shopping mall. The usage probability calculation unit 307 can then determine from position information or the like that the user is at a shopping mall and calculate the probability of calling a taxi from the probability of precipitation or the like. The method of calculating the usage probability is not limited to these examples.
The usage probability classification unit 308 divides the users so that users whose usage probabilities, as obtained by the usage probability calculation unit 307, are comparable fall into the same class. In this embodiment, each such class is called a layer. For example, the usage probability classification unit 308 uses intervals of usage probability, predetermined for each service, within which probabilities are treated as comparable. It may instead use intervals obtained from the service 103.
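The division into layers can be sketched as an interval lookup. The following is a minimal sketch, assuming three hypothetical cut points that yield four layers; the boundary values and function names are illustrative and are not taken from this publication.

```python
from bisect import bisect_right

# Hypothetical cut points between layers; three cut points give four layers,
# with layer 0 holding the lowest usage probabilities.
LAYER_BOUNDARIES = [0.25, 0.5, 0.75]


def layer_of(probability, boundaries=LAYER_BOUNDARIES):
    """Return the index of the usage-probability interval containing `probability`."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return bisect_right(boundaries, probability)


def split_into_layers(usage_probabilities, boundaries=LAYER_BOUNDARIES):
    """Group user IDs by layer; `usage_probabilities` maps user ID -> probability."""
    layers = {i: [] for i in range(len(boundaries) + 1)}
    for user_id, p in usage_probabilities.items():
        layers[layer_of(p, boundaries)].append(user_id)
    return layers
```

Because the intervals are plain data, a service could supply its own boundaries, which matches the option of obtaining the intervals from the service 103.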

For each layer obtained by the usage probability classification unit 308, the anonymization classification unit 309 classifies users into groups based on the similarity of the context information to be anonymized. The group size is a value predetermined for each layer. Alternatively, the anonymization classification unit 309 determines the group size based on the usage probability. How the group size is determined is described in detail after the flowcharts of FIGS. 5 to 7 have been explained.
As for the similarity of context information, when the context information is a numerical value such as "age", the reciprocal of the difference between the values, for example, can serve as the similarity. When the context information is a vector such as "position information", the reciprocal of the distance between the vectors can serve as the similarity, or cosine similarity may be used. When the context information is categorical data such as "gender", the similarity can be defined by whether the values match.
The data may also be categorical data with a hierarchy. In that case, the number of hierarchy levels that match may be used as the similarity. Alternatively, the number of moves needed to travel through the hierarchy tree from one data item to the other may be taken as the distance between the items, and the similarity defined as, for example, the reciprocal of that distance.
When the context information is composed of several kinds of data, such as vectors and categorical data, the vector data may be converted into categorical data and the similarity determined from, for example, the number of categorical values that match. Alternatively, a similarity may be obtained for each piece of data making up the context information and a weighted sum of those similarities taken. The definition of the similarity of context information in this embodiment is not limited to these examples.
Users may be classified by the similarity of their context information using a clustering method such as k-means or spectral clustering. Alternatively, a method based on a generalization hierarchy tree, which is often used in k-anonymization, may be used. The method of classifying users in this embodiment is not limited to these examples.
The anonymized information generation unit 310 generates the context information to be disclosed for each group obtained by the anonymization classification unit 309. For example, if the context information to be disclosed is position information, the anonymized information generation unit 310 obtains the centroid of the positions of the users in the group. It may additionally obtain the maximum distance from the centroid to the users in the group as a radius and include it in the disclosed information. If the context information to be disclosed is an attribute value such as an address, the anonymized information generation unit 310 generalizes it, for example by omitting the street number. The generation of anonymized information is not limited to these examples.
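The similarity definitions mentioned above can be written down compactly. The sketch below is illustrative only: the function names, the small `eps` term that avoids division by zero, and the representation of hierarchical values as tuples of levels are assumptions made for the example, not details given in this publication.

```python
import math


def numeric_similarity(a, b, eps=1e-9):
    """Reciprocal of the absolute difference; eps (an assumption) avoids division by zero."""
    return 1.0 / (abs(a - b) + eps)


def vector_similarity(u, v, eps=1e-9):
    """Reciprocal of the Euclidean distance between two equal-length vectors."""
    return 1.0 / (math.dist(u, v) + eps)


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is zero."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    return dot / norm if norm else 0.0


def categorical_similarity(a, b):
    """1.0 when the categories match, otherwise 0.0."""
    return 1.0 if a == b else 0.0


def hierarchy_similarity(path_a, path_b):
    """Number of leading hierarchy levels on which two values agree."""
    depth = 0
    for x, y in zip(path_a, path_b):
        if x != y:
            break
        depth += 1
    return depth


def weighted_similarity(similarities, weights):
    """Weighted sum of per-attribute similarities."""
    return sum(w * s for s, w in zip(similarities, weights))
```

For example, two addresses expressed as `("Tokyo", "Ota", "Shimomaruko")` and `("Tokyo", "Ota", "Kamata")` agree on two levels, so their hierarchy similarity is 2.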

The context disclosure unit 311 discloses the context information generated by the anonymized information generation unit 310 to the service. For example, when position information is anonymized and disclosed, the anonymized information generation unit 310 has obtained a centroid and a radius, so the context disclosure unit 311 discloses these in association with user IDs. Alternatively, the context disclosure unit 311 may disclose information group by group: it assigns an ID to each group generated by the anonymization classification unit 309 and discloses the centroid, the radius, and the list of user IDs linked to that group ID. The method of disclosing context information is not limited to these examples.
The associated user ID is preferably one that is scoped to the service, rather than the user ID used within the PCA server. This is because, in this embodiment, context information is generated so that the disclosed information cannot be used to narrow the candidates down to fewer than a certain number of people. If an ID that uniquely identifies a user across the entire PCA were used, the user would become identifiable. With a service-scoped ID, by contrast, the service's user ID cannot be used to link to context information that has not been disclosed to that service. Data could then be joined only by matching context information, and because the context information is anonymized, the candidates cannot be narrowed down beyond a certain number. Consequently, no more information is disclosed to the service than the user expects.
The user ID used within the PCA server may nevertheless be used. In that case, even context information not disclosed to the service can easily be joined by means of the user ID. Even so, because the disclosed context information loses precision in the course of anonymization, a certain degree of privacy is still protected.
The disclosed information need not include user IDs; the context information alone may be disclosed. The composition of the disclosed information is not limited to these examples.
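A group-level disclosure record of the kind described above could look like the following sketch. Several things here are assumptions for illustration: positions are treated as planar (x, y) coordinates rather than latitude and longitude, the record field names are invented, and the service-scoped IDs are derived with a keyed hash (HMAC-SHA256 with a per-service secret), which is one possible scheme and not one the publication specifies.

```python
import hashlib
import hmac
import math


def service_scoped_id(pca_user_id, service_secret):
    """Derive a pseudonym meaningful only within one service (assumed scheme)."""
    digest = hmac.new(service_secret, pca_user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def group_disclosure(group_id, positions, service_secret):
    """Build a per-group record: centroid, radius, and service-scoped member IDs.

    `positions` maps PCA user ID -> (x, y) in a planar coordinate system.
    """
    points = list(positions.values())
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    # Radius is the largest distance from the centroid to any member.
    radius = max(math.dist((cx, cy), p) for p in points)
    return {
        "group_id": group_id,
        "centroid": (cx, cy),
        "radius": radius,
        "user_ids": sorted(service_scoped_id(u, service_secret) for u in positions),
    }
```

Using a different secret per service gives each service a different pseudonym for the same user, so two services cannot join their records on the ID alone.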

次に、本実施形態における開示情報生成処理について、フローチャート図５を用いて説明する。本処理は、開示情報生成部３０５によって実行され、サービスへ開示するコンテキスト情報を生成する。そのため、本処理では、サービスをパラメータとして指定して、処理を実行する。加えて、サービスごとに設定されたパラメータも与えて実行する。パラメータについて図４（ａ）を用いて説明する。第一のパラメータは、ユーザをレイヤに分ける際に利用する利用確率の区間である。この例では、４つのレイヤに分けるための区間が定められている。第二のパラメータは、レイヤごとのコンテキスト情報の開示可否を定義している。「否」は情報を開示しないことを示す。「可（保護）」は匿名化したうえで情報を開示することを示す。「可（素値）」は匿名化を行わずに、情報を開示してよいことを示す。第三のパラメータは、匿名化グループサイズであり、レイヤ内で生成する匿名化グループのサイズを指定している。第四のパラメータの限界誤差は、レイヤ内で生成する匿名化情報の限界とすべき誤差を指定している。詳細は、図６のフローチャートを用いて後述する。限界誤差は設定誤差の一例である。符号は本実施形態で説明に使用するシンボルであり、パラメータではない。
以下の説明では、位置情報を匿名化する場合を例に説明を行う。以下、図５のフローチャートに従って説明する。
Ｓ５０１では、利用確率算出部３０７は、ユーザごとにサービスの利用確率を推定する。より具体的には、利用確率算出部３０７は、利用確率算出部３０７を用いて、サービスを利用する確率をユーザごとに推定する。
Ｓ５０２では、利用確率分類部３０８は、ユーザを利用確率でレイヤに分ける。より具体的には、利用確率分類部３０８は、図４（ａ）に示す利用確率の区間を用いて、ユーザがどの利用確率の区間に該当するかを決定する。
Ｓ５０３では、利用確率分類部３０８は、匿名化対象のレイヤを特定する。より具体的には、利用確率分類部３０８は、図４（ａ）に示す開示可否を参照して、「可（保護）」となっているレイヤを特定する。
Ｓ５０４では、匿名化分類部３０９は、Ｓ５０３で特定されたレイヤを対象として、匿名化グループ作成処理を行う。詳細は、図６のフローチャートを用いて後述する。
Ｓ５０５では、匿名化情報生成部３１０は、匿名化グループごとに開示情報を生成する。例えば、匿名化情報生成部３１０は、開示するコンテキスト情報が位置情報であれば、グループを構成するユーザの位置情報の重心と半径とを開示情報とする。
Ｓ５０３で匿名化対象とされなかったレイヤのユーザ情報に関しては、匿名化情報生成部３１０は、開示可否のパラメータに基づいて開示方法を決定する。より具体的には、匿名化情報生成部３１０は、「否」のレイヤのユーザに関しては、情報を開示しない。一方で、「可（素値）」のユーザに関しては、そのまま情報を開示することを行う。
即ち、匿名化情報生成部３１０は、後述するグループの利用確率に基づき開示情報の匿名化の強弱を変える。より具体的に説明すると、匿名化情報生成部３１０は、グループの利用確率が高いほど開示情報の匿名化を弱くし（例えば、開示情報を素値のままとし）、グループの利用確率が低いほど開示情報の匿名化を高くする（例えば、開示情報を開示しない）。 Next, the disclosure information generation processing in this embodiment will be described with reference to the flowchart of FIG. 5. This processing is executed by the disclosure information generation unit 305 and generates the context information to be disclosed to a service. The processing is therefore executed with the service specified as a parameter, together with the parameters set for that service. The parameters will be described with reference to FIG. 4A. The first parameter is the set of use probability intervals used to divide users into layers. In this example, intervals that divide users into four layers are defined. The second parameter defines, for each layer, whether context information may be disclosed. “Not permitted” indicates that information is not disclosed. “Permitted (protected)” indicates that information is disclosed after anonymization. “Permitted (raw value)” indicates that information may be disclosed without anonymization. The third parameter is the anonymization group size, which specifies the size of the anonymization groups generated in the layer. The fourth parameter, the limit error, specifies the upper bound on the error of the anonymized information generated in the layer. Details will be described later with reference to the flowchart of FIG. 6. The limit error is an example of a setting error. The symbols are used for explanation in the present embodiment and are not parameters.
In the following description, a case where position information is anonymized will be described as an example, following the flowchart of FIG. 5.
In S501, the use probability calculation unit 307 estimates, for each user, the probability that the user will use the service.
In S502, the use probability classification unit 308 divides users into layers based on their use probabilities. More specifically, the use probability classification unit 308 uses the use probability intervals shown in FIG. 4A to determine which interval each user falls into.
In S503, the use probability classification unit 308 identifies the layers to be anonymized. More specifically, the use probability classification unit 308 refers to the disclosure permission shown in FIG. 4A and identifies the layers marked “permitted (protected)”.
In S504, the anonymization classification unit 309 performs the anonymization group creation processing for the layers identified in S503. Details will be described later with reference to the flowchart of FIG. 6.
In S505, the anonymization information generation unit 310 generates disclosure information for each anonymization group. For example, if the context information to be disclosed is position information, the anonymization information generation unit 310 uses the center of gravity and the radius of the position information of the users constituting the group as the disclosure information.
For the user information of layers that were not selected for anonymization in S503, the anonymization information generation unit 310 determines the disclosure method based on the disclosure permission parameter. More specifically, it does not disclose information about users in a “not permitted” layer, and it discloses the information as is for users in a “permitted (raw value)” layer.
That is, the anonymization information generation unit 310 changes the strength of anonymization of the disclosure information based on the group use probability described later. More specifically, the higher the group use probability, the weaker the anonymization (for example, the disclosure information is left as a raw value), and the lower the group use probability, the stronger the anonymization (for example, the information is not disclosed).
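The flow of S501 to S505 can be sketched as follows. This is a minimal illustration under assumptions, not the embodiment itself: the layer boundaries `BOUNDS` and the `POLICY` table are hypothetical stand-ins for FIG. 4A (only the boundaries 0.4 and 0.7 appear in the text), positions are assumed to be planar coordinates in meters, the radius is assumed to be the largest distance from the center of gravity to a member, and `make_groups` is a placeholder for the group creation step of S504.

```python
from bisect import bisect_right
from math import hypot

# Hypothetical per-service parameters in the spirit of FIG. 4A.
BOUNDS = [0.4, 0.7, 0.9]                          # layer boundaries (0.9 is assumed)
POLICY = ["deny", "protected", "protected", "raw"]  # disclosure permission per layer


def layer_of(p):
    """S502: index of the use probability interval that p falls into."""
    return bisect_right(BOUNDS, p)


def centroid_and_radius(points):
    """S505: center of gravity and the largest distance from it to a member."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return (cx, cy), max(hypot(x - cx, y - cy) for x, y in points)


def generate_disclosure(users, make_groups):
    """users: {user_id: (use_probability, (x_m, y_m))}; the probabilities are the S501 output."""
    layers = {}
    for uid, (p, _) in users.items():
        layers.setdefault(layer_of(p), []).append(uid)
    out = []
    for layer, uids in sorted(layers.items()):
        policy = POLICY[layer]                    # S503: look up the disclosure permission
        if policy == "deny":
            continue                              # nothing is disclosed for this layer
        if policy == "raw":
            out += [{"layer": layer, "position": users[u][1]} for u in uids]
            continue
        for group in make_groups(layer, uids):    # S504: anonymization groups
            center, radius = centroid_and_radius([users[u][1] for u in group])
            out.append({"layer": layer, "centroid": center,
                        "radius_m": radius, "size": len(group)})
    return out
```

For a quick check, `make_groups` can be `lambda layer, uids: [uids]`, which treats each protected layer as a single group.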

次に、本実施形態における匿名化グループ作成処理について、図６のフローチャートを用いて説明する。本処理では、匿名化対象とされたレイヤごとに、ユーザ集合の分割を繰り返すことで、匿名化グループを作成することを行う。以下、ステップごとに説明する。
Ｓ６０１では、匿名化分類部３０９は、図５のＳ５０３で特定された匿名化対象のレイヤを順に処理するためのループであり、匿名化対象のレイヤには１から順に番号が割り当てられているものとする。匿名化分類部３０９は、レイヤを変数ｉを用いて参照するため、はじめにｉを１に初期化する。更に、匿名化分類部３０９は、ｉがレイヤ数以下であるときＳ６０２へ移り、これを満たさないときループを抜けてＳ６０９へ移る。
Ｓ６０２では、匿名化分類部３０９は、レイヤｉのユーザを１グループとして分割候補グループリストに登録する。
Ｓ６０３では、匿名化分類部３０９は、分割候補グループリストからグループを取り出してコンテキスト情報の類似性に基づき分割する。匿名化分類部３０９は、取り出したグループは分割候補グループリストから削除する。匿名化分類部３０９は、分割方法としては、ｋ−ｍｅａｎｓ等の分割クラスタリング手法を用いて２つにグループを分割する。又は、匿名化分類部３０９は、スペクトラルクラスタリング等の手法を用いて分割してもよい。グループの分割方法はこれらに限定されるものではない。
Ｓ６０４では、匿名化分類部３０９は、Ｓ６０３で分割された結果を評価して、分割を行ってもよいか否かを判定する。より具体的には、匿名化分類部３０９は、パラメータとして与えられた匿名化グループサイズの条件を、分割後のグループのサイズが満たしていることを確認する。例えば、図４（ａ）に示す例では、利用確率の区間が０．４以上０．７未満では、匿名化グループのサイズは「６以上」であることが条件として設定されている。匿名化分類部３０９は、各々の分割後グループで、分割後のグループのサイズが条件を満たしているとき、分割可能（ＯＫ）と判断して、Ｓ６０５へ移る。匿名化分類部３０９は、それ以外の場合は、Ｓ６０６へ移る。
Ｓ６０５では、匿名化分類部３０９は、分割候補グループリストに分割したグループを登録する。
Ｓ６０６では、匿名化分類部３０９は、Ｓ６０３の分割をキャンセルして、分割前のグループを完成グループリストに登録する。
Ｓ６０７では、匿名化分類部３０９は、分割候補グループリストにグループがあるか否かを判定する。匿名化分類部３０９は、グループがあるときＳ６０３へ移り、それ以外は、Ｓ６０８へ移る。
Ｓ６０８は、レイヤのループの終端であり、匿名化分類部３０９は、ｉに１を加算してＳ６０１へ戻る。
Ｓ６０９では、匿名化分類部３０９は、限界誤差グループ解体処理を行う。詳細は、図７のフローチャートを用いて説明する。 Next, the anonymization group creation processing in this embodiment will be described with reference to the flowchart of FIG. 6. In this processing, anonymization groups are created by repeatedly dividing the user set of each layer to be anonymized. Each step is described below.
S601 is the start of a loop in which the anonymization classification unit 309 processes, in order, the layers to be anonymized that were identified in S503 of FIG. 5. The layers to be anonymized are assumed to be numbered sequentially from 1. Because the anonymization classification unit 309 refers to a layer using the variable i, it first initializes i to 1. The anonymization classification unit 309 then proceeds to S602 when i is less than or equal to the number of layers; otherwise it exits the loop and proceeds to S609.
In S602, the anonymization classification unit 309 registers the user of layer i as one group in the division candidate group list.
In S603, the anonymization classification unit 309 takes a group out of the division candidate group list, deletes it from the list, and divides it based on the similarity of the context information. As the division method, the anonymization classification unit 309 divides the group into two using a partitioning clustering method such as k-means. Alternatively, the anonymization classification unit 309 may divide the group using a method such as spectral clustering. The group division method is not limited to these.
In S604, the anonymization classification unit 309 evaluates the result of the division in S603 and determines whether the division may be accepted. More specifically, it checks that the size of each group after the division satisfies the anonymization group size condition given as a parameter. For example, in the example shown in FIG. 4A, the condition for the use probability interval of 0.4 or more and less than 0.7 is that the anonymization group size be “6 or more”. When every group after the division satisfies the size condition, the anonymization classification unit 309 determines that the division is acceptable (OK) and proceeds to S605. Otherwise, it proceeds to S606.
In S605, the anonymization classification unit 309 registers the divided group in the division candidate group list.
In S606, the anonymization classification unit 309 cancels the division in S603 and registers the group before the division in the completed group list.
In S607, the anonymization classification unit 309 determines whether there is a group in the division candidate group list. The anonymization classification unit 309 moves to S603 when there is a group, and moves to S608 otherwise.
S608 is the end of the loop of the layer, and the anonymization classification unit 309 adds 1 to i and returns to S601.
In S609, the anonymization classification unit 309 performs the limit error group disassembly processing. Details will be described with reference to the flowchart of FIG. 7.
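The split loop of S602 to S607 for a single layer can be sketched as below. This is a simplified illustration, not the embodiment's implementation: `two_means` is a small deterministic stand-in for k-means with two clusters, `make_groups` and its arguments are hypothetical names, and positions are assumed to be planar coordinates.

```python
from math import dist


def _mean(uids, pos):
    return (sum(pos[u][0] for u in uids) / len(uids),
            sum(pos[u][1] for u in uids) / len(uids))


def two_means(uids, pos, iters=20):
    """Split uids into two clusters by position (deterministic 2-means stand-in)."""
    first = uids[0]
    far = max(uids, key=lambda u: dist(pos[u], pos[first]))
    ca, cb = pos[first], pos[far]          # initial centers: first user and the farthest one
    left, right = list(uids), []
    for _ in range(iters):
        left = [u for u in uids if dist(pos[u], ca) <= dist(pos[u], cb)]
        right = [u for u in uids if dist(pos[u], ca) > dist(pos[u], cb)]
        if not left or not right:
            break                          # degenerate split, e.g. identical positions
        new_ca, new_cb = _mean(left, pos), _mean(right, pos)
        if (new_ca, new_cb) == (ca, cb):
            break                          # assignments have converged
        ca, cb = new_ca, new_cb
    return left, right


def make_groups(uids, pos, min_size):
    """Repeatedly bisect one layer's users while both halves stay large enough."""
    min_size = max(1, min_size)
    candidates = [list(uids)]              # S602: the whole layer is one candidate
    done = []                              # completed group list
    while candidates:                      # S607: any candidates left?
        group = candidates.pop()           # S603: take a group and split it
        left, right = two_means(group, pos)
        if len(left) >= min_size and len(right) >= min_size:   # S604
            candidates += [left, right]    # S605: both halves may be split further
        else:
            done.append(group)             # S606: cancel the split, keep the group
    return done
```

Because a split is only accepted when both halves meet the size condition, every group in the returned list satisfies the anonymization group size by construction.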

次に、本実施形態における限界誤差グループ解体処理について、図７のフローチャートを用いて説明する。本処理の開始時点では、匿名化グループ作成処理（図６）のＳ６０９の直前において、匿名化グループが完成グループリストに登録された状態にある。本処理では、このリストのグループの誤差を評価して、誤差の大きいグループを解体して、他のグループに統合することで、誤差を小さくすることを試みる。誤差の評価にはパラメータの限界誤差を用いる。図４（ａ）に示すように、限界誤差はレイヤごとに設定された値である。
Ｓ７０１は、レイヤを下位層から順に処理するためのループである。ここで、レイヤは利用確率の区間が小さい方を下位層と呼んでいる。例えば、図４（ａ）では２行目の方が３行目よりも利用確率の区間が小さいことから、２行目の方が下位層に位置付けられる。レイヤには下位層から順に１から順の番号が割り当てられているものとする。匿名化分類部３０９は、レイヤを変数ｉを用いて参照するため、はじめにｉを１に初期化する。更に、匿名化分類部３０９は、ｉがレイヤ数以下であるときＳ７０２へ移り、これを満たさないときループを抜けて、本処理を終了する。
Ｓ７０２では、匿名化分類部３０９は、レイヤｉにある限界誤差を超えるグループを解体グループとして特定する。図４（ａ）の例で説明すれば、例えばレイヤｉが利用確率の区間が０．４以上０．７未満のレイヤであったとき、限界誤差は１０００ｍと定義されている。そのため、ここでは、匿名化分類部３０９は、グループの半径を誤差として利用して、この半径が１０００ｍを超えるグループを解体候補として利用する。 Next, the limit error group disassembly processing in the present embodiment will be described with reference to the flowchart of FIG. 7. This processing starts from the state immediately before S609 of the anonymization group creation processing (FIG. 6), in which the anonymization groups have been registered in the completed group list. The processing evaluates the error of each group in this list, disassembles groups with a large error, and merges their users into other groups in an attempt to reduce the error. The limit error parameter is used to evaluate the error. As shown in FIG. 4A, the limit error is a value set for each layer.
S701 is the start of a loop that processes the layers in order from the lowest layer. Here, a layer with a lower use probability interval is called a lower layer. For example, in FIG. 4A, the second row has a lower use probability interval than the third row, so the second row is positioned as the lower layer. The layers are assumed to be numbered sequentially from 1, starting from the lowest layer. Because the anonymization classification unit 309 refers to a layer using the variable i, it first initializes i to 1. The anonymization classification unit 309 then proceeds to S702 when i is less than or equal to the number of layers; otherwise it exits the loop and ends the processing.
In S702, the anonymization classification unit 309 specifies a group that exceeds the limit error in layer i as a dismantling group. In the example of FIG. 4A, for example, when the layer i is a layer having a usage probability interval of 0.4 or more and less than 0.7, the limit error is defined as 1000 m. Therefore, here, the anonymization classification unit 309 uses the radius of the group as an error, and uses a group with this radius exceeding 1000 m as a disassembly candidate.

Ｓ７０３は、Ｓ７０２で特定された解体グループを順に処理するためのループである。解体グループには１から順に番号が割り当てられているものとする。匿名化分類部３０９は、解体グループを変数ｊを用いて参照するため、はじめにｊを１に初期化する。更に、解体グループｊがグループ数以下であるとき、Ｓ７０４へ移り、これを満たさないときループを抜けて、Ｓ７１４へ移る。
Ｓ７０４では、匿名化分類部３０９は、解体グループからユーザを除外したときの誤差を、ユーザごとに求める。そして、匿名化分類部３０９は、誤差が最も小さくなるユーザをユーザｍとする。このとき、匿名化分類部３０９は、ユーザｍは処理済みとしてフラグを立てておく。加えて、匿名化分類部３０９は、ユーザｍを選ぶときは、未処理のユーザから選ぶことにする。
Ｓ７０５では、匿名化分類部３０９は、ユーザｍを除外すると解体グループの誤差が小さくなるか否かを判定する。加えて、匿名化分類部３０９は、ユーザｍを除いたときに、匿名化グループサイズの条件を満たすことを判定する。匿名化分類部３０９は、誤差が小さくなりつつ、条件も満たすとき（ＹＥＳ）、Ｓ７０６へ移り、それ以外は（ＮＯ）、Ｓ７１２へ移る。
Ｓ７０６では、匿名化分類部３０９は、限界誤差を下回る「レイヤｉのグループ」と「レイヤｉより下位レイヤのグループ」とを吸収先グループとして特定する。レイヤｉの上位レイヤを含まない理由は、レイヤｉのユーザが求める匿名化グループサイズは、上位レイヤより大きいため、上位レイヤ側に統合すると、レイヤｉのユーザの条件を満たすことができなくなるためである。一方で、下位レイヤはレイヤｉのユーザの匿名化グループのサイズよりは大きいことが確実であるため、下位レイヤ側のグループには統合することができるためである。 S703 is the start of a loop that processes, in order, the dismantling groups identified in S702. The dismantling groups are assumed to be numbered sequentially from 1. Because the anonymization classification unit 309 refers to a dismantling group using the variable j, it first initializes j to 1. When j is less than or equal to the number of dismantling groups, the processing proceeds to S704; otherwise it exits the loop and proceeds to S714.
In S704, the anonymization classification unit 309 calculates, for each user, the error of the dismantling group when that user is excluded from it. The anonymization classification unit 309 then selects, from the unprocessed users, the user whose exclusion yields the smallest error as user m, and flags user m as processed.
In S705, the anonymization classification unit 309 determines whether excluding user m reduces the error of the dismantling group and whether the anonymization group size condition is still satisfied once user m is excluded. When the error is reduced and the condition is satisfied (YES), the processing proceeds to S706; otherwise (NO), it proceeds to S712.
In S706, the anonymization classification unit 309 identifies, as absorption destination groups, the groups of layer i and the groups of layers lower than layer i whose error is below the limit error. Layers above layer i are excluded because the anonymization group size required for the users of layer i is larger than that of the upper layers, so merging into a group on the upper layer side could leave the condition for the users of layer i unsatisfied. In contrast, groups in lower layers are certain to be larger than the anonymization group size required for the users of layer i, so the users can be merged into groups on the lower layer side.

Ｓ７０７では、匿名化分類部３０９は、Ｓ７０６で特定した吸収先グループにユーザｍを含めたときの誤差を求める。そして、匿名化分類部３０９は、最も小さい誤差が限界誤差を下回るとき、それを吸収先グループｎとする。
Ｓ７０８では、匿名化分類部３０９は、Ｓ７０７で吸収先グループｎが見つかったか否かを判定する。匿名化分類部３０９は、見つかったとき（ＹＥＳ）、Ｓ７０９へ移り、それ以外は（ＮＯ）、Ｓ７１１へ移る。
Ｓ７０９では、匿名化分類部３０９は、吸収先グループｎにユーザｍを含める。
Ｓ７１０では、匿名化分類部３０９は、ユーザｍを解体グループから除外したときの誤差が、限界誤差を下回るか否かを求める。匿名化分類部３０９は、限界誤差を下回るとき（ＹＥＳ）、Ｓ７１２へ移り、それ以外は（ＮＯ）、Ｓ７１１へ移る。
Ｓ７１１では、匿名化分類部３０９は、解体グループｊに未処理のユーザが残っているか否かを判定する。匿名化分類部３０９は、処理済みのフラグはＳ７０４で付与されているものを使う。匿名化分類部３０９は、未処理ユーザがあるとき（ＹＥＳ）、Ｓ７０４へ移り、それ以外は（ＮＯ）、Ｓ７１２へ移る。 In S707, the anonymization classification unit 309 calculates the error of each absorption destination group identified in S706 when user m is added to it. When the smallest of these errors is below the limit error, the anonymization classification unit 309 sets the corresponding group as absorption destination group n.
In S708, the anonymization classification unit 309 determines whether an absorption destination group n was found in S707. When one was found (YES), the processing proceeds to S709; otherwise (NO), it proceeds to S711.
In S709, the anonymization classification unit 309 adds user m to absorption destination group n.
In S710, the anonymization classification unit 309 determines whether the error of the dismantling group after user m is excluded is below the limit error. When it is below the limit error (YES), the processing proceeds to S712; otherwise (NO), it proceeds to S711.
In S711, the anonymization classification unit 309 determines whether or not an unprocessed user remains in the dismantling group j. The anonymization classification unit 309 uses the processed flag given in S704. The anonymization classification unit 309 proceeds to S704 when there is an unprocessed user (YES), and proceeds to S712 otherwise (NO).

Ｓ７１２では、匿名化分類部３０９は、解体グループｊの全てのユーザを処理できないとき、匿名化グループサイズを満たすように、解体をキャンセルしていく。より具体的には、匿名化分類部３０９は、Ｓ７０４〜Ｓ７１１の処理によって、解体グループのユーザを、限界誤差を満たせる他のグループへと移動させていく。しかし、解体グループにユーザが残ることがあり得る。例えば、限界誤差を満たせる他のグループがないときは、解体グループにユーザが残る。このとき、解体グループに残ったユーザが、匿名化グループサイズの条件を満たさないとき、この解体グループの匿名化を達成できなくなる。そこで、このような場合、匿名化分類部３０９は、解体グループの限界誤差を小さくすることを断念して、匿名化グループサイズの条件を満たせる状態にまで処理を戻す。より具体的には、Ｓ７０９では、匿名化分類部３０９は、吸収先グループｎにユーザｍを含めることをスタックに記憶しておく。Ｓ７１２では、匿名化分類部３０９は、スタックからどのグループにユーザを含めたかを読みだして、順にユーザを解体グループに戻していく。匿名化分類部３０９は、解体グループが匿名化グループサイズの条件を満たすまで、順にグループに戻す。これによって、パラメータで指定された匿名化グループのサイズの条件を満たすことができるようになり、グループの匿名性を優先的に守ることができるようになる。
Ｓ７１３は、解体先グループループの終端であり、匿名化分類部３０９は、ｊに１を加算してＳ７０３へ戻る。
Ｓ７１４は、下位レイヤからのループ処理であり、匿名化分類部３０９は、ｉに１を加算してＳ７０１へ戻る。 In S712, when not all users of dismantling group j could be processed, the anonymization classification unit 309 cancels the disassembly to the extent needed to satisfy the anonymization group size. More specifically, through the processing of S704 to S711, the anonymization classification unit 309 moves users of the dismantling group to other groups that can satisfy the limit error. However, users may remain in the dismantling group, for example when no other group can satisfy the limit error. If the users remaining in the dismantling group no longer satisfy the anonymization group size condition, the dismantling group cannot be anonymized. In such a case, the anonymization classification unit 309 gives up reducing the error of the dismantling group and rolls the processing back to a state in which the anonymization group size condition can be satisfied. More specifically, in S709 the anonymization classification unit 309 records on a stack that user m was added to absorption destination group n. In S712, it reads from the stack which group each user was added to and returns the users to the dismantling group one by one until the dismantling group satisfies the anonymization group size condition. In this way, the anonymization group size condition specified by the parameter is satisfied, and the anonymity of the group is protected preferentially.
S713 is the end of the dismantling group loop; the anonymization classification unit 309 adds 1 to j and returns to S703.
S714 is the end of the layer loop that starts from the lowest layer; the anonymization classification unit 309 adds 1 to i and returns to S701.
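The disassembly procedure of S701 to S714 can be sketched as follows. This is a simplified reading of the flowchart under assumptions: the error of a group is taken to be its radius (largest distance from the center of gravity), layers are plain integer indices with lower numbers meaning lower use probability, an absorption destination is checked against the limit error of its own layer, and the names `radius` and `disassemble` are hypothetical.

```python
from math import dist


def radius(uids, pos):
    """Error of a group: largest distance from the center of gravity to a member."""
    if not uids:
        return 0.0
    cx = sum(pos[u][0] for u in uids) / len(uids)
    cy = sum(pos[u][1] for u in uids) / len(uids)
    return max(dist(pos[u], (cx, cy)) for u in uids)


def disassemble(groups, pos, limit, min_size):
    """groups: list of {"layer": int, "members": list}; limit and min_size are indexed by layer."""
    for i in sorted({g["layer"] for g in groups}):               # S701: lowest layer first
        over = [g for g in groups
                if g["layer"] == i and radius(g["members"], pos) > limit[i]]  # S702
        for g in over:                                           # S703
            processed, moved = set(), []
            while True:
                todo = [u for u in g["members"] if u not in processed]
                if not todo:                                     # S711: nobody left to try
                    break
                # S704: pick the unprocessed user whose removal leaves the smallest error.
                m = min(todo, key=lambda u: radius([v for v in g["members"] if v != u], pos))
                processed.add(m)
                rest = [v for v in g["members"] if v != m]
                # S705: removal must reduce the error and keep the group large enough.
                if radius(rest, pos) >= radius(g["members"], pos) or len(rest) < min_size[i]:
                    break
                # S706: destinations in layer i or lower that are within their own limit.
                targets = [t for t in groups if t is not g and t["layer"] <= i
                           and radius(t["members"], pos) <= limit[t["layer"]]]
                # S707: error of each destination after adding m; keep those within the limit.
                scored = [(radius(t["members"] + [m], pos), k) for k, t in enumerate(targets)]
                scored = [(r, k) for r, k in scored if r <= limit[targets[k]["layer"]]]
                if not scored:                                   # S708: no destination found
                    continue
                target = targets[min(scored)[1]]
                g["members"].remove(m)                           # S709: move m, remember it
                target["members"].append(m)
                moved.append((m, target))
                if radius(g["members"], pos) <= limit[i]:        # S710: error now acceptable
                    break
            # S712: undo moves, newest first, while the group is below the required size.
            while len(g["members"]) < min_size[i] and moved:
                u, t = moved.pop()
                t["members"].remove(u)
                g["members"].append(u)
    return groups
```

In this sketch the size check in S705 already prevents a group from shrinking below `min_size`, so the rollback in S712 acts as a safety net; it is kept to mirror the stack-based cancellation described in the text.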

以上の図７のフローチャートの処理について、位置情報を匿名化して開示するケースを例に、図８を用いてより具体的に説明をする。
図８（ａ）はユーザの位置情報の分布を表した図である。図８中の○△×はユーザのサービスの利用確率を表している。利用確率との対応は図４（ａ）の符号に示すとおりである。図８（ａ）に示す状態は開示情報生成処理（Ｓ５０１）を終えた状態にある。
図８（ｂ）は、更に処理を進めて、限界誤差グループ解体処理（Ｓ６０９）を行う直前まで処理が進んだ状態を示している。２種類の破線はそれぞれ匿名化グループを示している。細かい破線１４０２ｂの匿名化グループは利用確率が△で表されるレイヤの匿名化グループであり、粗い破線１４０４ｂの匿名化グループは利用確率が○で表されるレイヤの匿名化グループである。利用確率が×で表現されるレイヤは情報を開示しないため、図８（ｂ）中から削除されている。なお、１４０１ｂは後の説明で利用する各グループの限界誤差を示している。
以降、図８（ｂ）に示す情報が誤差限界グループ解体処理に入力されたことを前提として、図７のフローチャートの処理についてより具体的に説明する。まず、Ｓ７０１では、匿名化分類部３０９は、利用確率△のレイヤを選択する。そしてＳ７０２では、匿名化分類部３０９は、誤差限界を超えるグループを特定する。しかし、この例では利用確率△のレイヤのグループは何れも誤差限界を下回るため、匿名化分類部３０９は、Ｓ７０３からＳ７１３までの処理をスキップする。そして、Ｓ７１４では、匿名化分類部３０９は、次の利用確率○のレイヤを処理対象にし、Ｓ７０１へ戻る。 The processing of the flowchart of FIG. 7 will now be described more specifically with reference to FIG. 8, using the case where position information is anonymized and disclosed as an example.
FIG. 8A is a diagram showing the distribution of user position information. The symbols ○, △, and × in FIG. 8 represent each user's probability of using the service. Their correspondence to the use probabilities is as shown in the symbol column of FIG. 4A. FIG. 8A shows the state after S501 of the disclosure information generation processing has been completed.
FIG. 8B shows the state in which the processing has advanced to immediately before the limit error group disassembly processing (S609). The two types of broken lines each indicate anonymization groups. The anonymization group drawn with the fine broken line 1402b belongs to the layer whose use probability is represented by △, and the anonymization group drawn with the coarse broken line 1404b belongs to the layer whose use probability is represented by ○. The layer whose use probability is represented by × does not disclose information and is therefore omitted from FIG. 8B. Reference numeral 1401b indicates the limit error of each group, which is used in the later description.
In the following, the processing of the flowchart of FIG. 7 is described more specifically on the assumption that the information shown in FIG. 8B is input to the limit error group disassembly processing. First, in S701, the anonymization classification unit 309 selects the layer of use probability △. In S702, the anonymization classification unit 309 identifies the groups that exceed the limit error. In this example, however, every group in the layer of use probability △ is below the limit error, so the anonymization classification unit 309 skips the processing from S703 to S713. In S714, the anonymization classification unit 309 sets the next layer, that of use probability ○, as the processing target and returns to S701.

次の利用確率○のレイヤにはグループ１４０４ｂ〜１４０７ｂの４つのグループがある。Ｓ７０２では、匿名化分類部３０９は、誤差限界を超えるグループ１４０４ｂだけを選択する。Ｓ７０３からＳ７１３まででは、匿名化分類部３０９は、このグループからユーザを他のグループに移動させることで、グループ１４０４ｂの誤差を小さくする。１４０４ｂにはユーザが１４０８ｂ〜１４１２ｂの５人が存在する。Ｓ７０４では、匿名化分類部３０９は、グループから除外すると最も誤差を小さくするユーザを選択する。この例では、匿名化分類部３０９は、１４０８ｂに示すユーザを選択する。Ｓ７０５では、匿名化分類部３０９は、ユーザ１４０８ｂを除外するとグループ１４０４ｂの誤差が小さくなると判定する。Ｓ７０６では、匿名化分類部３０９は、グループ１４０２ｂ〜１４０３ｂ、１４０５ｂ〜１４０７ｂのグループを吸収先グループの候補として特定する。Ｓ７０７では、匿名化分類部３０９は、ユーザ１４０７ｂを含めたときに、誤差が最も小さく、誤差限界も下回るグループを選択する。この例では、匿名化分類部３０９は、ユーザ１４０８ｂをグループ１４０３ｂに含めることができると判断する。したがって、Ｓ７０８では、匿名化分類部３０９は、吸収先グループが見つかったと判定し、Ｓ７０９に進む。Ｓ７０９では、匿名化分類部３０９は、ユーザ１４０８ｂをグループ１４０４ｂからグループ１４０３ｂに移動して、双方の誤差を修正する。その結果、図８（ｃ）に示すような匿名化グループの状態となる。グループ１４０４ｂは誤差が小さくなり、グループ１４０３ｂは誤差が大きくなるように修正されていることが分かる。
ユーザ１４０８ｂを移動してもグループ１４０４ｂの誤差は限界誤差５００ｍを超えるため、Ｓ７１０では、匿名化分類部３０９は、ＮＯと判断する。まだ処理していないユーザが４人存在するため、Ｓ７１１では、匿名化分類部３０９は、ＹＥＳと判断する。 The next layer, that of use probability ○, contains four groups, 1404b to 1407b. In S702, the anonymization classification unit 309 selects only group 1404b, which exceeds the limit error. In S703 to S713, the anonymization classification unit 309 reduces the error of group 1404b by moving users from this group to other groups. Group 1404b contains five users, 1408b to 1412b. In S704, the anonymization classification unit 309 selects the user whose exclusion from the group reduces the error the most; in this example, it selects user 1408b. In S705, it determines that excluding user 1408b reduces the error of group 1404b. In S706, it identifies groups 1402b to 1403b and 1405b to 1407b as candidate absorption destination groups. In S707, it selects the group whose error is smallest, and below the limit error, when user 1408b is added; in this example, it determines that user 1408b can be added to group 1403b. Therefore, in S708, the anonymization classification unit 309 determines that an absorption destination group has been found and proceeds to S709. In S709, it moves user 1408b from group 1404b to group 1403b and updates the errors of both groups. The result is the anonymization group state shown in FIG. 8C: the error of group 1404b has decreased and the error of group 1403b has increased.
Even after user 1408b is moved, the error of group 1404b still exceeds the limit error of 500 m, so the anonymization classification unit 309 determines NO in S710. Because four users have not yet been processed, the anonymization classification unit 309 determines YES in S711.

Ｓ７０４〜Ｓ７０８では、匿名化分類部３０９は、再び同様の処理を繰り返す。つまり、匿名化分類部３０９は、除外すると誤差が小さくなるユーザとして、ユーザ１４０９ｂを特定する。そして、匿名化分類部３０９は、ユーザ１４０９ｂを移動しても誤差が限界誤差を超えないグループとしてグループ１４０３ｂを特定する。そして、Ｓ７０９では、匿名化分類部３０９は、グループ１４０４ｂからグループ１４０３ｂへユーザ１４０９ｂを移動し、双方のグループの誤差を更新する。その結果、図８（ｄ）に示すような匿名化グループの状態となる。グループ１４０４ｂは誤差が小さくなっている。一方で、グループ１４０３ｂは誤差に影響を与えなかったため、誤差は変わっていない。
Ｓ７１０では、グループ１４０４ｂの誤差は限界誤差５００ｍを下回ったため、匿名化分類部３０９は、ＹＥＳと判断する。Ｓ７１２では、グループ１４０４ｂは全てのユーザを処理できていないが、匿名化グループのサイズを満たすため、匿名化分類部３０９は、解体をキャンセルしない。
解体候補となっているグループは１４０４ｂ以外は存在しないため、匿名化分類部３０９は、Ｓ７１３を抜ける。更に、利用確率○よりも上位のレイヤは存在しないため、匿名化分類部３０９は、Ｓ７１４も同様に抜けて、図７に示すフローチャートの処理を終える。
上記処理において、解体グループのユーザを下位レイヤの吸収先グループに移すとき、吸収先グループの誤差は、上位レイヤの誤差限界を満たすようにしてもよい。より具体的に説明すると、Ｓ７０７では、匿名化分類部３０９は、解体グループに属しているユーザｍを、下位レイヤのグループに含めたときの誤差を求め、その誤差が「上位レイヤの誤差限界」を下回るグループを吸収先グループｎとして求めることを行う。これによって、上位レイヤから下位レイヤのグループに移されたユーザの開示情報の情報品質低下を防ぐことができる。
以上のように、図７のフローチャートの処理では、誤差限界を上回るグループがある場合、誤差に余裕のあるグループにユーザを移動する。これによって、誤差限界を満たすようにグループを調整することができる。
以上の開示情報生成処理（図５）において利用されているパラメータである匿名化グループサイズ（図４（ａ））は、グループを構成するユーザのサービスの利用確率を考慮して決定されていることが望ましい。 In S704 to S708, the anonymization classification unit 309 repeats the same processing. That is, it identifies user 1409b as the user whose exclusion reduces the error, and it identifies group 1403b as a group whose error does not exceed the limit error even when user 1409b is moved into it. Then, in S709, the anonymization classification unit 309 moves user 1409b from group 1404b to group 1403b and updates the errors of both groups. The result is the anonymization group state shown in FIG. 8D. The error of group 1404b has decreased. The error of group 1403b, on the other hand, is unchanged because the move did not affect it.
In S710, the error of group 1404b is now below the limit error of 500 m, so the anonymization classification unit 309 determines YES. In S712, although not all users of group 1404b have been processed, the group still satisfies the anonymization group size, so the anonymization classification unit 309 does not cancel the disassembly.
Because no group other than 1404b is a candidate for disassembly, the anonymization classification unit 309 exits the loop at S713. Furthermore, because there is no layer above that of use probability ○, it also exits the loop at S714 and ends the processing of the flowchart shown in FIG. 7.
In the above processing, when the user of the dismantling group is moved to the absorption destination group of the lower layer, the error of the absorption destination group may satisfy the error limit of the upper layer. More specifically, in S707, the anonymization classification unit 309 obtains an error when the user m belonging to the disassembly group is included in the lower layer group, and the error is “error limit of upper layer”. The group that falls below is determined as the absorption group n. Thereby, it is possible to prevent a deterioration in information quality of the disclosure information of the user moved from the upper layer to the lower layer group.
As described above, in the process of the flowchart of FIG. 7, when there is a group exceeding the error limit, the user is moved to a group having a margin for error. This allows the group to be adjusted to meet the error limit.
The anonymization group size (FIG. 4A), a parameter used in the disclosure information generation processing described above (FIG. 5), is preferably determined in consideration of the service use probabilities of the users who make up a group.

例えば、グループを構成するユーザの何れか１人がサービスを実行する確率を「グループの利用確率」と定義する。このグループの利用確率が所定の要求確率以上になるように設定するようにしてもよい。グループの何れか１人のユーザがサービスを利用する確率は図４（ｃ）に示す式により求めることができる。図４（ｂ）は要求確率を９５％としたときに、ユーザのサービスを利用する確率に対して、何人のユーザが匿名化グループに属していれば９５％以上を達成できるかを示した情報である。匿名化分類部３０９は、レイヤの利用確率の区間を決定後に、図４（ｂ）に示すような情報を基に、匿名化グループのサイズを決定することができる。例えば、匿名化分類部３０９は、利用確率の区間が０．４以上０．７未満に対しては、図４（ｂ）の０．４以上０．７未満を参照して、最も大きい匿名化グループのサイズを用いる。この例では６となるため、匿名化グループのサイズは６以上として設定される。匿名化分類部３０９が、このようにして設定することで、匿名化グループの何れか１人のユーザがサービスを利用する確率を、要求確率９５％以上に設定することができる。
これによって、匿名化された情報に合わせてサービスが準備を行うようになったとしても、要求確率のもとで少なくとも１人はサービスを享受する可能性があり、サービス提供側にとって有効な情報を生成する匿名化ができる。
又は、匿名化分類部３０９は、グループの利用確率を、グループを構成するユーザの利用確率の和で近似してもよい。これによって、簡便な計算によってグループの利用確率を求めることができるようになる。
匿名化分類部３０９は、上記の要求確率をサービスから取得するようにしてもよい。また、匿名化分類部３０９は、グループの利用確率は、グループのうち１人が利用する確率ではなく、グループのうち少なくともＮ人が利用する確率等としてもよい。
本実施形態では、匿名化分類部３０９は、パラメータとして匿名化グループのサイズを決定してから処理を行っている。しかしながら、匿名化分類部３０９は、匿名化グループのサイズを決定せずに、要求確率のみを決定しておき処理するようにしてもよい。より具体的には、匿名化分類部３０９は、図６のＳ６０４における判定時に、分割後のグループの構成ユーザを用いて、図４（ｃ）に示す数式を用いて、少なくとも１人のユーザがサービスを利用する確率（グループの利用確率）を算出する。匿名化分類部３０９は、各々の分割後グループで、グループの利用確率が要求確率を上回るとき、分割可能と判断する。一方で、匿名化分類部３０９は、要求確率を下回るとき、分割不可能と判断する。
また、限界誤差グループ解体処理（図７）においても、Ｓ７１２で、匿名化分類部３０９は、匿名化グループサイズの条件を満たすまで解体をキャンセルするところを、要求確率を満たすまで解体をキャンセルするようにする。
これによって、予め匿名化グループのサイズを決定している場合に比べて、より精度高く要求確率を匿名化グループが充足しているか否かを判定できるようになる。一方で、予め匿名化グループのサイズを決定している場合は、Ｓ６０４での処理が簡便になるため、より高速に処理ができるようになるという効果が得られる。 For example, the probability that at least one of the users constituting a group uses the service is defined as the “group use probability”. The group size may be set so that this group use probability is greater than or equal to a predetermined required probability. The probability that at least one user in the group uses the service can be obtained by the formula shown in FIG. 4C. FIG. 4B shows, for a required probability of 95%, how many users must belong to an anonymization group, for each value of a user's service use probability, for the group use probability to reach 95% or more. After determining the use probability intervals of the layers, the anonymization classification unit 309 can determine the anonymization group size based on information such as that shown in FIG. 4B. For example, for the use probability interval of 0.4 or more and less than 0.7, the anonymization classification unit 309 refers to the entries of FIG. 4B from 0.4 to less than 0.7 and uses the largest anonymization group size among them. In this example that size is 6, so the anonymization group size is set to 6 or more. By setting the size in this way, the anonymization classification unit 309 can make the probability that at least one user of the anonymization group uses the service 95% or more, the required probability.
As a result, even when a service makes preparations based on the anonymized information, at least one person is likely to use the service at the required probability, so the anonymization produces information that is useful to the service provider.
Alternatively, the anonymization classification unit 309 may approximate the group use probability by the sum of the use probabilities of the users constituting the group. This allows the group use probability to be obtained by a simple calculation.
The anonymization classification unit 309 may acquire the required probability from the service. In addition, the group use probability need not be the probability that one person in the group uses the service; it may be, for example, the probability that at least N people in the group use it.
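The probabilities discussed above can be sketched numerically. FIG. 4C is not reproduced in this text, so the formula below is an assumption: if users decide independently, the probability that at least one of them uses the service is 1 minus the product of (1 - p_i) over the group. All function names are hypothetical.

```python
from math import prod


def group_use_probability(probs):
    """P(at least one user uses the service), assuming independent users."""
    return 1.0 - prod(1.0 - p for p in probs)


def min_group_size(p, required):
    """Smallest n such that n users, each with probability p, reach the required probability."""
    if not (0.0 < p <= 1.0 and 0.0 < required < 1.0):
        raise ValueError("p must be in (0, 1] and required in (0, 1)")
    n = 1
    while group_use_probability([p] * n) < required:
        n += 1
    return n


def sum_approximation(probs):
    """Sum of the individual probabilities; an upper bound on group_use_probability."""
    return sum(probs)


def prob_at_least(probs, n):
    """P(at least n users use the service), by dynamic programming over the users."""
    counts = [1.0]  # counts[k] = P(exactly k of the users seen so far use the service)
    for p in probs:
        nxt = [0.0] * (len(counts) + 1)
        for k, q in enumerate(counts):
            nxt[k] += q * (1.0 - p)
            nxt[k + 1] += q * p
        counts = nxt
    return sum(counts[n:])
```

Under this assumption, `min_group_size(0.4, 0.95)` evaluates to 6, which agrees with the group size of 6 or more given for the interval starting at 0.4.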
In the present embodiment, the anonymization classification unit 309 performs processing after determining the anonymization group size as a parameter. However, the anonymization classification unit 309 may instead determine only the request probability and perform processing without fixing the anonymization group size. More specifically, at the time of the determination in S604, the anonymization classification unit 309 calculates, from the service use probabilities of the users constituting each group after division, the probability that the service will be used (the group use probability). The anonymization classification unit 309 determines that division is possible when the group use probability of every divided group exceeds the request probability, and determines that division is not possible when it falls below the request probability.
Likewise, in the limit error group disassembly process (FIG. 7), in S712 the anonymization classification unit 309 cancels disassembly until the anonymization group size condition is satisfied; in this variation, it instead cancels disassembly until the request probability is satisfied.
This makes it possible to determine, with higher accuracy than when the anonymization group size is fixed in advance, whether an anonymization group satisfies the request probability. On the other hand, when the anonymization group size is determined in advance, the processing in S604 is simpler and can therefore be performed faster.
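The request-probability-based division check described above might look like the following sketch. It assumes independent users and treats a group whose probability exactly equals the request probability as acceptable; both points are assumptions, and `can_split` is a hypothetical name.

```python
import math


def group_use_probability(probs):
    """Probability that at least one user in the group uses the service."""
    return 1.0 - math.prod(1.0 - p for p in probs)


def can_split(left_probs, right_probs, request_prob):
    """Return True when both candidate groups still meet the request probability.

    left_probs / right_probs are the use probabilities of the users that would
    end up in each group after the division.
    """
    return all(
        group_use_probability(group) >= request_prob
        for group in (left_probs, right_probs)
    )


if __name__ == "__main__":
    # Six users at 0.7 each can be divided 3/3 but not 2/4 at a 95% request probability.
    print(can_split([0.7] * 3, [0.7] * 3, 0.95))
    print(can_split([0.7] * 2, [0.7] * 4, 0.95))
```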

Next, an example of disclosing position information to a taxi service will be described with reference to FIGS. 9 to 11.
FIG. 9A shows the distribution of user position information at a certain time T. The symbols ○, △, and × in the figure represent each user's probability of using the taxi service. The correspondence between the symbols and the use probabilities is as shown in FIG. 4A. The state shown in FIG. 9A is the state after S501 of the disclosure information generation process (FIG. 5) has finished.
FIG. 9B shows the state after S502 to S504 have also finished and the anonymization groups have been created. The two types of broken lines each indicate an anonymization group. The anonymization group drawn with the fine broken line 801b belongs to the layer whose use probability is represented by △, and the anonymization group drawn with the coarse broken line 802b belongs to the layer whose use probability is represented by ○. The layer whose use probability is represented by × is omitted from the figure because no information is disclosed for it.
FIG. 9C shows the state after S505 has also finished and the anonymized information has been generated; it illustrates the information provided to the taxi service. A center of gravity and a radius are calculated for each anonymization group and generated as the context information to be disclosed. For each anonymization group, the taxi service is given the center of gravity, the radius, the group probability, the IDs of the constituent users, each user's service use probability, and so on. The group probability is the probability that at least one user will execute the service, obtained from the service use probabilities of the constituent users. The user ID is an ID that is closed to the taxi service. It therefore cannot be used in combination with data held by parties other than the taxi service.
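As an illustration of the per-group disclosure described for FIG. 9C, the sketch below computes a center of gravity, a radius, and a group probability for one anonymization group. It assumes planar coordinates in meters, takes the radius as the largest distance from the center of gravity to a member, and assumes independent users; the record layout and the name `disclose_group` are hypothetical.

```python
import math


def disclose_group(members):
    """Build the disclosed record for one anonymization group.

    members: list of (user_id, x, y, use_prob) tuples in planar coordinates.
    """
    n = len(members)
    cx = sum(x for _, x, _, _ in members) / n
    cy = sum(y for _, _, y, _ in members) / n
    # Radius that covers every member, measured from the center of gravity.
    radius = max(math.hypot(x - cx, y - cy) for _, x, y, _ in members)
    # Probability that at least one member uses the service.
    group_prob = 1.0 - math.prod(1.0 - p for _, _, _, p in members)
    return {
        "center": (cx, cy),
        "radius": radius,
        "group_probability": group_prob,
        "users": [(uid, p) for uid, _, _, p in members],
    }


if __name__ == "__main__":
    group = [("u1", 0.0, 0.0, 0.5), ("u2", 2.0, 0.0, 0.5)]
    print(disclose_group(group))
```

Only the center and radius, not the individual coordinates, appear in the returned record, which mirrors the idea that members of a group share the same disclosed position.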
When the taxi service receives this information, it can determine patrol routes so that taxis patrol the areas indicated by the broken lines in FIG. 9C. In addition, because the user ID is disclosed to the taxi service, if the taxi service holds attribute information associated with that user ID, it can prepare the taxi based on that information. For example, if the taxi is equipped with signage, advertisements tailored to the user can be downloaded and prepared in advance.
However, the items and forms of information disclosed to the service are not limited to these.

FIG. 10A shows the distribution of user position information and the users' use probabilities at time T+1, slightly later than FIG. 9A. The filled symbols (● and ★) in FIG. 10 mark users whose use probability has changed since time T. In every case the use probability is higher than at time T. In addition, because it was decided at time T that taxis would patrol the areas indicated by the broken lines, as many taxis as there are broken-line areas have appeared.
FIG. 10B shows the anonymization groups at time T+1. Users 901b and 902b, whose use probabilities have increased, now belong to different anonymization groups than at time T. The accuracy of the information disclosed for them is therefore higher. User 903b, on the other hand, has not changed anonymization groups. This is because forming an anonymization group with users in the same layer would make the error too large, so the limit error group disassembly process assigned the user to an anonymization group in a lower layer.
FIG. 10C illustrates the information provided to the taxi service at time T+1. The taxi service can now learn the position of user 901b, who has requested a taxi, and the taxi 901c that was standing by nearby goes to pick the user up, which shortens the user's waiting time.

FIG. 11A shows the same information as FIG. 10A with the time advanced to T+2. The taxi 901c and user 901b have left for the user's destination and therefore no longer appear in FIG. 11. The filled symbol (★) marks the user whose use probability has changed since time T+1.
FIG. 11B shows the anonymization groups; user 1001b now forms an anonymization group alone.
FIG. 11C illustrates the information provided to the taxi service; the taxi 1001c that had been dispatched nearby is directed to user 1001b. Because no taxi then remains to cover area 1002c, another taxi is directed to area 1002c.
In this way, the accuracy of the information increases as the probability of using a taxi increases. More specifically, the accuracy of the position information increases. As a result, preparations become more personalized when the use probability is high. More specifically, taxis patrol closer to the user. Consequently, the waiting time when the user requests a taxi is reduced.

Next, an example in which there are multiple items of context information, including attribute information, will be described. In this example, information on the shoes desired by a customer who may visit the store (a user of the PCA client) is shared with a shoe shop service, as described with reference to FIGS. 12 to 14.
A shoe shop is a store that sells shoes to customers. When customers visit the store, they want to look at and try on the shoes they want. If the store they visit does not have the desired shoes in stock, a clerk goes to fetch them from another nearby store, and the customer is kept waiting in the meantime. To solve this problem, the shoe shop service prepares shoes in advance according to information about the customers who are coming to the store. This prevents customers from being kept waiting because the desired shoes are not available when they visit.
The context information before anonymization on the PCA server will be described with reference to FIG. 13A. It is mainly information about the shoes that the user wants. This information may be estimated from the user's everyday behavior. For example, in online shopping, users commonly create wish lists. The context acquisition unit 306 may create context information from the wish list. Alternatively, the context acquisition unit 306 may acquire context information while the user visits several shoe shops. However, the method of acquiring context information is not limited to these.

Each item (column) in FIG. 13A will be described.
The user ID is information for uniquely identifying the user; when the service already holds information about the user, the ID is used to link the disclosed data to that information. For example, if the user is already a member of the shop, the shop has accumulated information such as age, gender, and previously purchased items, and linking to it allows a wider range of preparations. For example, the shop can prepare items so that they do not duplicate products the user has already purchased.
The use probability is the probability that the user will visit the store and is expressed here by a symbol. The correspondence between the symbols and the use probabilities is as shown in FIG. 12.
The category and subcategory are information about the type of shoe. The category is the higher-level classification, and the subcategory is its lower-level classification.
The color classification and color are information about the color of the shoe. The color classification is the higher-level classification, and the color is its lower-level classification.
The size is information about the size of the shoe.
FIG. 13A shows the context information at a certain time T. The disclosure information generation unit 305 performs the disclosure information generation process on it to anonymize "category, subcategory, color classification, color, size" and generates the information in FIG. 13B. The disclosure information generation process uses the parameters shown in FIG. 12. These parameters are almost the same as those shown in FIG. 4A, except that the limit error has been changed to suit the context information to be anonymized. More specifically, the limit error here is the number of anonymized items (columns). In other words, for data whose use probability is represented by the symbol △, the number of anonymized items should preferably be 4 or less.

The disclosure information generation process shown in FIG. 5 will now be described step by step.
In S501, the use probability calculation unit 307 estimates the use probability. For example, the use probability calculation unit 307 may prepare in advance a correspondence between user-to-store distance and use probability, assigning a small probability to long distances and a large probability to short distances. This can be achieved, for example, by setting a normal distribution centered on the store. Alternatively, the PCA server may build a traveling model between points in advance using the context information accumulated on the PCA server. For example, the use probability calculation unit 307 creates, as the traveling model, the probabilities of movement between points based on position information collected from users. The use probability calculation unit 307 then obtains the probability of movement from the user's current location to the shoe shop based on the traveling model. However, the method of estimating the use probability is not limited to these.
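The two estimation approaches mentioned for S501 can be sketched roughly as follows. The first function uses a normal-distribution-shaped curve centered on the store; the spread `sigma_m` and the peak value are hypothetical parameters. The second builds one-step movement probabilities from observed origin-destination pairs, which is only one simple way a traveling model could be formed and is an assumption of this sketch.

```python
import math
from collections import Counter, defaultdict


def distance_use_probability(distance_m, sigma_m=500.0, peak=1.0):
    """Use probability that decays with distance, shaped like a normal curve.

    sigma_m and peak are illustrative values, not taken from the specification.
    """
    return peak * math.exp(-(distance_m ** 2) / (2.0 * sigma_m ** 2))


def transition_probabilities(trips):
    """Estimate one-step movement probabilities between points.

    trips: iterable of (origin, destination) pairs collected from users.
    """
    counts = defaultdict(Counter)
    for origin, destination in trips:
        counts[origin][destination] += 1
    return {
        origin: {dest: c / sum(dests.values()) for dest, c in dests.items()}
        for origin, dests in counts.items()
    }


if __name__ == "__main__":
    print(distance_use_probability(0.0), distance_use_probability(1000.0))
    trips = [
        ("station", "shoe_shop"),
        ("station", "cafe"),
        ("station", "shoe_shop"),
        ("cafe", "station"),
    ]
    print(transition_probabilities(trips)["station"])
```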
In S502, the use probability classification unit 308 divides the users into layers according to use probability. In this example, they are divided into two layers, △ and ○.
In S503, the use probability classification unit 308 identifies the layers to be anonymized. According to FIG. 12, the layers represented by the use probability symbols △ and ○ are targets, so all layers in FIG. 13A are to be anonymized.

In S504, the anonymization classification unit 309 performs the anonymization group creation process. First, focusing only on the data with the use probability symbol △, the disclosure information generation unit 305 repeatedly divides the users down to the smallest size that still satisfies the anonymization group size condition. In this example, two groups are created: User01 to User08 and User12 to User18. Next, the disclosure information generation unit 305 processes the data with the use probability symbol ○. This creates two groups: User09 to User11 and User19 to User21.
In S505, the anonymization information generation unit 310 generates disclosure information for each anonymization group. The anonymization information generation unit 310 processes each item for each of the four groups above. More specifically, for "category, subcategory, color classification, color", the anonymization information generation unit 310 keeps the value when all data in the group have the same value and replaces it with "*" when they differ. This anonymizes the data so that the original value cannot be determined. For size, the anonymization information generation unit 310 converts the value to the range of the data in the group. This prevents the size in each individual record from being known.
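The per-group processing in S505 can be illustrated with the following sketch. The column names and the sample records are made up for illustration, and counting a non-trivial size range as one anonymized item is an assumption of this sketch about how the limit error (number of anonymized items) might be tallied.

```python
CATEGORICAL = ("category", "subcategory", "color_class", "color")


def generalize_group(records):
    """Generalize one anonymization group into a single disclosed record.

    Categorical columns keep their value only when every record agrees;
    otherwise they become "*". Size becomes the (min, max) range of the group.
    """
    disclosed = {}
    for column in CATEGORICAL:
        values = {record[column] for record in records}
        disclosed[column] = values.pop() if len(values) == 1 else "*"
    sizes = [record["size"] for record in records]
    disclosed["size"] = (min(sizes), max(sizes))
    return disclosed


def anonymized_item_count(disclosed):
    """Count how many items were anonymized (compared against the limit error)."""
    count = sum(1 for column in CATEGORICAL if disclosed[column] == "*")
    low, high = disclosed["size"]
    return count + (1 if low != high else 0)


if __name__ == "__main__":
    group = [
        {"category": "sneakers", "subcategory": "running",
         "color_class": "dark", "color": "black", "size": 26.0},
        {"category": "sneakers", "subcategory": "running",
         "color_class": "dark", "color": "navy", "size": 27.0},
    ]
    record = generalize_group(group)
    print(record, anonymized_item_count(record))
```

Because every member of the group receives the same generalized record, members cannot be told apart from the disclosed values alone.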
The context information for disclosure shown in FIG. 13B is generated as described above. This information is disclosed to the shoe shop service. The shoe shop service then checks the list against the stock in the shop and, when shoes on the list are not in stock, orders them from nearby stores.

FIG. 14A shows the context information for disclosure at time T+1, after time has advanced from the state shown in FIG. 13. At time T+1 the use probability of User12 has risen, so its use probability symbol has changed from △ to ○. The symbol is filled in black here for clarity.
As a result of the rise in use probability, the anonymization groups created in S504 change. More specifically, for the data with the use probability symbol △, the anonymization groups become User01 to User08 and User13 to User18 because User12 has left. For the data with the use probability symbol ○, the anonymization groups become User09 to User12 and User19 to User21 because User12 has joined.
Disclosure information is then generated for each anonymization group. For example, in the group User09 to User12, which includes User12, the color is replaced with "*" and the size is replaced with the range of the data in the group.
As a result, the subcategory and color classification of User12, which were processed and hidden at time T, become visible in the disclosed context information.
FIG. 14B shows the state after time has advanced further to T+2. At time T+2 the use probability of User12 has risen further, so its use probability symbol has changed from ○ to ☆. The symbol is filled in black here for clarity.
As a result of the rise in use probability, the anonymization groups created in S504 change. More specifically, the anonymization groups for the data with the use probability symbol △ do not change. For the data with the use probability symbol ○, however, the anonymization groups become User09 to User11 and User19 to User21 because User12 has left. For the data with the use probability symbol ☆, the condition is that the anonymization group size is 1, so an anonymization group consisting only of User12 is formed.
Disclosure information is then generated for each anonymization group. However, because the anonymization group containing User12 holds only User12's data, its values are disclosed as they are.
As a result, the color and size of User12, which were processed and hidden at time T+1, become visible in the disclosed context information.
In this way, as the use probability rises, as with User12, the accuracy of that user's information increases. More specifically, the values of "category, subcategory, color classification, color" become known, and the range of "size" narrows. As a result, preparations become more personalized when the use probability is high. More specifically, shoes matching the information disclosed by the user are prepared at the store in advance. Consequently, the situation in which the desired shoes are not in stock when the user visits is avoided, and the user can look at and try on products with little waiting.

In the taxi service and shoe shop service examples, the case where only the use probability changes while the context information stays the same was described. However, context information also changes from moment to moment. In the taxi service, the position information changes as the user moves. In the shoe shop service, the information about the desired shoes changes as the user looks around multiple shops.
These examples described cases where the use probability rises, but it may also fall. When the use probability falls, the accuracy of the information decreases. In the taxi service example, the center-of-gravity coordinates in the disclosure information may deviate further from the user's actual position. In the shoe shop service example, the values of "category, subcategory, color classification, color" become unknown and the range of "size" widens. As a result, the information of a user whose use probability has fallen becomes anonymized again.
Even if information that was once disclosed is later hidden by anonymization, the service may have stored the information from when it was disclosed. However, because context information changes from moment to moment, the service can no longer follow changes that occur after the disclosure.
As described above, by varying the accuracy of the information according to the probability that the user will use the service, both anonymization and service preparation can be achieved.
In particular, the disclosure information generation unit 305 forms anonymization groups and performs anonymization that makes the context information of the users in the same group identical. Users in a group therefore cannot be distinguished from one another, which protects their privacy.

In the present embodiment, context information is disclosed to the service together with the user ID. However, this user ID is closed to the service. To link it with data held by another service, the data would have to be joined by matching the context information disclosed in common. Because the context information is anonymized, however, the candidates cannot be narrowed down below a certain number.
In addition, in the present embodiment, users are divided into layers by use probability, so that anonymization is performed among users with similar use probabilities. This prevents the accuracy of the information of a user with a high use probability from being lowered to match the anonymization level required for users with a low use probability.
In addition, in the present embodiment, the size of the anonymization group is determined based on the group use probability. This makes the anonymized information useful to the service as well. Furthermore, in the present embodiment, the group use probability is the probability that at least one of the users constituting the group will use the service. Because what is prepared based on the provided information is then used by one of the users with the request probability, the likelihood that the preparation is wasted can be controlled probabilistically.
In addition, in the present embodiment, a limit error is defined for each use probability layer, groups that exceed the limit error are disassembled, and their users are assigned to other groups. This keeps the error for each use probability from exceeding the limit error. In particular, in the present embodiment, reassignment to another group includes assignment to groups in layers lower than the layer to which the user belongs. This further keeps the error from exceeding the limit error.

In the present embodiment, the disclosure information is generated based on the use probability. However, the basis is not limited to the use probability and may be any measure of the user's degree of request for the service. More specifically, instead of a continuous value from 0 to 1 like a probability, it may be a continuous value that is not restricted to the range 0 to 1, or a categorical value such as "high", "medium", or "low". The degree of request in the present embodiment is not limited to these.
In addition, although the degree of request is obtained by system estimation in the present embodiment, a value entered by the user may be used instead. More specifically, the PCA server 102 may display a UI for entering the degree of request on the monitor or the like of the PCA client 101 and receive the user's input. For example, the PCA server 102 may present the degree of request in five levels or the like on the monitor or the like of the PCA client 101 and use the value the user enters through a UI such as a slider. The PCA server 102 may generate the levels so that their number matches the number of layers, using the parameters for generating disclosure information shown in FIG. 4A. Alternatively, the PCA server 102 may display the degree of request as a color or the like and accept input through buttons for raising or lowering it. Alternatively, for example, the PCA server 102 may accept a use probability via the monitor or the like of the PCA client 101, or may accept a continuous or categorical value expressing the degree of request. Because estimation can be wrong, when the user strongly desires the service, the PCA server 102 sets a high degree of request based on a user operation via the monitor or the like of the PCA client 101. Conversely, when the user does not want their own information to be disclosed, the PCA server 102 actively sets a low degree of request based on a user operation via the monitor or the like of the PCA client 101. This allows information disclosure that reflects the user's intention.

<Embodiment 2>
In the first embodiment, information is disclosed to the service when the use probability is high. However, even a user whose use probability is estimated to be high may not want information to be disclosed beyond a certain level. Disclosure information is therefore generated so as to satisfy disclosure conditions specified by the user. The present embodiment describes a method of generating disclosure information so as to satisfy an "anonymization request", which is one of the disclosure conditions entered by the user.

The functional configuration of the system that generates the information to be disclosed in the present embodiment will be described.
In the present embodiment, an anonymization request input unit is provided in addition to the configuration of the first embodiment. The anonymization classification unit 309 and the anonymization information generation unit 310 also differ from those of the first embodiment. These are described in order below.
The anonymization request input unit of the present embodiment inputs the anonymization conditions that must be satisfied when the user's information is disclosed. For example, the anonymization request input unit inputs a condition that the anonymization group size must be 3 or more when the user's information is disclosed. Alternatively, the condition may be that the anonymization group size must be 3 or more in a certain region or time period. This allows the anonymization group to be made larger when the user is at home, for example, so that the location of the home is not revealed. The anonymization request input unit may, for example, receive anonymization request conditions entered via the monitor, input device, or the like of the PCA server 102. The anonymization request input unit may also receive, via the network, anonymization request conditions entered via the monitor, input device, or the like of the PCA client 101.
Alternatively, the anonymization request input unit may input a condition on the ambiguity of the disclosure information. That is, when center coordinates and a radius are disclosed as disclosure information based on position information, the anonymization request input unit may input a condition on the minimum value of the radius. When attribute information is disclosed, the anonymization request input unit may input the minimum number of anonymized items as a condition. Alternatively, the anonymization request input unit may input items that must always be anonymized. The anonymization request input unit may also input the ambiguity condition together with conditions such as a region or time period. For example, the condition may be that, in a certain region, the radius must be at least 100 m when position information is disclosed. Alternatively, the condition may be that position information is not disclosed in a certain region.
The anonymization request is not limited to these.

The anonymization classification unit 309 of the present embodiment classifies users into groups, for each layer obtained by the usage probability classification unit 308, based on the similarity of the context information to be anonymized, so as to satisfy the anonymization request. When a condition on the anonymization group size is specified, the anonymization classification unit 309 creates anonymization groups that satisfy the condition. More specifically, between S501 and S502 in FIG. 5 of the first embodiment, the anonymization classification unit 309 corrects the usage probability so as to satisfy the anonymization request. For example, when the anonymization request is the condition "the anonymization group size is three or more people" and there is a user whose usage probability was estimated to be 1 in S501, the anonymization classification unit 309 makes the following correction. That is, based on the parameters of FIG. 4A, the anonymization classification unit 309 corrects the usage probability to a value of 0.7 or more and less than 1, such as 0.8. Further, if the anonymization request includes a region, a time zone, or the like, the anonymization classification unit 309 corrects the usage probability only when those conditions are satisfied.
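The probability correction performed between S501 and S502 can be sketched as follows. This is a minimal illustration only: the `Layer` type, the `representative` field, and every interval and group size in `LAYERS` are hypothetical stand-ins for the parameters of FIG. 4A, which are not reproduced in this section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Layer:
    low: float                  # inclusive lower bound of the usage-probability interval
    high: float                 # exclusive upper bound (slightly above 1.0 to include 1.0)
    group_size: Optional[int]   # None means "not disclosed"
    representative: float       # value assigned when a user is moved into this layer


# Hypothetical table; the real values would come from FIG. 4A.
LAYERS = [
    Layer(0.0, 0.4, None, 0.2),
    Layer(0.4, 0.7, 10, 0.5),
    Layer(0.7, 1.0, 3, 0.8),
    Layer(1.0, 1.01, 1, 1.0),
]


def find_layer(prob: float) -> Layer:
    """Return the layer whose interval contains prob."""
    for layer in LAYERS:
        if layer.low <= prob < layer.high:
            return layer
    raise ValueError(f"probability out of range: {prob}")


def satisfies(layer: Layer, min_group_size: int) -> bool:
    """A layer satisfies the request if nothing is disclosed or the group is large enough."""
    return layer.group_size is None or layer.group_size >= min_group_size


def correct_usage_probability(prob: float, min_group_size: int) -> float:
    """Lower prob into the nearest layer that satisfies the requested group size."""
    current = find_layer(prob)
    if satisfies(current, min_group_size):
        return prob
    # Search lower layers, from the highest interval downward.
    for layer in sorted(LAYERS, key=lambda l: l.low, reverse=True):
        if layer.low < current.low and satisfies(layer, min_group_size):
            return layer.representative
    raise ValueError("no layer satisfies the anonymization request")
```

With this hypothetical table, a user estimated at probability 1 who requests a group size of three or more is moved to 0.8, matching the example in the text. A region or time-zone condition could be handled by calling `correct_usage_probability` only when that condition holds.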
The anonymization information generation unit 310 of the present embodiment generates, for each group obtained by the anonymization classification unit 309, the context information to be disclosed so as to satisfy the anonymization request. When a condition on the ambiguity of the disclosure information is specified, the anonymization information generation unit 310 processes the disclosure information so as to satisfy the condition. More specifically, immediately after S505 in FIG. 5 of the first embodiment, when the disclosure information does not satisfy the user's anonymization request, the anonymization information generation unit 310 makes the information more ambiguous. For example, if the anonymization request is the condition "the radius of the position information is 500 m or more" and the radius of the disclosure information obtained for each user in S505 is 100 m, the anonymization information generation unit 310 changes the radius of the disclosure information of the user who has that condition to 500 m or the like and uses the result as the disclosure information.
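The additional ambiguity step applied after S505 can be illustrated with a small sketch. The `Disclosure` type and the function name `apply_min_radius` are hypothetical and chosen only for this example; the text itself only specifies that a radius below the requested minimum is widened.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Disclosure:
    center: Tuple[float, float]  # (latitude, longitude) of the disclosed circle
    radius_m: float              # radius of the disclosed circle in metres


def apply_min_radius(disclosure: Disclosure,
                     min_radius_m: Optional[float]) -> Disclosure:
    """Widen the disclosed radius if it is below the user's requested minimum.

    min_radius_m is None when the user has no ambiguity condition.
    """
    if min_radius_m is None or disclosure.radius_m >= min_radius_m:
        return disclosure
    # Keep the center and enlarge only the radius.
    return replace(disclosure, radius_m=min_radius_m)
```

For the example in the text, a disclosure with a 100 m radius and a request of 500 m or more yields a disclosure with a 500 m radius, while users without a condition are left unchanged.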
As described above, an anonymization request can be specified for each user, so that information is not disclosed beyond a certain level.

<Embodiment 3>
In the first embodiment, parameters (disclosure parameters) as shown in FIG. 4A are held in advance, and the disclosure information generation unit 305 generates disclosure information according to them. Disclosure information is therefore generated uniformly for all users, based on the usage probability estimated by the system. However, some users wish to disclose information actively in order to benefit from the service. Such users can be said to have a high demand for preparation on the service side (a preparation request). The service side, in turn, needs a certain information quality in order to respond to the preparation request. The stages of preparation on the service side are also likely to differ from service to service. Therefore, the present embodiment describes a method for generating disclosure information so as to satisfy a "preparation request", which is one of the disclosure conditions. In particular, it describes a method for generating disclosure information using both the preparation request of the service side and the preparation request of the user side.

The functional configuration of the system that generates the information to be disclosed in the present embodiment will now be described. In addition to the configuration of the first embodiment, the present embodiment has a user preparation request input unit, a service preparation request input unit, and a disclosure setting generation unit. The anonymization classification unit 309 also differs from that of the first embodiment. These are described in order below.
The service preparation request input unit inputs anonymization levels and information quality. More specifically, an anonymization level is a combination of disclosure permission and anonymization group size. For example, the service preparation request input unit inputs combinations of disclosure permission and anonymization group size as shown in FIG. 4A. In this example, four levels are input. Although this description assumes that the information shown in FIG. 4A is input, the number of anonymization levels may be increased. Furthermore, the service preparation request input unit may input information quality in association with each anonymization level. For example, the information quality is a limit error. Alternatively, the usage probability of a group or the like may be used as the information quality. For example, the service preparation request input unit may receive information quality input via a monitor, an input device, or the like of the PCA server 102. The service preparation request input unit may also receive, via a network, information quality input via a monitor, an input device, or the like of the PCA client 101. The same applies to the service preparation request input unit described below.
In addition, the service preparation request input unit may input an initial usage probability interval in association with each anonymization level. Further, the service preparation request input unit may record the content of the service-side preparation as a comment for each anonymization level. More specifically, for a taxi service or the like, the service preparation request input unit records information about the preparation, for example that a taxi will be dispatched to within a radius of 1000 m of the user. This information is used in the user preparation request input unit described later.

The user preparation request input unit inputs a request regarding how far the user is willing to disclose information. More specifically, the user preparation request input unit inputs the anonymization group size corresponding to a usage probability. For example, the user preparation request input unit inputs requests such as "when the usage probability is 0.7 or more and less than 0.8, the anonymization group size should be three or more people" and "when the usage probability is 0.8 or more and 1 or less, the anonymization group size should be one person". Alternatively, the user preparation request input unit inputs a limit error for a usage probability. For example, if the disclosure information is position information, the user preparation request input unit inputs a request such as "when the usage probability is 0.7 or more and less than 0.8, the limit error should be 200 m".
The system side may present the anonymization levels and have the user input the above information for each of them. More specifically, the user sets a usage probability interval or a limit error for each level. Some anonymization levels may be left without an associated usage probability interval. For example, a user who is tolerant of information disclosure can leave unassociated any level whose anonymization the user feels is too strong.
Further, when the system side presents the anonymization levels, it may display with them the comments on the service-side preparation content input by the service preparation request input unit. The user can thus associate usage probability intervals with the levels while weighing both privacy protection and the benefit of the service.

The disclosure parameter generation unit generates disclosure parameters for generating disclosure information, as illustrated in FIG. 4A, based on the information input by the service preparation request input unit and the user preparation request input unit. In the first and second embodiments, the disclosure parameters are the same regardless of the user, but in the present embodiment, disclosure parameters are generated for each user. More specifically, the disclosure parameter generation unit associates the usage probability intervals obtained by the user preparation request input unit with the anonymization levels input by the service preparation request input unit. In addition, the disclosure parameter generation unit overwrites the limit error input by the service preparation request input unit with the limit error input by the user preparation request input unit.
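The generation of per-user disclosure parameters can be sketched as follows. All type names, field names, and the handling of unmapped levels are assumptions made for this illustration; the text specifies only that user-supplied intervals are associated with service-side anonymization levels and that a user-supplied limit error overwrites the service-side one.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class ServiceLevel:
    level_id: int
    group_size: Optional[int]   # None means "not disclosed"
    limit_error_m: float        # information quality requested by the service


@dataclass(frozen=True)
class UserRequest:
    interval: Tuple[float, float]           # usage-probability interval for the level
    limit_error_m: Optional[float] = None   # optional per-user limit error


@dataclass(frozen=True)
class DisclosureParameter:
    level_id: int
    interval: Tuple[float, float]
    group_size: Optional[int]
    limit_error_m: float


def generate_disclosure_parameters(
    service_levels: List[ServiceLevel],
    user_requests: Dict[int, UserRequest],
) -> List[DisclosureParameter]:
    """Build one user's disclosure parameters from service and user requests."""
    params = []
    for level in service_levels:
        request = user_requests.get(level.level_id)
        if request is None:
            # Assumption: a level the user left unmapped is simply omitted.
            continue
        # The user's limit error, when given, overwrites the service's value.
        limit = (request.limit_error_m
                 if request.limit_error_m is not None
                 else level.limit_error_m)
        params.append(DisclosureParameter(
            level.level_id, request.interval, level.group_size, limit))
    return params
```

Keeping the service levels and the user requests as separate inputs means the same service definition can be reused for every user, with only the interval mapping and limit error varying per user.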
Moreover, when the anonymization levels are too fine-grained, many users will leave levels without an associated usage probability, and anonymization processing at each level may become difficult. The disclosure parameter generation unit may therefore output information such as the number of users associated with each level to the service providing side and propose that the anonymization levels be merged and redefined.
The anonymization classification unit 309 of the present embodiment creates anonymization groups according to the disclosure parameters of each user. More specifically, in S502, the anonymization classification unit 309 divides the users into layers according to the usage probability intervals in each user's disclosure parameters. Since users in the same layer have the same anonymization group size, the subsequent processing is the same up to the limit error group dismantling process (S609 and the flowchart of FIG. 7). The limit error group dismantling process differs slightly because the limit error differs for each user. More specifically, in S704, the anonymization classification unit 309 preferentially selects a user for whom the difference between the current error and the limit error is large. In S707 and S710, the anonymization classification unit 309 performs the processing on the condition that the limit errors of all users are satisfied.
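The per-user variations of S502, S704, and S707/S710 can be illustrated roughly as below. The flowcharts themselves are not reproduced in this section, so this sketch covers only the three decisions named in the text. The function names are hypothetical, and reading "difference between the current error and the limit error" as current error minus limit error is an assumption.

```python
from typing import Dict, List, Optional, Tuple

# (low, high, level_id): low is inclusive, high is exclusive.
Interval = Tuple[float, float, int]


def layer_of(prob: float, user_intervals: List[Interval]) -> Optional[int]:
    """S502 variant: pick the layer from this user's own probability intervals."""
    for low, high, level_id in user_intervals:
        if low <= prob < high:
            return level_id
    return None  # no interval mapped for this probability


def select_user_to_move(current_error_m: Dict[str, float],
                        limit_error_m: Dict[str, float]) -> str:
    """S704 variant: prefer the user whose current error most exceeds their limit."""
    return max(current_error_m,
               key=lambda user: current_error_m[user] - limit_error_m[user])


def group_meets_all_limits(group_error_m: float,
                           members: List[str],
                           limit_error_m: Dict[str, float]) -> bool:
    """S707/S710 variant: the group error must satisfy every member's limit."""
    return all(group_error_m <= limit_error_m[user] for user in members)
```

Checking every member's limit, rather than a single layer-wide limit, is what distinguishes this embodiment from the uniform parameters of the first embodiment.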
As described above, disclosure information is generated based on the preparation requests of the service and the user. In particular, setting a service preparation request makes it possible to generate disclosure parameters that reflect the service preparation stages and required information quality, which differ from service to service. In addition, setting a user preparation request makes it possible to generate disclosure parameters that reflect each user's tolerance for disclosure and eagerness to receive the service.
In the present embodiment, the disclosure parameters are fixed for each user regardless of the situation. However, the disclosure parameters may be changed according to the situation. More specifically, the service preparation request input unit and the user preparation request input unit may also input conditions such as a place and a time zone. The anonymization classification unit 309 then switches between disclosure parameters according to those conditions.
That is, in the service preparation request input unit, multiple sets of disclosure levels and information quality are set according to the situation, such as the place and the time zone. For example, in a taxi service or the like, finely set anonymization levels cannot be served in an area with few taxis, so the anonymization levels are changed to suit the area.
In the user preparation request input unit, a preparation request is specified with a situation such as a time zone or a place as its condition. For example, a setting that discloses more information is made under a condition such as "on holidays". As a result, information is disclosed so that the user can actively receive services specifically on holidays. Conversely, for business-related services, a condition such as "weekdays from 8:00 to 17:00" causes information to be disclosed so that the user can actively receive those services specifically during business hours.
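Switching disclosure parameters by condition can be sketched with ordered predicate and parameter pairs. Everything here is an assumption for illustration: the function names, the use of plain strings as parameter sets, the half-open 8:00 to 17:00 window, and the treatment of Saturday and Sunday as "holidays" (public holidays are ignored).

```python
from datetime import datetime, time
from typing import Callable, List, Tuple

Predicate = Callable[[datetime], bool]


def is_business_hours(now: datetime) -> bool:
    """Weekdays from 8:00 (inclusive) to 17:00 (exclusive)."""
    return now.weekday() < 5 and time(8, 0) <= now.time() < time(17, 0)


def is_holiday(now: datetime) -> bool:
    """Simplification: only Saturday and Sunday count as holidays."""
    return now.weekday() >= 5


def select_parameters(now: datetime,
                      conditional: List[Tuple[Predicate, str]],
                      default: str) -> str:
    """Return the first parameter set whose condition holds, else the default."""
    for predicate, params in conditional:
        if predicate(now):
            return params
    return default


# Hypothetical parameter-set names standing in for full disclosure parameters.
RULES = [
    (is_holiday, "holiday_open_disclosure"),
    (is_business_hours, "business_open_disclosure"),
]
```

A place condition could be added in the same way by passing the current location to the predicates alongside the time.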

The present embodiment does not include the method of the second embodiment for generating disclosure information that satisfies an anonymization request, but it may be configured to include it. More specifically, the disclosure parameter generation unit may generate the disclosure parameters by also using the anonymization group size condition from the anonymization request input unit. Alternatively, although the anonymization group size is input from the anonymization request input unit, it may instead be input from the user preparation request input unit. In addition, the ambiguity condition of the disclosure information is input by the anonymization request input unit. Then, as described in the second embodiment, the anonymization information generation unit 310 generates information that satisfies the condition.

<Other embodiments>
The present invention can also be realized by a process in which a program that realizes one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that realizes one or more functions.

Although preferred embodiments of the present invention have been described in detail above, the present invention is not limited to these specific embodiments.

According to the processing of the embodiments described above, privacy protection and elimination of the timing gap can both be achieved. That is, when the probability that the user will use the service is low, the precision of the disclosed information is low, so privacy is protected. As the probability that the user will use the service rises, the precision of the information rises, so the timing gap can be reduced gradually. A state in which the timing gap has been eliminated can therefore be created immediately before the service is used.

102 PCA server
201 CPU
204 External storage device
210 Monitor

Claims

An information processing apparatus comprising:
estimation means for estimating service availability for each user;
classification means for classifying a plurality of users into groups based on the availability and on a similarity, corresponding to the service, of context information of the plurality of users; and
generation means for generating, for each group, disclosure information to be disclosed to a provider of the service, based on the context information of the users included in the group.

The information processing apparatus according to claim 1, further comprising acquisition means for acquiring context information of a user,
wherein the estimation means estimates the service availability for each user based on the context information.

The information processing apparatus according to claim 1, wherein the classification means divides the plurality of users into layers based on the similarity of the availability and, for each of the divided layers, classifies the plurality of users belonging to the layer into groups based on a similarity, corresponding to the service, of the context information of those users.

The information processing apparatus according to claim 3, wherein the classification means classifies the plurality of users into groups such that the availability of each resulting group is equal to or higher than a set request probability.

The information processing apparatus according to claim 4, wherein the classification means takes, as the availability of a group, the probability that at least one of the users included in the group uses the service, and classifies the plurality of users into groups such that the availability of each group is equal to or higher than the set request probability.

The information processing apparatus according to claim 4, wherein the classification means approximates the availability of a group by the sum of the service availabilities of the plurality of users included in the group, and classifies the plurality of users into groups such that the availability of each group is equal to or higher than the set request probability.

The information processing apparatus according to claim 3, wherein the classification means classifies the plurality of users into groups based on the similarity, in accordance with a group size set for each layer.

The information processing apparatus according to claim 1, wherein the classification means determines a group size based on the availability of the group, and classifies the plurality of users into groups based on the similarity, in accordance with the determined group size.

The information processing apparatus according to claim 3, wherein, when the plurality of users have been classified into groups and the error of a group is larger than the setting error of the layer to which the group belongs, among setting errors set for each layer, the classification means moves a user included in the group to another group.

The information processing apparatus according to claim 9, wherein, when the error of the group is larger than the setting error of the layer, the classification means moves a user included in the group to a group in a layer having a lower availability interval than that layer.

The information processing apparatus, wherein the generation means generates, for each group, based on the context information of the users included in the group, the disclosure information whose anonymization is weakened in accordance with the availability of the group.

An information processing method executed by an information processing apparatus, the method comprising:
an estimation step of estimating service availability for each user;
a classification step of classifying a plurality of users into groups based on the availability and on a similarity, corresponding to the service, of context information of the plurality of users; and
a generation step of generating, for each group, disclosure information to be disclosed to a provider of the service, based on the context information of the users included in the group.

A program for causing a computer to function as each means of the information processing apparatus according to any one of claims 1 to 11.