JP2011171995A

JP2011171995A - Device, method, and program for discriminating data

Info

Publication number: JP2011171995A
Application number: JP2010033663A
Authority: JP
Inventors: Satoru Kondo; 悟近藤; Takaaki Moriya; 高明森谷; Junichi Akahani; 淳一赤埴
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2010-02-18
Filing date: 2010-02-18
Publication date: 2011-09-01

Abstract

PROBLEM TO BE SOLVED: To attain data discrimination with a small storage capacity, where a Bloom filter without the necessity of high frequency decrement is used. SOLUTION: A data discrimination device discriminates whether input data belongs to a prescribed data aggregation or not by comparing the set of hash values to be obtained by applying a plurality of hash functions to the input data with the value of a corresponding element in the Bloom filter. Each element of the Bloom filter has a plurality of bits. When the Bloom filter is updated by adding new data to the prescribed data aggregation, the value of each element in the Bloom filter corresponding to the set of the hash values to be obtained by applying the plurality of hash functions to the added data is incremented by a prescribed probability which is larger than zero and smaller than one. COPYRIGHT: (C)2011,JPO&INPIT

Description

本発明は、ネットワーク上で転送されるパケットなどのデータが所定のデータ集合に属するか否かを判別する技術に属する。 The present invention belongs to a technique for determining whether or not data such as a packet transferred on a network belongs to a predetermined data set.

従来、パケットの判別を行う技術としては、例えば、パケットのアプリケーション層データを扱って判別処理等を行う「Deep Packet Inspection」という技術がある。これは、オートマトン等を使うことで文字列パターンを高速に判定することが可能な技術である。また、可変長でないパターンを高速に検出する方法としては、ブルームフィルタ（Bloom Filter：以下「ＢＦ」とも称する。）を用いたものが存在する（非特許文献１参照）。 Conventionally, as a technique for discriminating a packet, for example, there is a technique called “Deep Packet Inspection” in which application layer data of a packet is handled and a discrimination process or the like is performed. This is a technique that can determine a character string pattern at high speed by using an automaton or the like. Further, as a method for detecting a pattern having a variable length at a high speed, there is a method using a Bloom filter (hereinafter also referred to as “BF”) (see Non-Patent Document 1).

ＢＦでは、複数のサンプルデータを正例（正解とするデータ）と負例（不正解とするデータ）に分別しておき、例えば、正例を複数のハッシュ（hash）関数にかけてバッファに格納しておくことで、新たなデータが正例に含まれるか否かを判別可能とする。 In BF, a plurality of sample data is classified into a positive example (data to be correct) and a negative example (data to be incorrect), and for example, the correct example is stored in a buffer by applying a plurality of hash functions. This makes it possible to determine whether or not new data is included in the positive example.

具体的には、ＢＦを実現する場合、例えば、最初すべて「０（ゼロ）」に初期化した長さＭビットのビット配列（初期状態のブルームフィルタ）と、ｋ個の互いに異なるハッシュ関数ｆ１，ｆ２，…，ｆｋと、複数のサンプルデータとを用いる。例えば、図１０（ａ）では、７ビットの初期状態のブルームフィルタを表している。 Specifically, when BF is realized, for example, a bit array (M Bloom filter in the initial state) of length M bits that is initially initialized to “0 (zero)” and k different hash functions f1, f2,..., fk and a plurality of sample data are used. For example, FIG. 10A shows a 7-bit Bloom filter in the initial state.

そして、あるサンプルデータについて、上記ｋ個の各ハッシュ関数により第１ハッシュ値ｈｆ１，ｈｆ２，…，ｈｆｋを得る。そして、これらの第１ハッシュ値をＭで除した第２ハッシュ値（剰余）を求め、求めたそれぞれの第２ハッシュ値に対応するビット配列上の各ビットを「１」とする処理を行う。これらを、各サンプルデータについて行う。つまり、ビット配列には多くの「１」が並ぶことになる。 Then, for a certain sample data, first hash values hf1, hf2,..., Hfk are obtained by the k hash functions. Then, a second hash value (remainder) obtained by dividing these first hash values by M is obtained, and a process is performed in which each bit on the bit array corresponding to each obtained second hash value is set to “1”. These are performed for each sample data. That is, many “1” s are arranged in the bit array.

例えば、図１０（ｂ）では、２つのサンプルデータについて、第２ハッシュ値に対応する要素（判定対象部分）の値を「１」とした７ビットのビット配列（（ｂ１）と（ｂ２））を表している。そして、図１０（ａ）に示す初期状態のブルームフィルタに対して、ビット配列（ｂ１）とビット配列（ｂ２）における「１」のビットを反映する処理を行う。そうすると、図１０（ｃ）に示すブルームフィルタとなる。 For example, in FIG. 10B, for two sample data, a 7-bit bit array ((b1) and (b2)) in which the value of the element (determination target part) corresponding to the second hash value is “1”. Represents. Then, a process of reflecting the bit “1” in the bit array (b1) and the bit array (b2) is performed on the Bloom filter in the initial state shown in FIG. Then, the Bloom filter shown in FIG.

次に、新たな判別対象データについて、同様に、上記ｋ個の各ハッシュ関数により第１ハッシュ値ｈｆ１，ｈｆ２，…，ｈｆｋを得て、さらに、これらの第１ハッシュ値をＭで除した第２ハッシュ値（剰余）を求める。そして、その求めた第２ハッシュ値のそれぞれに対応するビット配列上の各ビット値を参照し、それらがすべて「１」となっているか否かを判定する。これらがすべて「１」となっていれば、その判別対象データは、サンプルデータ集合に含まれていると判定する。一方、いずれか一つでも「１」となっていなければ、その判別対象データは、サンプルデータ集合に含まれていないと判定する。 Next, for the new discrimination target data, first hash values hf1, hf2,..., Hfk are obtained by the k hash functions, and the first hash values are divided by M. 2 Hash value (remainder) is obtained. Then, each bit value on the bit array corresponding to each of the obtained second hash values is referred to and it is determined whether or not all of them are “1”. If all of these are “1”, it is determined that the determination target data is included in the sample data set. On the other hand, if any one is not “1”, it is determined that the determination target data is not included in the sample data set.

例えば、新たな判別対象データが前記した２つのサンプルデータのうちのいずれかであり、ここではそのデータについての第２ハッシュ値に基づくビット配列が（ｄ１）（（ｂ１）と同じ）に示すようになったとする。そうすると、当該ビット配列（ｄ１）においてビットが「１」になっているのは第１要素、第４要素、第５要素である。そして、図１０（ｃ）に示すブルームフィルタにおける第１要素、第４要素、第５要素の値はすべて「１」となっているので、当該判別対象データはサンプルデータ集合に含まれていると判定する。この判定結果は正しい。 For example, the new discrimination target data is one of the two sample data described above, and here, the bit arrangement based on the second hash value for the data is shown in (d1) (same as (b1)) Suppose that Then, it is the first element, the fourth element, and the fifth element that have the bit “1” in the bit array (d1). Since the values of the first element, the fourth element, and the fifth element in the Bloom filter shown in FIG. 10C are all “1”, the determination target data is included in the sample data set. judge. This judgment result is correct.

一方、例えば、新たな判別対象データがＮＧデータ（サンプルデータ以外のデータ）であり、そのデータについての第２ハッシュ値に基づくビット配列が（ｄ２）に示すようになったとする。そうすると、当該ビット配列（ｄ２）においてビットが「１」になっているのは第２要素、第３要素、第７要素である。そして、図１０（ｃ）に示すブルームフィルタにおいて、第２要素、第７要素の値は「１」となっているが、第３要素の値は「０」であり「１」となっていないので、当該判別対象データは、サンプルデータ集合に含まれていないと判定する。この判定結果は正しい。 On the other hand, for example, it is assumed that new discrimination target data is NG data (data other than sample data), and the bit arrangement based on the second hash value for the data is as shown in (d2). Then, it is the second element, the third element, and the seventh element that have the bit “1” in the bit array (d2). In the Bloom filter shown in FIG. 10C, the values of the second element and the seventh element are “1”, but the value of the third element is “0” and not “1”. Therefore, it is determined that the determination target data is not included in the sample data set. This judgment result is correct.

しかし、非特許文献１のＢＦでは、正例にサンプルデータが追加される度に、偽陽性が高まるという性質がある。つまり、ブルームフィルタ上に「１」のビットが増えれば増えるほど、ＮＧデータについても、ブルームフィルタにおける対応する各要素の値がすべて「１」となっていて、サンプルデータ集合に含まれていると判定することがある。 However, the BF of Non-Patent Document 1 has a property that false positives increase every time sample data is added to a positive example. That is, as the number of “1” bits on the Bloom filter increases, the value of each corresponding element in the Bloom filter is also “1” for NG data, and is included in the sample data set. It may be judged.

例えば、新たな判別対象データがＮＧデータであり、そのデータについての第２ハッシュ値に基づくビット配列が図１０（ｄ）の（ｄ３）に示すようになったとする。そうすると、当該ビット配列（ｄ３）においてビットが「１」になっているのは第１要素、第２要素、第７要素である。そして、図１０（ｃ）に示すブルームフィルタにおける第１要素、第２要素、第７要素の値はすべて「１」となっているので、当該判別対象データはサンプルデータ集合に含まれていると判定する。しかし、この判定結果は誤っている。 For example, it is assumed that new discrimination target data is NG data, and the bit arrangement based on the second hash value for the data is as shown in (d3) of FIG. Then, it is the first element, the second element, and the seventh element that have the bit “1” in the bit array (d3). Since the values of the first element, the second element, and the seventh element in the Bloom filter shown in FIG. 10C are all “1”, the determination target data is included in the sample data set. judge. However, this determination result is incorrect.

これを解決するためには、例えば、カウント・ブルームフィルタ（以下、「ＣＢＦ」と称する。）と呼ばれる、ブルームフィルタの一要素に対して複数ビットを使用する方法がある（非特許文献２参照）。例えば、図１０（ｅ）では、ブルームフィルタの一要素に対して３ビットを使用したＣＢＦを示している。 In order to solve this, there is a method of using a plurality of bits for one element of a Bloom filter, for example, called a count Bloom filter (hereinafter referred to as “CBF”) (see Non-Patent Document 2). . For example, FIG. 10E shows a CBF that uses 3 bits for one element of a Bloom filter.

非特許文献２の技術では、具体的には、ブルームフィルタの一要素に複数ビットがあることを利用して、正例にサンプルデータの追加があって第２ハッシュ値の値の重複がある度に該当する要素の値をインクリメントする。さらに、所定時間の経過時に、全要素の値をデクリメントする。これにより、新しくて高頻度のサンプルデータを残した可能性が高い正例となり、偽陽性の増加を抑制することが可能となる。 Specifically, in the technique of Non-Patent Document 2, by using the fact that there is a plurality of bits in one element of the Bloom filter, there is an addition of sample data to the positive example and there is an overlap of the value of the second hash value. The value of the element corresponding to is incremented. Further, the values of all elements are decremented when a predetermined time elapses. Thereby, it becomes a positive example with high possibility of leaving the new and high frequency sample data, and it becomes possible to suppress the increase in false positives.

Sarang Dharmapurikar, et al., “Deep Packet Inspection using Parallel Bloom Filters”, IEEE Micro, IEEE Computer Society, January/February 2004, Volume24, Issue 1, p.52-61Sarang Dharmapurikar, et al., “Deep Packet Inspection using Parallel Bloom Filters”, IEEE Micro, IEEE Computer Society, January / February 2004, Volume24, Issue 1, p.52-61 Gianni Antichi, et al., “Counting Bloom Filters for Pattern Matching and Anti-Evasion at the Wire Speed”, IEEE Network Magagine, IEEE Computer Society, January/February 2009, Volume23, Issue 1, p.30-35Gianni Antichi, et al., “Counting Bloom Filters for Pattern Matching and Anti-Evasion at the Wire Speed”, IEEE Network Magagine, IEEE Computer Society, January / February 2009, Volume23, Issue 1, p.30-35

しかしながら、非特許文献２のＣＢＦでは、各要素に複数ビットを用いるため、ある程度の要素の値を数えることが可能であるが、大量に到達するデータに対しては、すぐに要素の値が上限に到達してしまうという欠点があり、頻繁にデクリメントする必要があった。そうすると、短期間であっても、その期間、未到達のデータについては、デクリメントによってＣＢＦから削除されてしまうことになるという問題があった。 However, since the CBF of Non-Patent Document 2 uses a plurality of bits for each element, it is possible to count the value of a certain amount of elements. There was a disadvantage that it reached the point, and it was necessary to decrement frequently. If it does so, even if it was a short period, there was a problem that the unreachable data in that period would be deleted from the CBF by decrement.

また、この問題を回避するために、ブルームフィルタの各要素に多くのビットを割り当てて、大量に到達するデータに対応する要素の値がすぐには上限に到達しないようにする方法もある。しかし、その場合、ＣＢＦに要する記憶容量が大きくなってしまい好ましくないという問題があった。 In addition, in order to avoid this problem, there is a method in which many bits are assigned to each element of the Bloom filter so that the value of the element corresponding to a large amount of data does not reach the upper limit immediately. However, in that case, there is a problem that the storage capacity required for the CBF increases, which is not preferable.

そこで、本発明は、前記した問題に鑑みてなされたものであり、高頻度のデクリメントを必要としないブルームフィルタを用いたデータ判別を、少ない記憶容量で実現することを課題とする。 Therefore, the present invention has been made in view of the above-described problems, and an object thereof is to realize data discrimination using a Bloom filter that does not require high-frequency decrement with a small storage capacity.

前記課題を解決するために、本発明は、入力データに複数のハッシュ関数を適用することにより得られるハッシュ値の組をブルームフィルタにおける対応する要素の値と比較し、当該得られたハッシュ値の組に対応するブルームフィルタにおける要素の値にゼロが１つもなければ、入力データが所定のデータ集合に属すると判別するデータ判別装置であって、ブルームフィルタを、要素それぞれが複数のビットを有する構成として記憶する記憶部と、所定のデータ集合に新たなデータを追加して記憶部に記憶されたブルームフィルタを更新する場合、当該追加するデータに複数のハッシュ関数を適用することにより得られるハッシュ値の組に対応するブルームフィルタにおける各要素の値を、０より大きく１より小さい所定の確率でインクリメントする処理部と、を備えることを特徴とする。 In order to solve the above problem, the present invention compares a set of hash values obtained by applying a plurality of hash functions to input data with the value of a corresponding element in a Bloom filter, and A data discriminating apparatus for discriminating that input data belongs to a predetermined data set if there is no zero in an element value in a Bloom filter corresponding to a set, wherein the Bloom filter has elements each having a plurality of bits When updating a Bloom filter stored in the storage unit by adding new data to a predetermined data set and a storage unit stored as a hash value obtained by applying a plurality of hash functions to the added data The value of each element in the Bloom filter corresponding to the set of is incremented with a predetermined probability greater than 0 and less than 1. Characterized in that it comprises a processing unit for cement, the.

かかる発明によれば、所定のデータ集合に新たなデータを追加してブルームフィルタを更新する場合、追加するデータに前記複数のハッシュ関数を適用することにより得られるハッシュ値の組に対応するブルームフィルタにおける各要素の値を、０より大きく１より小さい所定の確率でインクリメントすることで、各要素の値が上限値に達する可能性を低減させることができ、高頻度のデクリメントを必要としないブルームフィルタを用いたデータ判別を、少ない記憶容量で実現することができる。 According to this invention, when new data is added to a predetermined data set to update the Bloom filter, the Bloom filter corresponding to a set of hash values obtained by applying the plurality of hash functions to the added data. By incrementing the value of each element at a predetermined probability greater than 0 and less than 1, the possibility that the value of each element reaches the upper limit value can be reduced, and a Bloom filter that does not require frequent decrement Can be realized with a small storage capacity.

また、本発明は、処理部が、ブルームフィルタにおける各要素の値を所定の確率でインクリメントする場合、要素の値が大きいほど当該要素の値を低い確率でインクリメントすることを特徴とする。 In addition, the present invention is characterized in that, when the value of each element in the Bloom filter is incremented with a predetermined probability, the processing unit increments the value of the element with a lower probability as the element value is larger.

かかる発明によれば、ブルームフィルタにおける各要素の値を所定の確率でインクリメントする場合、要素の値が大きいほど当該要素の値を低い確率でインクリメントすることで、各要素の値が上限値に達する可能性をさらに低減することができ、また、要素の値が小さい場合に高い確率でインクリメントするのでブルームフィルタの判別精度をさらに上げることができる。 According to this invention, when the value of each element in the Bloom filter is incremented with a predetermined probability, the value of each element reaches the upper limit value by incrementing the value of the element with a lower probability as the element value increases. The possibility can be further reduced, and when the value of the element is small, it is incremented with a high probability, so that the discrimination accuracy of the Bloom filter can be further increased.

また、本発明は、処理部が、所定時間ごとに、すべての要素の値をデクリメントすることを特徴とする。 Further, the present invention is characterized in that the processing unit decrements the values of all elements every predetermined time.

かかる発明によれば、所定時間ごとにデクリメントすることにより、軽い処理負担で、各要素の値が上限値に達してから長時間が経過する事態を回避つつ、各要素で実質的にカウントできる数を増やすことができる。 According to such an invention, by decrementing every predetermined time, a number that can be substantially counted by each element while avoiding a situation in which a long time elapses after the value of each element reaches the upper limit with a light processing load. Can be increased.

また、本発明は、処理部が、ブルームフィルタにおける要素の値のいずれかが上限値に達したときに、すべての要素の値をデクリメントすることを特徴とする。 The present invention is also characterized in that the processing unit decrements all element values when any of the element values in the Bloom filter reaches an upper limit value.

かかる発明によれば、ブルームフィルタにおける要素の値のいずれかが上限値に達したときにデクリメントすることにより、各要素の値を適切なタイミングでデクリメントすることができ、各要素の値が上限値に達してから長時間が経過する事態をより確実に回避つつ、各要素で実質的にカウントできる数を増やすことができる。 According to this invention, when any of the element values in the Bloom filter reaches the upper limit value, the value of each element can be decremented at an appropriate timing, and the value of each element becomes the upper limit value. It is possible to increase the number that can be substantially counted by each element while more reliably avoiding a situation in which a long time has passed since reaching the value.

また、本発明は、記憶部が、所定のＩＰアドレスごとに対応したブルームフィルタを記憶しており、処理部が、入力データを受け付けたとき、当該入力データの宛先ＩＰアドレスに対応するブルームフィルタを用いて、判別を行うことを特徴とする。 In the present invention, the storage unit stores a Bloom filter corresponding to each predetermined IP address, and when the processing unit accepts input data, the Bloom filter corresponding to the destination IP address of the input data is stored. And performing the discrimination.

かかる発明によれば、入力データの宛先ＩＰアドレスごとに別々のブルームフィルタを用いることで、ブルームフィルタの各要素の値が上限値に達する可能性をさらに低減することができる。 According to this invention, it is possible to further reduce the possibility that the value of each element of the Bloom filter reaches the upper limit value by using a separate Bloom filter for each destination IP address of the input data.

また、本発明は、コンピュータをデータ判別装置として動作させるためのプログラムである。かかる発明によれば、このプログラムをインストールしたコンピュータをデータ判別装置として動作させることができる。 The present invention is also a program for operating a computer as a data discrimination device. According to this invention, a computer in which this program is installed can be operated as a data discrimination device.

本発明によれば、高頻度のデクリメントを必要としないブルームフィルタを用いたデータ判別を、少ない記憶容量で実現することができる。 According to the present invention, data discrimination using a Bloom filter that does not require frequent decrementing can be realized with a small storage capacity.

本実施形態におけるデータ判別装置と他の装置との接続関係を示す図である。It is a figure which shows the connection relation of the data discrimination | determination apparatus in this embodiment, and another apparatus. データ判別装置の機能構成図である。It is a functional block diagram of a data discrimination device. 端末とデータ判別装置の動作のシーケンス図である。It is a sequence diagram of operation | movement of a terminal and a data discrimination device. 正／負例用バッファへのデータ追加の説明図である。It is explanatory drawing of the data addition to the buffer for positive / negative examples. 正／負例用バッファへデータを追加する際のインクリメント確率の説明図である。It is explanatory drawing of the increment probability at the time of adding data to the buffer for positive / negative examples. 正／負例用バッファにおけるデクリメントの説明図である。It is explanatory drawing of the decrement in the buffer for positive / negative examples. 正／負例用バッファへデータを追加する動作を示すフローチャートである。It is a flowchart which shows the operation | movement which adds data to the buffer for positive / negative examples. データ判別装置の適用例の説明図である。It is explanatory drawing of the example of application of a data discrimination device. データ判別装置の適用例の説明図である。It is explanatory drawing of the example of application of a data discrimination device. ブルームフィルタの説明図である。It is explanatory drawing of a Bloom filter.

以下、本発明を実施するための形態（以下、「実施形態」と称する。）について、図面を参照（言及図以外の図面も適宜参照）して詳細に説明する。 DESCRIPTION OF THE PREFERRED EMBODIMENTS Hereinafter, modes for carrying out the present invention (hereinafter referred to as “embodiments”) will be described in detail with reference to the drawings (refer to drawings other than the referenced drawings as appropriate).

図１に示すように、端末２ａ（２）、２ｂ（２）は、ネットワーク４を介してサーバ３に対してリクエストやレスポンスのパケットを送受信することでアクセスを行い、その経路上にデータ判別装置１が配置されている。 As shown in FIG. 1, the terminals 2a (2) and 2b (2) access the server 3 by transmitting and receiving request and response packets via the network 4, and the data discriminating apparatus on the path. 1 is arranged.

図２に示すように、コンピュータ装置であるデータ判別装置１は、パケット受信部１１、パケット組立部１２、データハッシュ部１３（処理部）、フィルタＤＢ（Data Base）１４（記憶部）、フィルタ検査部１５（処理部）、リダイレクト生成部１６、パケット転送部１７、認証部１８を備え、それらはＣＰＵ（Central Processing Unit）、ＲＡＭ（Random Access Memory）、ＲＯＭ（Read Only Memory）、ＨＤＤ（Hard Disk Drive）、各種インタフェースなどにより実現される。各部の動作等は、図３を参照して後記する。 As shown in FIG. 2, the data discriminating apparatus 1 which is a computer apparatus includes a packet receiving unit 11, a packet assembling unit 12, a data hash unit 13 (processing unit), a filter DB (Data Base) 14 (storage unit), and a filter check. Unit 15 (processing unit), redirect generation unit 16, packet transfer unit 17, and authentication unit 18, which are CPU (Central Processing Unit), RAM (Random Access Memory), ROM (Read Only Memory), HDD (Hard Disk) Drive) and various interfaces. The operation of each part will be described later with reference to FIG.

なお、この図２では、端末２からのパケットをパケット受信部１１で受信して、パケット転送部１７からサーバ３にパケットを送信するものとしたが、逆に、サーバ３からのパケットをパケット受信部１１で受信し、パケット転送部１７から端末２にパケットを送信する場合も、同様の処理が行われる。 In FIG. 2, the packet reception unit 11 receives the packet from the terminal 2 and transmits the packet from the packet transfer unit 17 to the server 3, but conversely, the packet reception from the server 3 is received. Similar processing is performed when the packet is received by the unit 11 and transmitted from the packet transfer unit 17 to the terminal 2.

次に、図３を参照して、データ判別装置１において、正例にサンプルデータを追加する動作のシーケンスについて説明する。 Next, with reference to FIG. 3, the sequence of the operation | movement which adds sample data to a positive example in the data discrimination device 1 is demonstrated.

＜（ａ）条件不適合時の処理＞
図３に示すように、端末２から送信されたパケット（ストリーム）は、パケット受信部１１で受信され、パケット組立部１２に渡される（ステップＳ１１）。パケット組立部１２では、受信した複数のパケットをＬ（layer）３もしくはＬ４レベルでデータに構築し（ステップＳ１２）、その構築したデータを次のデータハッシュ部１３に渡す（ステップＳ１３）。 <(A) Processing when conditions are not met>
As shown in FIG. 3, the packet (stream) transmitted from the terminal 2 is received by the packet receiving unit 11 and passed to the packet assembling unit 12 (step S11). The packet assembling unit 12 constructs the received plurality of packets into data at the L (layer) 3 or L4 level (step S12), and passes the constructed data to the next data hash unit 13 (step S13).

データハッシュ部１３では、構築されたデータの特定の箇所を用いてハッシュ関数を適用し、ビット列を生成し、フィルタ検査部１５に渡す（ステップＳ１４）。フィルタ検査部１５は、フィルタＤＢ１４に要求して（ステップＳ１５）正負例用バッファ（正例用バッファと負例用バッファ）を参照しながら（ステップＳ１６）、このビット列が正例に該当するか否かを判定する。なお、ステップＳ１６では、正例用バッファのみを参照して、このビット列が正例に該当するか否かを判定してもよい。 In the data hash part 13, a hash function is applied using the specific location of the constructed | assembled data, a bit sequence is produced | generated, and it passes to the filter test | inspection part 15 (step S14). The filter checking unit 15 makes a request to the filter DB 14 (step S15) while referring to the positive / negative example buffers (positive example buffer and negative example buffer) (step S16), and whether or not this bit string corresponds to the positive example. Determine whether. In step S16, it may be determined whether or not this bit string corresponds to the positive example with reference to only the positive example buffer.

フィルタ検査部１５は、このビット列を正例と判定しなかった（条件に不適合であった）場合（ステップＳ１７）、構築前のパケットを破棄し、リダイレクト生成部１６で生成したリダイレクトパケットを認証部１８に渡す（ステップＳ１８）。 When the filter checking unit 15 does not determine that this bit string is a positive example (it is incompatible with the conditions) (step S17), the filter checking unit 15 discards the packet before construction and recognizes the redirect packet generated by the redirect generation unit 16 as an authentication unit. 18 (step S18).

＜（ｂ）認証処理＞
認証部１８は、端末２に、ポップアップ等からなる認証画面を表示させ、ステップＳ１７で条件に不適合であったデータをサンプルデータとして正例用バッファに追加登録するか否かを問い合わせる（ステップＳ２１）。ユーザの操作によって端末２からＹｅｓの返信があった場合（ステップＳ２２）、認証部１８は、そのデータに対応するビット列（データハッシュ部１３が作成。詳細は後記）を正例用バッファに追加する（ステップＳ２３）。また、認証部１８は、リダイレクトパケットを端末２に送信する（ステップＳ２４）。 <(B) Authentication process>
The authentication unit 18 causes the terminal 2 to display an authentication screen including a pop-up or the like, and inquires whether or not to additionally register the data that does not meet the conditions in step S17 as sample data in the positive example buffer (step S21). . When a reply from the terminal 2 is received by the user's operation (step S22), the authentication unit 18 adds a bit string corresponding to the data (created by the data hash unit 13, details will be described later) to the positive example buffer. (Step S23). Further, the authentication unit 18 transmits a redirect packet to the terminal 2 (step S24).

＜（ｃ）条件適合時の処理＞
リダイレクトパケットを受信した端末２は、再度同じパケット（ストリーム）を送信する（ステップＳ３１）。ステップＳ３２〜Ｓ３６の処理はステップＳ１２〜Ｓ１６の処理と同様であるため、説明を省略する。 <(C) Processing when conditions are met>
The terminal 2 that has received the redirect packet transmits the same packet (stream) again (step S31). Since the process of steps S32 to S36 is the same as the process of steps S12 to S16, the description thereof is omitted.

ステップＳ３６の後、フィルタ検査部１５は、受信したビット列が正例に該当するか否かを判定するが、ステップＳ１７の場合と異なり、正例用バッファに当該データに対応するビット列が追加登録されているため、受信したビット列は正例用バッファ（条件）に適合する（ステップＳ３７）。 After step S36, the filter checking unit 15 determines whether or not the received bit string corresponds to the positive example, but unlike the case of step S17, the bit string corresponding to the data is additionally registered in the positive example buffer. Therefore, the received bit string conforms to the positive example buffer (condition) (step S37).

フィルタ検査部１５は当該データをパケット転送部１７に渡し（ステップＳ３８）、パケット転送部１７は当該データを本来の宛先のサーバ３に転送する（ステップＳ３９）。 The filter inspection unit 15 passes the data to the packet transfer unit 17 (step S38), and the packet transfer unit 17 transfers the data to the original destination server 3 (step S39).

なお、ステップＳ２２でＮｏの返信があった場合、ステップＳ２３で当該データに対応するビット列を負例用バッファに追加登録するようにしてもよい。 If there is a reply of No in step S22, a bit string corresponding to the data may be additionally registered in the negative example buffer in step S23.

次に、図３のシーケンスのステップＳ２３の処理イメージについて、図４を参照して説明する。図４に示すように、まず、データハッシュ部１３で生成されたビット列（以下、「生成ビット列」とも称する。）は、７要素、３ビット／要素であり、また、ＢＦ用のハッシュ関数が４つである場合を例にとって説明する。 Next, the processing image of step S23 in the sequence of FIG. 3 will be described with reference to FIG. As shown in FIG. 4, first, the bit string generated by the data hash unit 13 (hereinafter also referred to as “generated bit string”) is 7 elements, 3 bits / element, and the hash function for BF is 4 An example will be described.

なお、ＢＦ用のハッシュ関数が４つであるので、生成ビット列における７要素のうち４要素の値が「００１」となっている。また、特許請求の範囲における「ハッシュ値の組」とは、生成ビット列において「００１」となっている要素の組（前記した第２ハッシュ値の組）のことを指す。 Since there are four hash functions for BF, the value of four elements among the seven elements in the generated bit string is “001”. In addition, the “set of hash values” in the claims refers to a set of elements that are “001” in the generated bit string (the set of second hash values described above).

ステップＳ２２の返信内容がＹｅｓの場合、生成ビット列は正例用バッファに追加される。なお、ステップＳ２２の返信内容がＮｏの場合、生成ビット列を負例用バッファに追加するようにしてもよい。 When the reply content in step S22 is Yes, the generated bit string is added to the positive example buffer. If the reply content in step S22 is No, the generated bit string may be added to the negative example buffer.

次に、図５を参照して、正／負例用バッファ（正例用バッファまたは負例用バッファ。以下、単に「バッファ」とも称する。）へデータを追加する際の各要素におけるインクリメント確率について説明する。図５に示すように、正／負例用バッファに生成ビット列を追加する場合、その追加は要素毎に行われ、インクリメント確率も要素毎に独立に計算される。 Next, referring to FIG. 5, the increment probability in each element when data is added to the positive / negative example buffer (positive example buffer or negative example buffer; hereinafter also simply referred to as “buffer”). explain. As shown in FIG. 5, when the generated bit string is added to the positive / negative example buffer, the addition is performed for each element, and the increment probability is calculated independently for each element.

各要素におけるインクリメント確率は、現在バッファの各要素に入っている要素の値を基に計算される。ここでは、インクリメント確率が、所定の基底（ｍ：０＜ｍ＜１）の要素の値乗に比例する場合を例にとって説明する。例えば、図５に示すように、第１要素では、要素の値が「０１１（１０進数で「３」）」なので、「ｍ^３」がインクリメント確率となる。なお、第５要素のように、要素の値が「１１１」と上限値（最大値）となっている場合は、例外的にインクリメント確率を「０」とする。 The increment probability for each element is calculated based on the value of the element currently contained in each element of the buffer. Here, a case where the increment probability is proportional to the value power of an element of a predetermined base (m: 0 <m <1) will be described as an example. For example, as shown in FIG. 5, in the first element, since the value of the element is “011 (“ 3 ”in decimal number)”, “m ³ ” is the increment probability. When the element value is “111” and the upper limit (maximum value) as in the fifth element, the increment probability is exceptionally set to “0”.

このようにすると、要素の値が大きいほどインクリメント確率が下がるため、要素の値が上限値に達する確率を低く抑えることができる。つまり、全要素の値をデクリメントする頻度を大きく低減することができる。 In this way, since the increment probability decreases as the element value increases, the probability that the element value reaches the upper limit value can be kept low. That is, the frequency of decrementing the values of all elements can be greatly reduced.

次に、正／負例用バッファにおけるデクリメントについて説明する。図６に示すように、しばらく時間がたって正／負例用バッファに何個かのビット列が追加された後、正／負例用バッファをデクリメントする場合、全要素の値について、一律に「１」を減算する。ただし、すでに「０」になっている要素については、そのまま「０」とする。 Next, decrement in the positive / negative example buffer will be described. As shown in FIG. 6, when several bits are added to the positive / negative example buffer after a while and then the positive / negative example buffer is decremented, the values of all elements are uniformly “1”. "Is subtracted. However, elements that are already “0” are set to “0” as they are.

これにより、各要素における格納個数の上限を１／ｍ倍（ｍが１／２なら２倍）したことと同様の効果を得ることができる。つまり、各要素の値に関する相対的な比率を変化させずに、バッファへの新しいビット列の反映を残すことができる。 As a result, the same effect as that obtained by multiplying the upper limit of the number of storage in each element by 1 / m times (or twice if m is 1/2) can be obtained. That is, it is possible to leave a new bit string reflected in the buffer without changing the relative ratio of the values of the elements.

次に、正／負例用バッファへデータを追加する動作（図３のステップＳ２３に相当）について説明する。図７に示すように、データハッシュ部１３は、パケットから構築したデータにハッシュ関数を適用することでビット列を生成し（ステップＳ１０１）、ビット列中の各要素についてステップＳ１０３、Ｓ１０４の処理を行う（ステップＳ１０２〜Ｓ１０５）。 Next, the operation of adding data to the positive / negative example buffer (corresponding to step S23 in FIG. 3) will be described. As shown in FIG. 7, the data hash unit 13 generates a bit string by applying a hash function to the data constructed from the packet (step S101), and performs the processes of steps S103 and S104 for each element in the bit string ( Steps S102 to S105).

データハッシュ部１３は、ある要素について、ビット列のビットが立っている（「１」である）か否かを判定し（ステップＳ１０３）、ＹｅｓであればステップＳ１０４に進み、ＮｏであればステップＳ１０４をスキップして次の要素の処理（次のステップＳ１０３）に進む。 The data hash unit 13 determines whether or not a bit of the bit string is raised (“1”) for a certain element (step S103). If Yes, the process proceeds to step S104, and if No, step S104 is performed. Is skipped and the process proceeds to the next element process (next step S103).

データハッシュ部１３は、図５を参照して説明したように、要素の値に応じたインクリメント確率でその要素の値をインクリメントし（ステップＳ１０４）、次の要素の処理（次のステップＳ１０３）に進む。 As described with reference to FIG. 5, the data hash unit 13 increments the value of the element with an increment probability corresponding to the value of the element (step S 104), and proceeds to processing of the next element (next step S 103). move on.

このようにして、本実施形態のデータ判別装置１によれば、ブルームフィルタを用いたデータ判別において、各要素の値を所定のインクリメント確率でインクリメントするため、高頻度のデクリメントを必要としないブルームフィルタを用いたデータ判別を、少ない記憶容量で実現することができる。 Thus, according to the data discrimination device 1 of the present embodiment, in the data discrimination using the Bloom filter, the value of each element is incremented with a predetermined increment probability, so that the Bloom filter that does not require frequent decrement is required. Can be realized with a small storage capacity.

つまり、例えば、バッファ（ブルームフィルタ）の各要素が３ビットの場合、従来技術では、ある要素についてビットの立ったビット列が７つ適用されると、バッファにおけるその要素の値が上限値（２進数で「１１１」）に達する。これに対して、本実施形態のデータ判別装置１では、同様の条件で、基底ｍが「１／２」の場合、要素の値が「０００」から「００１」になるのにビット列が１つ、要素の値が「００１」から「０１０」になるのにビット列が平均２つ、要素の値が「０１０」から「０１１」になるのにビット列が平均４つ、以下同様に、要素の値が１つ増えるのに、平均８つ、平均１６個、平均３２個、平均６４個、つまり、バッファにおけるその要素の値が上限値に達するまでにビット列が合計１２７個必要となる。 That is, for example, when each element of a buffer (Bloom filter) is 3 bits, in the conventional technique, when seven bit strings with bits are applied to a certain element, the value of that element in the buffer is an upper limit value (binary number). To "111"). On the other hand, in the data discriminating apparatus 1 according to the present embodiment, when the base m is “1/2” under the same conditions, one bit string is used even though the element value is changed from “000” to “001”. , The average value of the bit string is 2 for the element value from “001” to “010”, the average of the bit string is 4 for the element value from “010” to “011”, and so on. In order to increase one, the average of 8, the average of 16, the average of 32, and the average of 64, that is, a total of 127 bit strings are required until the value of the element in the buffer reaches the upper limit value.

一般化して言えば、バッファ（ブルームフィルタ）の各要素がｉビットの場合、従来技術では、ある要素についてビットの立ったビット列が（２^ｉ−１）個適用されると、バッファにおけるその要素の値が上限値に達する。これに対して、本実施形態のデータ判別装置１では、同様の条件で、基底ｍの場合、ビット列が｛（１／ｍ）^（２^ｉ−１）−１｝個適用されると、バッファにおけるその要素の値が上限値に達する。このように、本実施形態のデータ判別装置１におけるブルームフィルタによれば、従来技術に比較して、各要素のビット数が同じでも、各要素で実質的にカウントできる数を飛躍的に大きくできるという効果を奏する。 Generally speaking, when each element of a buffer (bloom filter) is i bits, according to the prior art, when (2 ⁱ −1) bit strings having a bit are applied to an element, The value reaches the upper limit. On the other hand, in the data discriminating apparatus 1 of the present embodiment, when {(1 / m) ^ (2 ⁱ −1) −1} bit strings are applied in the case of the base m under the same conditions, the buffer The value of that element in reaches the upper limit. Thus, according to the Bloom filter in the data discriminating apparatus 1 of the present embodiment, even if the number of bits of each element is the same, the number that can be substantially counted by each element can be dramatically increased as compared with the prior art. There is an effect.

なお、基底ｍは、データ判別装置１の管理者が「０＜ｍ＜１」を満たす範囲で適宜設定すればよい。具体的には、要素の値が上限値に達する確率の低減を重視するのであればｍを小さい値に設定すればよいし、また、デクリメントによる情報損失の可能性の低減を重視するのであればｍを大きい値に設定すればよい。なお、特許請求の範囲において「ブルームフィルタにおける各要素の値を、０より大きく１より小さい所定の確率でインクリメントする」と記載しているが、これは、要素の値が０のときに１の確率でインクリメントしたり、要素の値が上限値のときにインクリメントしないことまで排除しているわけではない。 The base m may be set as appropriate as long as the administrator of the data discrimination device 1 satisfies “0 <m <1”. Specifically, if importance is placed on reducing the probability that an element value will reach the upper limit value, m may be set to a small value, and if importance is placed on reducing the possibility of information loss due to decrement. What is necessary is just to set m to a large value. In the claims, “the value of each element in the Bloom filter is incremented with a predetermined probability larger than 0 and smaller than 1”, but this is 1 when the value of the element is 0. It is not excluded that it is incremented by probability or not incremented when the value of the element is the upper limit value.

また、デクリメントのタイミングは、例えば、所定時間ごとに、あるいは、ブルームフィルタにおける要素の値のいずれかが上限値に達したときなどであればよい。例えば、所定時間ごとにデクリメントする場合は、処理負担が軽くて済む。また、ブルームフィルタにおける要素の値のいずれかが上限値に達したときにデクリメントする場合は、各要素の値が上限値に達してから長時間が経過する事態をより確実に回避できる。 The decrement timing may be, for example, every predetermined time or when any of the element values in the Bloom filter reaches the upper limit value. For example, when decrementing every predetermined time, the processing load can be reduced. Further, when decrementing when any of the element values in the Bloom filter reaches the upper limit value, it is possible to more reliably avoid a situation where a long time elapses after the value of each element reaches the upper limit value.

なお、データ判別装置１に実行させるためのプログラムを作成し、データ判別装置１にインストールすれば、データ判別装置１は、そのプログラムに基づいた各機能を実現することができる。 If a program for causing the data discriminating apparatus 1 to be executed is created and installed in the data discriminating apparatus 1, the data discriminating apparatus 1 can realize each function based on the program.

（適用例）
次に、本実施形態のデータ判別装置１の適用例について説明する。図８は、ＮＡＴ（Network Address Translation）装置５やプロキシ装置６に関するユーザをネットワーク４上で判別するシステムにデータ判別装置１を適用した場合を表している。 (Application example)
Next, an application example of the data discrimination device 1 according to the present embodiment will be described. FIG. 8 shows a case where the data discriminating apparatus 1 is applied to a system that discriminates users related to a NAT (Network Address Translation) apparatus 5 and a proxy apparatus 6 on the network 4.

ネットワーク４上には、多くのＮＡＴ装置５やプロキシ装置６が存在し、その総ユーザ数は膨大である。したがって、本実施形態のデータ判別装置１であっても、すべてのＮＡＴ装置５やプロキシ装置６に対して単一の正／負例用バッファを構築するよりは、正／負例用バッファをいくつかに分割したほうが、偽陽性をより低減できる。 There are many NAT devices 5 and proxy devices 6 on the network 4, and the total number of users is enormous. Therefore, even in the data discriminating apparatus 1 of the present embodiment, the number of positive / negative example buffers can be increased rather than constructing a single positive / negative example buffer for all NAT apparatuses 5 and proxy apparatuses 6. Dividing into crab can reduce false positives more.

そこで、ＮＡＴ装置５やプロキシ装置６のＩＰアドレスごとに、正／負例用バッファを持つようにする。例えば、データ判別装置１では、ＩＰアドレスが「Ａ」のＮＡＴ装置５について、また、ＩＰアドレスが「Ｂ」のプロキシ装置６について、別々の正／負例用バッファ（正例用バッファと負例用バッファ）を持つようにする。 Therefore, a positive / negative example buffer is provided for each IP address of the NAT device 5 or the proxy device 6. For example, in the data discriminating apparatus 1, separate positive / negative example buffers (positive example buffer and negative example) are used for the NAT device 5 whose IP address is “A” and for the proxy device 6 whose IP address is “B”. Have a buffer).

また、データハッシュ部１３でハッシュ関数を適用する対象としては、ＨＴＴＰ（Hyper Text Transfer Protocol）リクエストに含まれるＣｏｏｋｉｅ（ＨＴＴＰボディの一部）を想定する。同じドメインの宛先に関しては、基本的に同じＣｏｏｋｉｅが付与されることが多いため、ユーザの区別が可能となるからである。 In addition, a cookie (part of the HTTP body) included in an HTTP (Hyper Text Transfer Protocol) request is assumed as a target to which the hash function is applied in the data hash unit 13. This is because, basically, the same cookie is often assigned to destinations in the same domain, so that users can be distinguished.

また、図９は、データ判別装置１を用いたサービス例を表している。図９（ａ）の例では、データ判別装置１がネットワーク４（ＮＷ）上のファイアウォール装置７と連携し、ＮＡＴ装置５のユーザを区別して、特定のユーザからのアクセスの場合にファイアウォール装置７において通信を遮断（「×」の箇所）する。 FIG. 9 shows a service example using the data discrimination device 1. In the example of FIG. 9A, the data discrimination device 1 cooperates with the firewall device 7 on the network 4 (NW), distinguishes the user of the NAT device 5, and the firewall device 7 in the case of access from a specific user. Block communication (“×”).

また、図９（ｂ）では、データ判別装置１は、ネットワーク４上に存在するＤＰＩ（Deep Packet Inspection）装置８及び広告サーバ９と連携して、プロキシ装置６のユーザに応じて適切な広告を挿入するサービスを表している。なお、ＤＰＩ装置８は、転送されるパケットの中身を書き換えることが可能なネットワーク装置である。データ判別装置１によって識別されたユーザ情報をこのＤＰＩ装置８に通知し、それに合わせた広告データを広告サーバ９によって挿入するようにパケットを改変することで、ネットワーク側での広告配信を実現することが可能となる。 Further, in FIG. 9B, the data discrimination device 1 cooperates with a DPI (Deep Packet Inspection) device 8 and an advertisement server 9 existing on the network 4 to perform an appropriate advertisement according to the user of the proxy device 6. Represents the service to be inserted. The DPI device 8 is a network device that can rewrite the contents of the transferred packet. Realizing advertisement distribution on the network side by notifying the user information identified by the data discriminating apparatus 1 to the DPI apparatus 8 and modifying the packet so that the advertisement server 9 inserts advertisement data corresponding thereto. Is possible.

以上で本実施形態の説明を終えるが、本発明の態様はこれらに限定されるものではない。
例えば、インクリメント確率は、前記した例のように、要素の値が大きいほど低い確率になるようにしてもよいが、これに限定されない。例えば、バッファの各要素が３ビットの場合、要素の値が「０００」から「０１０」まではインクリメント確率を２／３とし、「０１１」から「１１１」まではインクリメント確率を１／３とするなど、インクリメント確率を段階的に設定してもよい。その他、インクリメント確率は、データ判別装置１の管理者が所定の確率に適宜設定すればよい。
また、データ判別装置１において、複数のブルームフィルタと複数の並列プロセッサを用意し、データ判別に関して並列処理を行うようにしてもよい。 Although description of this embodiment is finished above, the aspect of the present invention is not limited to these.
For example, the increment probability may be lower as the element value is larger, as in the above example, but is not limited to this. For example, when each element of the buffer is 3 bits, the increment probability is 2/3 when the element value is “000” to “010”, and the increment probability is 1/3 when “011” to “111”. For example, the increment probability may be set stepwise. In addition, the increment probability may be appropriately set to a predetermined probability by the administrator of the data discrimination device 1.
Further, in the data discriminating apparatus 1, a plurality of Bloom filters and a plurality of parallel processors may be prepared to perform parallel processing regarding data discrimination.

また、本発明は、パケットのトラフィック傾向の把握を可能とするため、トラフィック解析の技術に適用することができる。
また、本発明は、固定計算量によるデータ識別を行うため、機械学習の分野に適用することができる。 Further, the present invention can be applied to a traffic analysis technique in order to make it possible to grasp a traffic trend of a packet.
Further, the present invention can be applied to the field of machine learning because data identification is performed with a fixed calculation amount.

また、本発明は、大量のパケットを対象に高速識別が可能であるため、ストリーム処理技術に適用することができる。
その他、具体的な構成について、本発明の主旨を逸脱しない範囲で適宜変更が可能である。 Further, the present invention can be applied to a stream processing technique because high-speed identification is possible for a large number of packets.
In addition, about a concrete structure, it can change suitably in the range which does not deviate from the main point of this invention.

１データ判別装置
２ａ、２ｂ、２端末
３サーバ
４ネットワーク
５ＮＡＴ装置
６プロキシ装置
７ファイアウォール装置
８ＤＰＩ装置
９広告サーバ
１１パケット受信部
１２パケット組立部
１３データハッシュ部（処理部）
１４フィルタＤＢ（記憶部）
１５フィルタ検査部（処理部）
１６リダイレクト生成部
１７パケット転送部
１８認証部 DESCRIPTION OF SYMBOLS 1 Data discrimination apparatus 2a, 2b, 2 terminal 3 Server 4 Network 5 NAT apparatus 6 Proxy apparatus 7 Firewall apparatus 8 DPI apparatus 9 Advertising server 11 Packet receiving part 12 Packet assembly part 13 Data hash part (processing part)
14 Filter DB (storage unit)
15 Filter inspection unit (processing unit)
16 redirect generation unit 17 packet transfer unit 18 authentication unit

Claims

A set of hash values obtained by applying a plurality of hash functions to input data is compared with the value of the corresponding element in the Bloom filter, and the value of the element in the Bloom filter corresponding to the obtained set of hash values is compared. A data discriminating apparatus that discriminates that the input data belongs to a predetermined data set if there is no zero.
A storage unit for storing the Bloom filter as a configuration in which each of the elements has a plurality of bits;
When updating the Bloom filter stored in the storage unit by adding new data to the predetermined data set, it corresponds to a set of hash values obtained by applying the plurality of hash functions to the added data A processing unit that increments the value of each element in the Bloom filter with a predetermined probability greater than 0 and less than 1;
A data discriminating apparatus comprising:

The processor is
The data determination apparatus according to claim 1, wherein when the value of each element in the Bloom filter is incremented with a predetermined probability, the value of the element is incremented with a lower probability as the element value is larger.

The processor is
The data discriminating apparatus according to claim 1 or 2, wherein the values of all the elements are decremented every predetermined time.

The processor is
The data discriminating apparatus according to claim 1 or 2, wherein when any of the element values in the Bloom filter reaches an upper limit value, the values of all the elements are decremented.

The storage unit
The Bloom filter corresponding to each predetermined IP (Internet Protocol) address is stored,
The processor is
The data according to any one of claims 1 to 4, wherein when the input data is received, the determination is performed using the Bloom filter corresponding to a destination IP address of the input data. Discriminator.

A set of hash values obtained by applying a plurality of hash functions to input data is compared with the value of the corresponding element in the Bloom filter, and the value of the element in the Bloom filter corresponding to the obtained set of hash values is compared. A data discriminating method by a data discriminating apparatus that discriminates that the input data belongs to a predetermined data set if there is no zero.
The data determination device includes a storage unit that stores the Bloom filter as a configuration in which each of the elements has a plurality of bits, and a processing unit.
The processor is
When updating the Bloom filter stored in the storage unit by adding new data to the predetermined data set, it corresponds to a set of hash values obtained by applying the plurality of hash functions to the added data A value of each element in the Bloom filter is incremented with a predetermined probability greater than 0 and less than 1. A data discrimination method.

A program for causing a computer to operate as the data discriminating apparatus according to any one of claims 1 to 5.