JP2012242975A

JP2012242975A - Distributed parallel processing cache device and method, resource management node and program

Info

Publication number: JP2012242975A
Application number: JP2011110847A
Authority: JP
Inventors: Tsuyoshi Ozawa; 健史小沢; Kazutaka Morita; 和孝森田
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2011-05-17
Filing date: 2011-05-17
Publication date: 2012-12-10

Abstract

PROBLEM TO BE SOLVED: To process a job at high speed and speed up processing which has been done in the past when the job includes a task which is to be done more than one time.SOLUTION: A resource management node requests a hash value of input data from a calculation node having the input data, acquires it with an execution file and assigns a task to the calculation node having the input data. The calculation node executes the assigned task and stores a result in association with the execution file and the input data in calculation result temporal storage means. The resource management node acquires the execution file and the input data as an execution result and stores them in association with the calculation node in calculation result storage means. When assigning a task to a calculation node, the management node searches the calculation result storage means by using the hash value of the input data and the execution file as keys and skips processing of the task if finding a calculation result of the task executed by the calculation node.

Description

本発明は、分散並列処理キャッシュ装置及び方法及び資源管理ノード及びプログラムに係り、特に、分散処理フレームワークにおける分散並列処理を行う際に、入力データを有する計算ノードにタスクを割り当てるためのキャッシュ装置及び方法及び資源管理ノード及びプログラムに関する。 The present invention relates to a distributed parallel processing cache device and method, a resource management node, and a program, and more particularly to a cache device for assigning a task to a computation node having input data when performing distributed parallel processing in a distributed processing framework. The present invention relates to a method, a resource management node, and a program.

分散処理フレームワークにおいて、PCクラスタ上にファイルを分割して保存する分散記憶装置を前提としたデータ解析基盤ソフトウェアがある。これは、図１に示すように、有向グラフの形式で動作フローを記述することで、互いに依存関係のない入力データに対して並列計算を行うプログラムを実行基盤とするものである。図１において、辺がデータの流れであり、頂点が行う処理内容を意味する。頂点上で実行される処理をタスクという。実行時は、分散ファイルシステムにアクセスし、入力ファイルを処理単位に分割する初期化処理を行い、分割したファイルが保存されているデータノードに近い計算ノードに、処理内容が記述されたプログラムを配置し、計算が完了したら結果を分散ファイルシステムに書き込むものである。なお、「計算ノード」とは、実際にタスクを実行する計算機を指す。 In a distributed processing framework, there is data analysis infrastructure software based on a distributed storage device that divides and stores files on a PC cluster. As shown in FIG. 1, this is based on a program that performs parallel computation on input data that are not dependent on each other by describing an operation flow in the form of a directed graph. In FIG. 1, an edge is a data flow, which means processing contents performed by a vertex. Processing executed on the top is called a task. At the time of execution, the distributed file system is accessed, the input file is divided into processing units, initialization processing is performed, and the program describing the processing contents is placed on the calculation node close to the data node where the divided file is stored When the calculation is completed, the result is written into the distributed file system. The “calculation node” refers to a computer that actually executes a task.

このような環境において、１つ前の頂点の処理結果を分割し、入力データとして処理する方法として、MapReduceを拡張し、ループ処理に特化させたシステムがある（例えば、非特許文献１、２参照）、Map処理、Reduceで記述されたプログラムを分散並列処理するシステム（例えば、特許文献1参照）や、非循環有向グラフ処理で記述されたプログラムを分散並列処理するシステム（特許文献２参照）がある。 In such an environment, there is a system that extends MapReduce and specializes in loop processing as a method of dividing the processing result of the previous vertex and processing it as input data (for example, Non-Patent Documents 1 and 2). System) that performs distributed parallel processing of programs described in Map processing and Reduce (for example, see Patent Document 1), and system that performs distributed parallel processing of programs described in acyclic directed graph processing (see Patent Document 2). is there.

以下に従来の技術における分散並列処理キャッシュ処理を説明する。 The distributed parallel processing cache processing in the prior art will be described below.

図２は、従来技術の概略フローチャートである。 FIG. 2 is a schematic flowchart of the prior art.

資源管理ノードがキャッシュを考慮しながら計算ノードにタスクを割り当て（ステップ２１０）、計算ノードが割り当てられたタスクを実行する（ステップ２２０）。タスクの実行結果を計算ノードに一時ファイルとして保存し（ステップ２３０）、ジョブが終了していない場合はステップ２１０に戻る。ジョブが終了したら計算ノードの一時ファイルを削除する。なお、ここで、「ジョブ」とは、動作フローを記述したプログラムが行う一連の処理を指し、各頂点で実行される処理と、データの流れを表現するデータフローで構成される。 The resource management node assigns a task to the computation node while considering the cache (step 210), and executes the task to which the computation node is assigned (step 220). The task execution result is stored as a temporary file in the computation node (step 230). If the job has not ended, the process returns to step 210. When the job is finished, delete the temporary file on the compute node. Here, “job” refers to a series of processes performed by a program describing an operation flow, and includes a process executed at each vertex and a data flow expressing a data flow.

上記の計算ノードにタスクを割り当てるステップ２１０の処理を図３に示す。 FIG. 3 shows the process of step 210 for assigning a task to the calculation node.

特許文献１の技術に基づいて資源管理ノードが次に実行すべきタスクを決定する（ステップ４１０）。次に、資源管理ノードが次に実行ジョブ中における再利用するデータの宣言を確認する（ステップ４２０）。ステップ４１０で選択したタスクの入力は再利用可能かを判定し、再利用可能であり、１回目の実行である場合は、計算結果を保持するノードで処理が終了したと見做し、計算をスキップし（ステップ４４０）、ステップ４１０の処理に戻る。一方、再利用ができない、または、２回目以降の実行である場合は、特許文献１の技術に基づいて実行する計算ノードを決定する（ステップ４６０）。 Based on the technique of Patent Document 1, the resource management node determines a task to be executed next (step 410). Next, the resource management node confirms the declaration of data to be reused in the next execution job (step 420). It is determined whether or not the input of the task selected in step 410 is reusable. If it is the first execution, it is assumed that the processing is completed at the node holding the calculation result, and the calculation is performed. Skip (step 440), the process returns to step 410. On the other hand, if it cannot be reused, or if it is the second or subsequent execution, a calculation node to be executed is determined based on the technique of Patent Document 1 (step 460).

上記のステップ２３０では、図４に示すように、完了したタスクの計算結果を計算ノードの外部記憶装置（ローカルディスク）に保存する（ステップ６１０）。 In step 230, as shown in FIG. 4, the calculation result of the completed task is stored in the external storage device (local disk) of the calculation node (step 610).

最後に、ステップ２４０では、図５に示すように、完了したタスクの計算結果を計算ノードの外部記憶装置（ローカルディスク）に保存する（ステップ６１０）。 Finally, in step 240, as shown in FIG. 5, the calculation result of the completed task is stored in the external storage device (local disk) of the calculation node (step 610).

United States Patent 7,650,331, System and method for efficient large-scale data processing (Map Reduce) Google, January 19, 2010.United States Patent 7,650,331, System and method for efficient large-scale data processing (Map Reduce) Google, January 19, 2010. USPTO Applicaton #20080082644, Distributed Parallel Computing（Dryad)， Microsoft Corporation，September 29, 2006.USPTO Applicaton # 20080082644, Distributed Parallel Computing (Dryad), Microsoft Corporation, September 29, 2006.

HaLoop: Efficient Iterative Data Processing on Large Clusters, Yingyi Bu, Bill Howe, Magdalena Balazinska, Michael D. Ernst. In VLDB'10: The 36the International Conference on Very Large Data Bases, Singapore, 24-30, September, 2010.HaLoop: Efficient Iterative Data Processing on Large Clusters, Yingyi Bu, Bill Howe, Magdalena Balazinska, Michael D. Ernst.In VLDB'10: The 36the International Conference on Very Large Data Bases, Singapore, 24-30, September, 2010. Twister: A Runtime for Iterative MapReduce Jaliya Ekanayake, Hui Li, Bingjing Zhang, Thilina Gunarathne, Seung-Hee Bae, Judy Qiu, Geoffrey Fox, Twister: A Runtime for Iterative MapReduce. The first International Workshop on MapReduce and its Applications (MAPREDUCE'10) - HPDC2010.Twister: A Runtime for Iterative MapReduce Jaliya Ekanayake, Hui Li, Bingjing Zhang, Thilina Gunarathne, Seung-Hee Bae, Judy Qiu, Geoffrey Fox, Twister: A Runtime for Iterative MapReduce.The first International Workshop on MapReduce and its Applications (MAPREDUCE ' 10)-HPDC2010.

しかしながら、上記特許文献１，２、非特許文献１，２のシステムのよい箇所を組み合わせたシステムを構築した場合でも、以下のような問題がある。 However, even when a system in which the good parts of the systems of Patent Documents 1 and 2 and Non-Patent Documents 1 and 2 are combined is constructed, there are the following problems.

（１）ジョブ内で過去に行われた計算結果を再利用するためには、再利用するデータをユーザが宣言する必要がある。 (1) In order to reuse a calculation result performed in the past in a job, the user needs to declare data to be reused.

（２）ジョブが複数回実行される場合、過去に行ったことがあるタスクであっても再度同じ計算、転送を行う必要がある。詳しくは、
・過去に行ったことがある計算が実行中のジョブに含まれていたとしても、複数ジョブを実行する際には再度同じ計算を行う必要がある。 (2) When a job is executed a plurality of times, it is necessary to perform the same calculation and transfer again even for tasks that have been performed in the past. For more information,
Even if a calculation that has been performed in the past is included in the job being executed, it is necessary to perform the same calculation again when executing multiple jobs.

・過去に行ったことがある計算が実行中のジョブに含まれていたとしても、計算ノード間における転送が発生してしまう。 Even if a calculation that has been performed in the past is included in the job being executed, transfer between the calculation nodes occurs.

本発明は、上記の点に鑑みなされたもので、ジョブ内に複数回実行されるタスクが含まれている場合、ジョブを高速に処理すると共に、過去に行ったことがある処理を高速化することが可能な分散並列処理キャッシュ装置及び方法及びプログラムを提供することを目的とする。 The present invention has been made in view of the above points. When a task to be executed a plurality of times is included in a job, the job is processed at a high speed, and the processing that has been performed in the past is accelerated. It is an object of the present invention to provide a distributed parallel processing cache device, method, and program that can be used.

上記の課題を解決するため、本発明（請求項１）は、分散並列実行フレームワークにおいて実行するプログラムのフローが循環有向グラフとして表現されている場合に、タスクを割り当てる資源管理ノード及びタスクを割り当てられる計算ノードから構成される並列キャッシュシステムであって、
前記資源管理ノードは、
タスクの入力データを有する計算ノードに対して入力データのハッシュ値の計算を要求し、実行ファイルと共に取得するハッシュ値取得手段と、
タスクを実行した計算ノードから取得した入力データのハッシュ値、実行ファイル、入力ファイル取得し、該計算ノードと関連付けて計算結果記憶手段に格納するタスク管理手段と、
前記入力データのハッシュ値と前記実行ファイルを検索キーとして、前記計算結果記憶手段を検索し、該キーに該当する計算ノードのデータがある場合には、計算をスキップし、ない場合にはタスクを割り当てる計算ノードを決定するタスク割当手段と、
を有し、
前記計算ノードは、
前記資源管理ノードから要求された前記入力データのハッシュ値を求め、該資源管理ノードに返却するハッシュ値計算手段と、
前記資源管理ノードにより割り当てられたタスクを実行し、その結果を実行ファイル、入力データと関連付けて計算結果一時記憶手段に格納すると共に、該資源管理ノードに送信するタスク実行手段と、を有する。 In order to solve the above-described problem, the present invention (Claim 1) can allocate a resource management node and a task to which a task is allocated when a flow of a program executed in the distributed parallel execution framework is expressed as a circular directed graph. A parallel cache system composed of compute nodes,
The resource management node is:
A hash value acquisition means for requesting calculation of the hash value of the input data to the calculation node having the input data of the task, and acquiring it together with the execution file;
A task management unit that obtains a hash value of an input data acquired from a calculation node that executed the task, an execution file, an input file, and stores the hash value in association with the calculation node in a calculation result storage unit;
The calculation result storage means is searched using the hash value of the input data and the execution file as a search key. If there is data of a calculation node corresponding to the key, the calculation is skipped, and if not, the task is A task assignment means for determining a calculation node to be assigned;
Have
The compute node is
A hash value calculation means for obtaining a hash value of the input data requested from the resource management node and returning the hash value to the resource management node;
Task execution means for executing a task assigned by the resource management node, storing the result in an execution file and input data in association with the input data, and transmitting the result to the resource management node.

また、本発明（請求項２）は、前記資源管理ノードにおいて、
ジョブが終了すると、前記検索キーに基づいて前記計算結果記憶手段から計算結果を保持している計算ノードを抽出し、該計算ノードと同じ物理計算機上で動作しているデータノードに対して、該計算結果を送信し、格納する手段を更に有する。 The present invention (Claim 2) is the resource management node,
When the job is completed, a calculation node holding the calculation result is extracted from the calculation result storage unit based on the search key, and the data node operating on the same physical computer as the calculation node is extracted from the calculation node. A means for transmitting and storing the calculation result is further included.

上記のように、本発明によれば、入力データを保持する計算ノードにおいて実行されたタスクの実行結果を資源管理ノードで記録しておくことにより、
（１）ジョブ内に複数回実行されるタスクが含まれている場合、ジョブの高速化を自動的に行うことができる。 As described above, according to the present invention, by recording the execution result of the task executed in the calculation node holding the input data in the resource management node,
(1) When a task that is executed a plurality of times is included in a job, the job can be automatically accelerated.

（２）ジョブが複数回実行される場合において、過去に行ったことのある処理を高速化できる。詳しくは、過去に行ったことのある計算が実行中のジョブに含まれていた場合、その結果を入力ブロック単位で再利用し計算をスキップすることができる。 (2) When a job is executed a plurality of times, processing that has been performed in the past can be accelerated. Specifically, when a calculation that has been performed in the past is included in the job being executed, the calculation can be skipped by reusing the result in units of input blocks.

有向グラフ形式の動作フローである。It is an operation | movement flow of a directed graph format. 従来技術の概略フローチャートである。It is a schematic flowchart of a prior art. 図２のＳ２１０を詳細化したフローチャートである。It is the flowchart which detailed S210 of FIG. 図２のＳ２３０を詳細化したフローチャートである。It is the flowchart which detailed S230 of FIG. 図２のＳ２４０を詳細化したフローチャートである。It is the flowchart which detailed S240 of FIG. 本発明の一実施の形態におけるシステム構成図である。1 is a system configuration diagram according to an embodiment of the present invention. 本発明の一実施の形態における全体動作のフローチャートである。It is a flowchart of the whole operation | movement in one embodiment of this invention. 本発明の一実施の形態における図７のＳ３１０を詳細化したフローチャートである。It is the flowchart which detailed S310 of FIG. 7 in one embodiment of this invention. 本発明の一実施の形態における図７のＳ３３０を詳細化したフローチャートである。It is the flowchart which detailed S330 of FIG. 7 in one embodiment of this invention. 本発明の一実施の形態における図７のＳ３４０を詳細化したフローチャートである。It is the flowchart which detailed S340 of FIG. 7 in one embodiment of this invention. 本発明の一実施の形態における全体動作のフローチャートである。It is a flowchart of the whole operation | movement in one embodiment of this invention.

以下図面と共に、本発明の実施の形態を説明する。 Embodiments of the present invention will be described below with reference to the drawings.

図６は、本発明の一実施の形態におけるシステム構成を示す。 FIG. 6 shows a system configuration in an embodiment of the present invention.

同図に示すシステムは、資源管理ノード１０と計算ノード２１、分散ファイルシステムのノードであり、データを保存するデータノード２２を有する複数の物理計算機２０から構成される。 The system shown in FIG. 1 is a resource management node 10, a calculation node 21, and a distributed file system node, and includes a plurality of physical computers 20 having data nodes 22 for storing data.

資源管理ノード１０は、どの計算ノードにどのタスクを割り当てるかを管理する計算機であり、タスク管理部１１、検索部１２、ノード管理ＤＢ１３を有し、外部記憶として計算結果記憶部１（ローカルディスク）が接続されている。ノード管理ＤＢ１３は、物理計算機毎に、計算ノード、データノード、及び、タスク用の入力データを関連付けて保持している。また、計算結果記憶部１は、計算ノード毎にハッシュ値及び実行ファイルが格納される。以下では外部記憶装置として説明するが、資源管理ノード内のディスク装置やメモリとして構築されていてもよい。 The resource management node 10 is a computer that manages which task is assigned to which calculation node, and includes a task management unit 11, a search unit 12, and a node management DB 13, and a calculation result storage unit 1 (local disk) as an external storage. Is connected. The node management DB 13 holds a calculation node, a data node, and task input data in association with each physical computer. The calculation result storage unit 1 stores a hash value and an execution file for each calculation node. Although described below as an external storage device, it may be constructed as a disk device or memory in the resource management node.

物理計算機２０の計算ノード２１は、ハッシュ計算部２１１を有し、計算結果一時記憶部２を有するローカルディスクが接続されている。データノード２２はデータ記憶部２２１を有する。 The calculation node 21 of the physical computer 20 has a hash calculation unit 211 and is connected to a local disk having a calculation result temporary storage unit 2. The data node 22 has a data storage unit 221.

上記の構成のシステムの処理について説明する。 Processing of the system having the above configuration will be described.

図７は、本発明の一実施の形態における全体動作のフローチャートである。 FIG. 7 is a flowchart of the overall operation in one embodiment of the present invention.

ステップ３１０）資源管理ノード１０のタスク管理部１１は、キャッシュを考慮しながら計算ノード２１にタスクを割り当てる。詳細については図８で後述する。 Step 310) The task management unit 11 of the resource management node 10 assigns a task to the computation node 21 while considering the cache. Details will be described later with reference to FIG.

ステップ３２０）タスクを割り当てられた計算ノード２１は、資源管理ノード１０の指示によりタスクを実行する。 Step 320) The computing node 21 to which the task is assigned executes the task in accordance with an instruction from the resource management node 10.

ステップ３３０）計算ノード２１は、ステップ３２０の結果を実行ファイル、入力データと関連付けて一時ファイルとして計算結果一時記憶部２（ローカルディスク）に保存すると共に、資源管理ノード１０に計算結果を送信し、ジョブが終了すればステップ３４０に移行し、終了していなければステップ３１０に戻る。詳細については図９にて後述する。 Step 330) The calculation node 21 associates the result of step 320 with the execution file and the input data, saves it as a temporary file in the calculation result temporary storage unit 2 (local disk), and transmits the calculation result to the resource management node 10. If the job ends, the process proceeds to step 340, and if not completed, the process returns to step 310. Details will be described later with reference to FIG.

ステップ３４０）ジョブが終了したら、資源管理ノード１０のタスク管理部１１は、計算結果を再利用できるように計算結果記憶部１に保存し、永続化する。詳細は図１０にて後述する。 Step 340) When the job is finished, the task management unit 11 of the resource management node 10 saves the calculation result in the calculation result storage unit 1 so that the calculation result can be reused and makes it permanent. Details will be described later with reference to FIG.

次に、上記のステップ３１０について説明する。 Next, step 310 will be described.

図８は、本発明の一実施の形態における図７のＳ３１０の詳細化のフローチャートである。 FIG. 8 is a detailed flowchart of S310 in FIG. 7 according to the embodiment of the present invention.

ステップ５１０）資源管理ノード１０のタスク管理部１１は、特許文献１の技術に基づいて、次に実行すべきタスクを決定する。 Step 510) The task management unit 11 of the resource management node 10 determines a task to be executed next based on the technique of Patent Document 1.

ステップ５２０）資源管理ノード１０のタスク管理部１１は、ステップ５１０で選択したタスクの入力データを保持している計算ノード２１をノード管理ＤＢ１３から検索し、当該計算ノード２１に対して入力データのハッシュ値を計算するように要求する。これにより計算ノード２１は、入力データのハッシュ値を計算してハッシュ値と実行ファイルを資源管理ノード１０に送信する。 Step 520) The task management unit 11 of the resource management node 10 searches the node management DB 13 for the computation node 21 holding the input data of the task selected in Step 510, and hashes the input data to the computation node 21. Requests that a value be calculated. Thereby, the calculation node 21 calculates a hash value of the input data and transmits the hash value and the execution file to the resource management node 10.

ステップ５３０）資源管理ノード１０のノード検索部１２は、ステップ５２０において、計算ノード２１に要求して取得した、入力データのハッシュ値と実行ファイルをキーとして、計算結果記憶部１から過去の計算結果の有無と当該計算結果の保存場所を検索する。計算結果記憶部１に入力データのハッシュ値に該当する計算結果が存在する場合には、すでに計算が実行されているものとしてステップ５４０に移行し、存在しない場合はまだ計算が実行されていないものとしてステップ５５０に移行する。 Step 530) The node search unit 12 of the resource management node 10 sends a past calculation result from the calculation result storage unit 1 using the hash value of the input data and the execution file obtained by requesting the calculation node 21 in Step 520 as keys. Search the existence of and the storage location of the calculation result. If there is a calculation result corresponding to the hash value of the input data in the calculation result storage unit 1, it is assumed that the calculation has already been performed, and the process proceeds to step 540, and if it does not exist, the calculation has not been executed yet As shown in FIG.

ステップ５４０）タスク管理部１１は、計算結果を保持する計算ノードで処理が終了したと見做し、計算をスキップし、ステップ５１０に移行する。 Step 540) The task management unit 11 considers the processing to be completed at the calculation node holding the calculation result, skips the calculation, and proceeds to Step 510.

ステップ５５０）タスク管理部１１は、特許文献１の技術に基づいて、タスクを実行する計算ノードを決定する。 Step 550) The task management unit 11 determines a calculation node that executes a task based on the technique of Patent Document 1.

次に、図７のタスク終了時の処理（ステップ３３０）について説明する。 Next, the process at the time of task termination (step 330) in FIG. 7 will be described.

図９は、本発明の一実施の形態における図７のＳ３３０の詳細化のフローチャートである。 FIG. 9 is a detailed flowchart of S330 in FIG. 7 according to the embodiment of the present invention.

ステップ７１０）計算ノード２１は、実行完了したタスクの計算結果（入力データのハッシュ値、実行ファイル、実行結果）を自ノードに接続された計算結果一時記憶部２に保存すると共に、資源管理ノード１０に送信する。 Step 710) The calculation node 21 stores the calculation result (the hash value of the input data, the execution file, and the execution result) of the task that has been executed in the calculation result temporary storage unit 2 connected to the own node, and also the resource management node 10 Send to.

ステップ７２０）資源管理ノード１０のタスク管理部１１は、計算ノードからキー（入力データのハッシュ値と実行ファイル）と実行結果を取得し、計算ノードと関連付けて計算結果記憶部１に格納する。 Step 720) The task management unit 11 of the resource management node 10 acquires the key (the hash value of the input data and the execution file) and the execution result from the calculation node, and stores them in the calculation result storage unit 1 in association with the calculation node.

次に、図７のジョブ終了時の処理（ステップ３４０）について説明する。 Next, the job end processing (step 340) in FIG. 7 will be described.

図１０は、本発明の一実施の形態における図７のＳ３４０のフローチャートである。 FIG. 10 is a flowchart of S340 in FIG. 7 according to the embodiment of the present invention.

ステップ９１０）ジョブが終了すると、資源管理ノード１０のノード検索部１２は、検索キー（入力データのハッシュ値と実行ファイル）に基づいて計算結果記憶部１を検索し、当該キーに対応する計算結果を保持している計算ノードを検索する。 Step 910) When the job is completed, the node search unit 12 of the resource management node 10 searches the calculation result storage unit 1 based on the search key (the hash value of the input data and the execution file), and the calculation result corresponding to the key Search for compute nodes that hold

ステップ９２０）ノード検索部１２は、ステップ９１０で検索された計算ノードと同じ物理計算機上で動作しているデータノードをノード管理ＤＢ１３から検索する。タスク管理部１１は、当該データノード２２に計算結果を送信する。これにより物理計算機２０のデータノード２２は当該計算結果をデータ記憶部２２１に格納する。 Step 920) The node search unit 12 searches the node management DB 13 for a data node operating on the same physical computer as the calculation node searched in Step 910. The task management unit 11 transmits the calculation result to the data node 22. As a result, the data node 22 of the physical computer 20 stores the calculation result in the data storage unit 221.

図１１は、本発明の一実施の形態における全体動作のシーケンスチャートである。同図におけるステップ番号は前述のフローチャートのステップ番号に対応する。 FIG. 11 is a sequence chart of the overall operation in one embodiment of the present invention. The step numbers in the figure correspond to the step numbers in the flowchart described above.

上記のような処理を行うことにより、入力データが時系列に従って増えていくような処理を含む計算を高速に行うことができる。 By performing the processing as described above, calculation including processing in which input data increases in time series can be performed at high speed.

これにより、
（１）検索エンジンの更新が高速に行えるようになる。 This
(1) The search engine can be updated at high speed.

（２）ユーザへのレコメンデーションが高速に更新できるようになる。 (2) Recommendations to users can be updated at high speed.

（３）ログ解析を高速に行うことができる。 (3) Log analysis can be performed at high speed.

なお、上記の資源管理ノード及び計算ノードの動作をプログラムとして構築し、資源管理ノード、計算ノードとして利用されるコンピュータにインストールして実行させる、または、ネットワークを介して流通させることが可能である。 Note that the operations of the resource management node and the calculation node can be constructed as a program, installed in a computer used as the resource management node and the calculation node, executed, or distributed via a network.

本発明は、上記の実施の形態に限定されることなく特許請求の範囲内において種々変更・応用が可能である。 The present invention is not limited to the above-described embodiments, and various modifications and applications can be made within the scope of the claims.

１計算結果記憶部
２計算結果一時記憶部
１０資源管理ノード
１１タスク管理部
１２検索部
１３ノード管理ＤＢ
２０物理計算機
２１計算ノード
２２データノード
２１１ハッシュ計算部
２２１データ記憶部 DESCRIPTION OF SYMBOLS 1 Calculation result memory | storage part 2 Calculation result temporary storage part 10 Resource management node 11 Task management part 12 Search part 13 Node management DB
20 physical computer 21 calculation node 22 data node 211 hash calculation unit 221 data storage unit

Claims

When the flow of a program executed in the distributed parallel execution framework is expressed as a circular directed graph, a parallel cache system including a resource management node to which a task is assigned and a computation node to which the task is assigned,
The resource management node is:
A hash value acquisition means for requesting calculation of the hash value of the input data to the calculation node having the input data of the task, and acquiring it together with the execution file;
A task management unit that obtains a hash value of an input data acquired from a calculation node that executed the task, an execution file, an input file, and stores the hash value in association with the calculation node in a calculation result storage unit;
The calculation result storage means is searched using the hash value of the input data and the execution file as a search key. If there is data of a calculation node corresponding to the key, the calculation is skipped, and if not, the task is A task assignment means for determining a calculation node to be assigned;
Have
The compute node is
A hash value calculation means for obtaining a hash value of the input data requested from the resource management node and returning the hash value to the resource management node;
A task execution unit that executes the task assigned by the resource management node, stores the result in an execution file and input data in association with the input data, and transmits the result to the resource management node; A parallel cache system.

The resource management node is:
When the job is completed, a calculation node holding the calculation result is extracted from the calculation result storage unit based on the search key, and the data node operating on the same physical computer as the calculation node is extracted from the calculation node. The parallel cache system according to claim 1, further comprising means for transmitting and storing the calculation result.

When the flow of a program executed in the distributed parallel execution framework is expressed as a circular directed graph, a parallel cache method comprising a resource management node to which a task is assigned and a computation node to which the task is assigned,
A hash value acquisition unit of the resource management node requests a hash value of the input data from a calculation node holding the input data, and acquires the hash value together with the execution file;
A task allocation step in which the task allocation means of the resource management node allocates a task to a computation node holding input data;
The task execution means of the calculation node executes the task assigned from the resource management node, and stores the result in the calculation result temporary storage means in association with the execution file and input data;
An execution result management step in which the execution result management unit of the resource management node acquires an execution file and input data as a task execution result from the calculation node, and stores it in the calculation result storage unit in association with the calculation node;
And
In the task assignment step,
When assigning a task to the calculation node, the calculation result storage unit is searched using the hash value of the input data and the execution file as a key, and when the calculation result of the task by the calculation node exists A parallel cache method characterized in that the processing of the task is skipped.

In the resource management node,
When the job is completed, a calculation node holding the calculation result is extracted from the calculation result storage unit based on the key, and the calculation is performed on the data node operating on the same physical computer as the calculation node. 4. The parallel cache method according to claim 3, wherein the result is transmitted and stored.

A resource management node that assigns a task to a computation node when the flow of a program executed in the distributed parallel execution framework is expressed as a circular directed graph,
Task management means for acquiring a hash value of input data obtained from a calculation node having task input data, an execution file, input data, and storing the calculation data storage means in association with the calculation node;
A hash value acquisition unit that requests the calculation node to calculate a hash value of input data and acquires a hash value and an executable file;
A calculation node that searches the calculation result storage means using the hash value of the input data and the execution file as a search key, skips the calculation when there is data corresponding to the key, and allocates a task when there is no data Task assignment means for determining
A resource management node characterized by comprising:

Computer
The parallel cache program for functioning as each means of the resource management node according to claim 5.