JP4846736B2 - Parallel processing support device

Info

Publication number: JP4846736B2
Application number: JP2007550972A
Authority: JP
Inventors: 均上原; 英治佐々木; 義一笹井
Original assignee: Japan Agency for Marine Earth Science and Technology
Current assignee: Japan Agency for Marine Earth Science and Technology
Priority date: 2005-12-22
Filing date: 2005-12-22
Publication date: 2011-12-28
Anticipated expiration: 2025-12-22
Also published as: JPWO2007072567A1; WO2007072567A1

Description

The present invention relates to a technique, in the field of computer science, for processing a large number of data files such as large-scale simulation data.

The result of a simulation performed in a large-scale simulation system, such as an ocean simulation system, consists of a large number of data files. It is not uncommon for these data files to total tens of terabytes or more in size and to number 10,000 or more. In general, such data files are not all created in the same format and often have slightly different formats depending on the contents of the data.

In numerical simulation, the intersections of line segments as shown in FIG. 16A are generally defined as a calculation grid, and processing of numerical data (for example, calculation of physical quantities) proceeds based on this calculation grid. However, owing to the physical characteristics of the numerical data, the convenience of the calculation formulas, and other circumstances, processing (calculation of physical quantities) of some numerical data is often performed based on a different calculation grid as shown in FIG. 16B.

Assume that there exist a first data file calculated based on the calculation grid shown in FIG. 16A (referred to as the "first calculation grid") and a second data file calculated based on the calculation grid shown in FIG. 16B (referred to as the "second calculation grid"). Assume further that data lying in a region (cutout range) from, for example, 120.1 degrees to 121.6 degrees east longitude is cut out from the first and second data files.

In this case, data corresponding to two calculation grids is cut out from the first data file, and data corresponding to four calculation grids is cut out from the second data file. However, when it is desired that data corresponding to four calculation grids also be cut out from the first data file, the data cut out according to the above cutout range (cutout data) is insufficient.
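The mismatch described above can be reproduced with a small sketch. The grid coordinates below are hypothetical values chosen only so that the same cutout range yields two points on one grid and four on the other; they are not taken from FIG. 16A or FIG. 16B. Longitudes are held as integers in tenths of a degree to avoid floating-point rounding.

```python
# Hypothetical longitudes of grid intersections, in tenths of a degree east.
FIRST_GRID = [1200, 1206, 1212, 1218]
SECOND_GRID = [1198, 1202, 1206, 1210, 1214, 1218]


def cut_out(grid, west, east):
    """Return the grid points lying inside the closed range [west, east]."""
    return [lon for lon in grid if west <= lon <= east]


# Cutout range 120.1 to 121.6 degrees east, expressed in tenths of a degree.
first = cut_out(FIRST_GRID, 1201, 1216)
second = cut_out(SECOND_GRID, 1201, 1216)
print(len(first), len(second))  # 2 4
```

Without knowing which grid each file was computed on, a caller cannot tell that the first cutout is narrower than intended, which is why the grid information has to travel with the data as metadata.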

In this specification, data indicating the details of the data in a data file, such as which calculation grid the data in the first data file described above was calculated on, is called "metadata".

Conventionally, metadata has been stored separately from the simulation data, or the user has specified and entered the corresponding metadata for each simulation data file to be processed.

When simulation data generated as a result of a large-scale simulation is to be processed further after the simulation, requiring the user to specify and enter metadata for each of the large number of data files imposes a great deal of labor on the user. There is also a risk that the user will specify the metadata incorrectly.

Parallel processing (parallel computation) is effective for processing a large amount of data efficiently, but a control program (script) that causes the parallel computer group to execute the parallel computation must be prepared. Conventionally, such scripts have been written by the user. The user is therefore required to know how to write scripts for parallel computation, which makes the system harder to use and places a burden on the user. There is also a risk that the parallel computation will not be executed properly because of a mistake in the script.

An object of the present invention is to provide a technique that makes it possible to process a large number of data files easily.

To achieve the above object, the present invention employs the following means.

That is, the present invention is a parallel processing support device including:
receiving means for receiving parallel processing designation information that includes designations of a processing target data file group, the number of calculation nodes, among a plurality of calculation nodes in a parallel computer group, that are to perform parallel processing on the processing target data file group, and the processing contents for the processing target data file group;
storage means storing the usage and load status of each of the plurality of calculation nodes included in the parallel computer group;
determining means for determining, based on the designated number of calculation nodes and the usage and load status, the designated number of calculation nodes that are to perform the parallel processing and the placement, on those calculation nodes, of each processing target data file constituting the processing target data file group;
control program generating means for automatically generating a parallel processing job script that includes data placement statements for placing each processing target data file on the designated number of calculation nodes according to the placement determined by the determining means, and statements instructing each calculation node on which a processing target data file has been placed to execute the parallel processing of the processing target data file group; and
file generating means for automatically generating a setting file for the parallel processing, which is referred to when each calculation node on which a processing target data file has been placed processes the processing target data file placed on it, and which includes, for each processing target data file, the file identifier of the processing target data file, the identifier of the calculation node on which the processing target data file is placed, and a description of the designated processing contents.

Preferably, in the present invention, for each processing target data file constituting the processing target data file group, the determining means selects, from the designated number of calculation nodes included in the parallel computer group, the calculation nodes having a storage capacity capable of storing the processing target data file and the processing result file for it, and determines, among the selected calculation nodes, the calculation node with the smallest current processing load as the calculation node on which the processing target data file is to be placed.
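The two-step selection described above (filter by storage capacity, then take the smallest load) can be sketched as follows. The record layout, the field names, and the assumption that the size of the processing result file is known in advance are illustrative only; the specification does not prescribe them.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    # Hypothetical record for one calculation node's usage and load status.
    node_id: str
    free_bytes: int  # storage capacity still available on the node
    load: float      # current processing load (smaller is lighter)


def choose_node(nodes, input_bytes, result_bytes):
    """Pick the least-loaded node able to hold the input file and its result file."""
    needed = input_bytes + result_bytes
    candidates = [n for n in nodes if n.free_bytes >= needed]
    if not candidates:
        return None  # no node has enough capacity
    return min(candidates, key=lambda n: n.load)


nodes = [
    NodeStatus("#0", free_bytes=500, load=0.1),
    NodeStatus("#1", free_bytes=2000, load=0.7),
    NodeStatus("#2", free_bytes=3000, load=0.4),
]
chosen = choose_node(nodes, input_bytes=800, result_bytes=400)
print(chosen.node_id)  # #2
```

Node #0 has the lightest load but is excluded because it cannot store both files, so the choice falls to the lighter of the two remaining nodes.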

Also preferably, in the present invention, the parallel processing designation information includes a designation of the storage location of a processing result file generated as a result of processing a processing target data file, and the control program generating means generates the control program including a statement for transferring the processing result file to the storage location.

Also preferably, the present invention may be configured as follows. The device includes:
receiving means for receiving parallel processing designation information that includes designations of a processing target data file group, the number of calculation nodes, among a plurality of calculation nodes in a parallel computer group, that are to perform parallel processing on the processing target data file group, and the processing contents for the processing target data file group;
storage means storing the usage and load status of each of the plurality of calculation nodes included in the parallel computer group;
determining means for determining, based on the designated number of calculation nodes and the usage and load status, the placement, on the designated number of calculation nodes, of each processing target data file constituting the processing target data file group;
control program generating means for automatically generating a control program that includes data placement statements for placing each processing target data file on the designated number of calculation nodes according to the placement determined by the determining means, and statements instructing each calculation node on which a processing target data file has been placed to execute the parallel processing of the processing target data file group; and
file generating means for automatically generating a setting file for the parallel processing, which is referred to when each calculation node on which a processing target data file has been placed processes the processing target data file placed on it, and which includes, for each processing target data file, the file identifier of the processing target data file, the identifier of the calculation node on which the processing target data file is placed, and a description of the designated processing contents.
The processing target data files are designated by designating file identifiers selected from a list displaying a plurality of file identifiers of data files, each file identifier including the file path of a data file stored in one of a plurality of directories constituting a directory structure.
The device further includes:
metadata storage means storing metadata of the processing target data;
a keyword list having a group of keywords associated with the metadata;
extracting means for extracting, from the file path portion of a designated file identifier, a keyword for searching for the metadata corresponding to the data file identified by that file identifier, by comparing character strings forming parts of the file path portion with the keyword list and extracting, as a keyword, a character string that matches at least one keyword in the keyword list;
search means for retrieving the metadata corresponding to the extracted keyword from the metadata storage means; and
judging means for judging, for each processing target data file and based on the metadata retrieved by the search means, whether there is a related data file related to that processing target data file.
When the judging means detects a processing target data file having a related data file, the determining means places the processing target data file and its related data file on the same calculation node, the control program generating means generates the control program for the processing target data file group that includes the related data file as one of the processing target data files, and the file generating means generates the setting file for the related data file.
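The co-location rule at the end of this configuration can be sketched as below. The file identifiers, the `related` mapping (standing in for what the judging means derives from metadata), and the round-robin node assignment are all hypothetical; only the rule that a related data file follows its processing target data file to the same node comes from the text.

```python
from itertools import cycle


def place_with_related(targets, related, assign_node):
    """Return {file_identifier: node_id}, keeping related files on their target's node."""
    placement = {}
    for target in targets:
        # Reuse an earlier decision if this file was already placed as a related file.
        node = placement.get(target) or assign_node(target)
        placement[target] = node
        for extra in related.get(target, []):
            # The related file joins the processing target group on the same node.
            placement.setdefault(extra, node)
    return placement


targets = ["/data/expA/u.dat", "/data/expA/t.dat"]           # hypothetical identifiers
related = {"/data/expA/u.dat": ["/data/expA/grid_u.dat"]}    # hypothetical relation
node_ids = cycle(["#0", "#1"])                               # stand-in for the real decision
placement = place_with_related(targets, related, lambda _f: next(node_ids))
print(placement)
```

In this sketch the related file `grid_u.dat` ends up on node #0 together with `u.dat`, while `t.dat` is assigned independently.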

The present invention can also be specified as a parallel processing support method, a program, and a recording medium on which the program is recorded, each having the same features as the parallel processing support device described above.

According to the present invention, a large number of data files can be processed easily. Furthermore, according to the present invention, the user does not need to specify metadata for the processing target data.

A diagram showing a configuration example of a simulation system to which the present invention can be applied.
A diagram showing a configuration example of the control computer shown in FIG. 1.
A diagram showing a configuration example of the node shown in FIG. 1.
A diagram showing an example of the directory structure of the file database, shown in FIG. 2, that stores the processing target data files.
A diagram showing an example of the data structure of the metadata table shown in FIG. 2.
A diagram showing an example of the data structure of the usage and load distribution status table shown in FIG. 2.
A diagram showing a display example of the user interface (designation screen) provided to the user of the system.
A diagram showing a description example of the file of parallel processing designation information entered using the user interface.
A flowchart showing the main routine of the process of creating the parallel processing job script and the setting file for the parallel processing program.
A flowchart showing the main routine of the process of creating the parallel processing job script and the setting file for the parallel processing program.
A flowchart showing the main routine of the process of creating the parallel processing job script and the setting file for the parallel processing program.
A flowchart showing the subroutine for analyzing and acquiring metadata.
A flowchart showing the subroutine for searching for and determining the node on which a processing target data file is to be placed.
A diagram showing a description example of the setting file for the parallel processing program.
A flowchart showing the execution of the parallel processing program.
A diagram showing an example of a calculation grid prepared as metadata for processing target data.
A diagram showing an example of a calculation grid, different from the calculation grid of FIG. 16A, prepared as metadata for processing target data.

Explanation of symbols

X ... parallel computer group
Y ... control computer
1, 11 ... CPU
2, 12 ... main memory
3, 14 ... external storage device
7 ... input device
8 ... display device
6, 15 ... communication interface
31 ... file database
32 ... metadata table
33 ... usage and load distribution information table

Embodiments of the present invention are described below with reference to the drawings. The configurations in the embodiments are examples, and the present invention is not limited to the configurations of the embodiments.

[Simulation system]
FIG. 1 is a diagram showing a configuration example of a simulation system to which the present invention can be applied. In the example shown in FIG. 1, the simulation system consists of a parallel computer group X and a control computer (information processing device) Y connected to the parallel computer group X via a communication line (network).

The parallel computer group X consists of a plurality of calculation nodes (nodes) #0 to #n (n is a natural number) that perform parallel processing on the many data files constituting large-scale simulation data such as that of an ocean general circulation model.

The computer Y manages the simulation data to be processed by the parallel computer group X (processing target data) and, in response to user operations, controls the parallel computer group X when it executes parallel processing using the simulation data.

Through a UI (user interface) provided by the computer Y, the user of the simulation system enters parallel processing designation information for executing parallel processing of a large amount of processing target data (processing target data group) using the parallel computer group X.

Here, the parallel processing designation information can include designations of: a plurality of simulation data files to be processed in parallel (processing target data file group); the processing contents (processing type, detailed processing parameters) of the parallel computer group X for the processing target data file group; the plurality of nodes that perform the parallel processing (number of nodes); and the storage location of the files generated as a result of the parallel processing (processed data files (processing result files)).
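As a rough illustration, the items listed above could be held in a single record such as the following. The class name, field names, the "cutout" processing type, the parameter keys, and the result location are hypothetical; the actual file format of the designation information is the one shown in the drawings, not this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class ParallelProcessingDesignation:
    # Hypothetical container for the items of the designation information.
    target_files: list                                   # file identifiers of the target files
    process_type: str                                    # processing type
    process_params: dict = field(default_factory=dict)   # detailed processing parameters
    node_count: int = 1                                  # number of nodes for parallel processing
    result_location: str = ""                            # storage location of result files

    def validate(self):
        """Reject designations that cannot drive a parallel run."""
        if not self.target_files:
            raise ValueError("at least one processing target data file is required")
        if self.node_count < 1:
            raise ValueError("node_count must be 1 or more")
        return self


designation = ParallelProcessingDesignation(
    target_files=["/data/experimentA/3D/statisticsA/variableB/timeXXX.000.000"],
    process_type="cutout",
    process_params={"lon_min": 120.1, "lon_max": 121.6},
    node_count=4,
    result_location="/results/experimentA",
).validate()
print(designation.node_count)  # 4
```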

Based on the entered parallel processing designation information, the computer Y automatically generates a parallel processing job script (a control program for the parallel computer group X; hereinafter also written as "script") for giving the parallel computer group X control instructions for the parallel processing, and a setting file for the parallel processing program (hereinafter also written as "setting file") that each node executing the parallel processing refers to when processing a processing target data file.
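A minimal sketch of generating these two artifacts from a placement decision is given below. The `PLACE` and `RUN` lines are invented pseudo-statements, not the syntax of any real job scheduler, and the program name, the second file identifier, and the dictionary layout of the setting entries are likewise assumptions made for illustration.

```python
def build_job_script(placement, program="parallel_prog"):
    """Emit data placement statements, then one execution statement per node used."""
    lines = [f"PLACE {file_id} {node}" for file_id, node in placement.items()]
    for node in sorted(set(placement.values())):
        lines.append(f"RUN {node} {program}")
    return "\n".join(lines)


def build_setting_entries(placement, process_description):
    """One entry per target file: file identifier, node identifier, processing contents."""
    return [
        {"file": file_id, "node": node, "process": process_description}
        for file_id, node in placement.items()
    ]


# Hypothetical placement result: file identifier -> node identifier.
placement = {
    "/data/experimentA/3D/statisticsA/variableB/timeXXX.000.000": "#0",
    "/data/experimentA/3D/statisticsA/variableB/timeXXX.000.001": "#1",
}
script = build_job_script(placement)
entries = build_setting_entries(placement, "cutout lon 120.1-121.6")
print(script)
```

Keeping the placement as the single input to both functions means the script and the setting file cannot disagree about which node holds which file.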

In the course of generating the script, the computer Y acquires the metadata (detailed information on the processing target data) for each processing target data file and determines the placement of the processing target data file group on the parallel computer group X. The metadata and the placement result are reflected in the contents of the script.

By executing the script, the computer Y distributes the processing target data file group across a plurality of nodes and instructs those nodes to execute the parallel processing program (job). Each node executes the parallel processing program according to the description in the setting file and processes the processing target data files distributed to it based on the corresponding metadata. This processing creates processing result files, which are stored at the storage location designated in the parallel processing designation information.

<Computer Y>
FIG. 2 is a diagram showing a configuration example of the computer Y. In FIG. 2, the computer Y includes a CPU 1, a main memory (MM: for example, RAM) 2, an external storage device (for example, a hard disk) 3, input/output interfaces (I/F) 4 and 5, and a communication interface 6, which are connected to one another via a bus B.

An input device 7 (a keyboard, a pointing device such as a mouse, and so on) serving as input means is connected to the I/F 4, and a display device (display) 8 serving as output means is connected to the I/F 5. The communication I/F 6 is connected to each of the nodes #0 to #n via a communication line (network).

The external storage device 3 stores a file database (file DB) 31 holding the large number of simulation data files that constitute the large-scale simulation data, a metadata table 32 holding the metadata (detailed information on the simulation data) corresponding to each data file, and a usage and load distribution status table 33 for the nodes (hereinafter written as "status table 33"), which is referred to when the processing target data file group is distributed across a plurality of nodes. The file DB 31 and the metadata table 32 are created in different storage areas.

The external storage device 3 also stores a program that causes the computer Y to function as a management device for the simulation data and metadata and as a control device for the parallel computer group X (nodes #0 to #n).

The CPU 1 loads the program recorded in the external storage device 3 into the MM 2 and executes it, thereby realizing, for example, the following functions.
(1) Provide the user of the simulation system with an environment (UI: user interface) for entering (designating) the parallel processing designation information using the input device 7 and the display device 8.
(2) Create the script and the setting file based on the parallel processing designation information.
(3) When creating the script, search for and acquire the metadata corresponding to each of the plurality of simulation data files designated by the user as processing targets (processing target data file group).
(4) When creating the script, determine the node that processes each processing target data file constituting the processing target data file group (the placement of the processing target data files).
(5) Control the transfer of the processing target data file group and of the processing result files generated by the parallel processing of the processing target data file group.

The CPU 1 corresponds to the receiving means, determining means, control program generating means, file generating means, and judging means according to the present invention. The CPU 1 can also function as receiving means that accepts the designation of a file identifier, extracting means that extracts a keyword for metadata search from the file identifier, and search means that searches for the metadata corresponding to the keyword. The external storage device 3 corresponds to the storage means according to the present invention. The external storage device 3 also functions as metadata storage means storing the metadata searched by the search means.

<Parallel computer group X>
The nodes #0 to #n constituting the parallel computer group X all have the same configuration. FIG. 3 is a diagram showing a configuration example of a node. A node includes a CPU 11, a main memory 12, a calculation processor 13, an external storage device (for example, a hard disk) 14, and a communication interface (communication I/F) 15, which are connected to one another via a bus B1. The communication I/F 15 is connected to the computer Y and to the other nodes via a network.

A node receives the processing target data files transferred from the computer Y through the communication I/F 15 and stores them in the external storage device 14. The node also receives parallel processing commands and the setting file from the computer Y via the communication I/F 15.

The CPU 11 then starts executing the parallel processing program stored in advance in the external storage device 3, according to the description in the setting file. The calculation processor 13 is used for calculations on the processing target data. The calculation processor 13 reads a processing target data file stored in the external storage device 14 into the MM 12 and executes predetermined processing using it (for example, cutting out a predetermined region of the data file, or calculating physical quantities). This predetermined processing is executed based on the metadata.

The predetermined processing generates a processing result file, which is stored in the external storage device 14. The processing result file stored in the external storage device 14 is then moved (transferred) to the predetermined storage location.

The CPU 11 functions as parallel processing means that executes the parallel processing according to the setting file. The CPU 11 can also function as receiving means that accepts the designation of a file identifier, extracting means that extracts a keyword for metadata search from the file identifier, and search means that searches for the metadata corresponding to the keyword. The external storage device 14 functions as metadata storage means storing the metadata searched by the search means.

<Data structures of the DB and tables>
Next, the file DB 31, the metadata table 32, and the usage and load status table (status table) 33 shown in FIG. 2 are described in detail.

<<File DB 31>>
The file DB 31 classifies and stores a large number of simulation data files (hereinafter also written simply as "data files") using a directory structure.

FIG. 4 is a diagram showing an example of the directory structure of the file DB 31. Within the file DB 31, a directory tree is formed starting from a root directory (the directory "data" in FIG. 4), and the directories at each level are given predetermined directory names. Data files are stored in the directories located at the ends of the directory tree and are given predetermined data file names.

A data file is identified using a file identifier. The file identifier is expressed as the sequence of the names (path names) of the directories located on the path through the directory tree from the root directory to the terminal directory, followed by the data file name.

For example, the file identifier of the data file with the data file name "timeXXX.000.000.dat" in FIG. 4 is "/data/experimentA/3D/statisticsA/variableB/timeXXX.000.000". In this way, the file identifier includes the storage location information (file path) of the data file.

In addition, the directory names ("3D", "statisticsA", "variableB", etc.) and the data file name ("timeXXX.000.000") in the file identifier are defined as keywords that indicate details (properties, etc.) of the data in the data file. A keyword consists of one or more arbitrary characters and is placed in at least one position among the directory names and the data file name. No keyword is set in the extension part of the file name. A keyword functions as a search key for retrieving the metadata corresponding to the processing target data.
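As an illustration of how a file identifier breaks down into keyword candidates, here is a minimal sketch. The function name `keyword_candidates` and the default extension ".dat" are assumptions made for this example; the source only states that directory names and the data file name (without its extension) can carry keywords.

```python
def keyword_candidates(file_identifier, extension=".dat"):
    """Return the directory names and the data file name (extension removed).

    Hypothetical helper: the source does not define this function.
    """
    # Split the path-like identifier into its non-empty elements.
    names = [name for name in file_identifier.split("/") if name]
    # Keywords are never set in the extension, so strip it when present.
    if names and names[-1].endswith(extension):
        names[-1] = names[-1][: -len(extension)]
    return names
```

Only the assumed extension is stripped, so an identifier that already lacks it (as in the example above) keeps its full file name.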

The data files do not necessarily have to be stored in a single storage area; they may be distributed across a plurality of storage areas located inside or outside the computer Y.

<<Metadata table>>
The metadata table 32 stores the metadata corresponding to the keywords in file identifiers. FIG. 5 is a diagram illustrating an example of the data structure of the metadata table 32.

In the example shown in FIG. 5, the metadata table 32 consists of a plurality of records, each storing a search key (keyword) and the metadata corresponding to it. The keyword is extracted, as a search key, from the file identifier of the data file designated by the user (the processing target data file).

Metadata is information indicating the details (properties, attributes, etc.) of the simulation data (processing target data). For example, it may be information indicating the physical properties of the processing target data, or information about statistical processing or space-time (length, width, height, and time (year, month, day, hour)). The calculation grid information shown in FIGS. 16A and 16B, for example, is spatial information. As a keyword representing such calculation grid information, a variable name of an arbitrary number of characters is used, for example.

FIG. 5 shows the case where one directory name included in the file identifier corresponds to one piece of metadata. Alternatively, one piece of metadata may be retrieved from a combination of several keywords included in one file identifier. A keyword may also be contained in part of a directory name or data file name (excluding the extension) and be extracted from the file identifier by partial-match search. A configuration in which keywords are set only in the file path portion of the file identifier may also be adopted.

<<Status table 33>>
FIG. 6 is a diagram illustrating an example of the data structure of the status table 33. The status table 33 consists of a plurality of small tables 34, one prepared for each node. All small tables 34 have the same data structure. A small table 34 is a set of records whose elements (items) are the identification information (user ID) of a user permitted to use the node, the maximum size of the node's external storage device 14 that the user may use (permitted maximum capacity), and the capacity of the external storage device 14 that the user is currently using (load). Each small table 34 is given a node identifier, and the information corresponding to that node identifier is stored in the small table.

<User interface (UI)>
In the computer Y shown in FIG. 2, the CPU 1 provides the user of the computer Y, through execution of a program, with an input environment (UI) for the parallel processing designation information.

Using the UI, the user can specify the elements (items) of the parallel processing designation information: the processing target data file group (file identifiers), the plurality of nodes that process the group, the processing content for the group (processing type and detailed parameters), the storage location of the processing result files, and so on.

FIG. 7 is a diagram illustrating an example of the designation screen for the parallel processing designation information provided as the UI. The designation screen is displayed on the screen of the display device 8 through execution of the program by the CPU 1.

In the example shown in FIG. 7, the designation screen includes a file path display field 81, a file list display field 82, and a command input field 83. The file path display field 81 displays the directory (file path) in the file DB 31 that the user has selected with the input device 7.

The file list display field 82 displays a list (file list) of the data files corresponding to the file path shown in the file path display field 81, that is, the data files stored in the terminal directory of that file path. The command input field 83 is used to enter commands for processing the processing target data files.

By operating the input device 7, the user can display (select) a desired file path in the file path display field 81. The contents of the file list display field 82 change according to the selected file path, so that the file list for that path is displayed in the field 82.

By a cursor operation with the input device 7, the user can specify the file identifier of a processing target data file by selecting the desired file name from the file list displayed in the file list display field 82. A plurality of data files can also be selected at once through the cursor operation. In this way, the user can specify the file identifiers of the processing target data files using the file path display field 81 and the file list display field 82.

Using the command input field 83, the user can also specify the nodes (number of nodes) used for parallel processing, the processing content for the processing target data file group, the storage location of the processing result files, and so on.

The system can also be configured so that, when the number of nodes, the processing parameters, and the storage location are to be specified, the available options are displayed on the screen and the user specifies them by selecting the desired option with a cursor operation.

<Generation of script and setting file>
When the user specifies each element of the parallel processing designation information using the UI described above and confirms the specified contents, the parallel processing designation information is stored at a predetermined location in the external storage device 3 as a parallel processing designation information file written in a predetermined format.

FIG. 8 is a diagram illustrating an example of the contents of the parallel processing designation information file. In FIG. 8, the file contains a designation line for computer resources, a designation line for processing details (processing content), and designation lines for the processing target data files and the storage locations of their processing results.

The computer resource designation line (the first line in FIG. 8) contains its identifier ("NODE") and an argument giving the number of nodes used for parallel processing ("3" in the example of FIG. 7).

The processing detail designation line (the second line in FIG. 8) contains its identifier ("PROC"), the processing type ("PROC_A"), and arguments giving the processing parameters that describe the processing details ("120.0 150.0 20.0 50.0").

Each designation line for a processing target data file and storage location (the third and fourth lines in FIG. 8) contains its identifier ("DATA"), the file identifier of the processing target data file, and the identification information of the storage location of the corresponding processing result file (shown as "xxxxx" and "xxxxy"). One such designation line is created for each processing target data file.

This description (the parallel processing designation information file) is created automatically by the CPU 1 when the user specifies the number of nodes, the processing content, the processing target data file group, and the storage locations using the UI.

FIGS. 9, 10, and 11 are flowcharts showing an example of the main routine of the script and setting file creation processing executed by the CPU 1 (FIG. 2). Execution of this processing is triggered, for example, by completion of the parallel processing designation information file or by a processing start instruction from the user.

When the processing shown in FIG. 9 starts, the CPU 1 first performs initialization (step S001). Next, the CPU 1 reads the parallel processing designation information file (FIG. 8) stored in the external storage device 3 into the MM 2 (step S002).

Next, the CPU 1 executes an analysis loop over the parallel processing designation information. In this loop, the CPU 1 takes the designation lines from the parallel processing designation information file one at a time, sets the extracted line as the analysis target line, and analyzes it.

The CPU 1 determines whether the analysis target line taken from the parallel processing designation information file is a computer resource designation line (step S003).

If the analysis target line is a computer resource designation line (S003; YES), the CPU 1 adopts the argument in that line (the number of nodes: "3" in the example of FIG. 8) as the computer resource parameter for the parallel processing and saves it at a predetermined location (a predetermined work area on the MM 2) (step S004). The CPU 1 then takes the next designation line as the analysis target line and returns to step S003.

If it is determined in step S003 that the analysis target line is not a computer resource designation line (S003; NO), the CPU 1 determines whether the analysis target line is a processing detail designation line (step S005).

If the analysis target line is a processing detail designation line (S005; YES), the CPU 1 extracts the processing type and the arguments (the designated processing parameters) from the line. In the example of FIG. 8, "PROC_A" (procedure A) is the processing type and "120.0 150.0 20.0 50.0" are the processing parameters. The CPU 1 adopts the processing type and arguments as the processing parameters for the parallel processing and saves them at a predetermined location (work area) (step S006). The CPU 1 then takes the next designation line as the analysis target line and returns to step S003.

If it is determined in step S005 that the analysis target line is not a processing detail designation line (S005; NO), the CPU 1 concludes that the line is a designation line for a processing target data file and storage location. It therefore extracts the file identifier and the storage location identification information from the line and saves them at a predetermined location (work area) (S007).

The analysis loop described above ends once the last line of the parallel processing designation information file has been processed. The CPU 1 then advances to step S008 in FIG. 10.
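The analysis loop (steps S003 to S007) can be sketched as a simple line parser. This is an illustration only: the whitespace-separated field layout is an assumption inferred from the description of FIG. 8, and `parse_designation` and the dictionary keys are hypothetical names.

```python
def parse_designation(lines):
    """Parse designation lines into a parameter dictionary (hypothetical format)."""
    spec = {"nodes": None, "proc_type": None, "proc_params": [], "data": []}
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if fields[0] == "NODE":
            # S003/S004: computer resource line, argument is the node count.
            spec["nodes"] = int(fields[1])
        elif fields[0] == "PROC":
            # S005/S006: processing type followed by numeric parameters.
            spec["proc_type"] = fields[1]
            spec["proc_params"] = [float(value) for value in fields[2:]]
        else:
            # S007: any other line is treated as a DATA line holding
            # a file identifier and a storage location identifier.
            spec["data"].append((fields[1], fields[2]))
    return spec
```

As in the flowchart, the last branch does not test for "DATA" explicitly; a line that is neither a resource line nor a processing detail line is taken to be a data line.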

In step S008, the CPU 1 outputs the header portion of the parallel processing job script. The header is stored in advance as boilerplate text at a predetermined location in the external storage device 3 and includes a transfer command for the setting file. Also in step S008, the nodes to be used for parallel processing are determined based on the processing target data files and the designated number of nodes. The usage and load status of each of the nodes #0 to #n is managed, for example, by the OS (operating system) of the computer Y. The number of processing target data files and the number of nodes given in the parallel processing designation information file are passed to the OS.

The OS, for example, extracts from the nodes #0 to #n those that the user is permitted to use and selects the designated number of nodes, taking into account the usage and load status of the extracted nodes and the number of files. For example, from the extracted nodes, it chooses the designated number of nodes in ascending order of load as the nodes to be used for parallel processing. The usage and load status of each chosen node is set in the status table 33 as a small table 34. As a result, the processing target data file group is processed in parallel by the designated number of nodes chosen by the OS.
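The selection rule in this example (permitted nodes only, least loaded first) can be sketched as follows. The function `select_nodes` and its arguments are hypothetical; the source leaves the actual mechanism to the OS and also mentions the number of files as a factor, which this sketch does not model.

```python
def select_nodes(loads, permitted, count):
    """Pick `count` permitted nodes in ascending order of load.

    loads:     mapping of node identifier -> current load (any comparable unit)
    permitted: set of node identifiers the user may use
    """
    # Keep only nodes the user is allowed to use, least loaded first.
    candidates = sorted(
        (node for node in loads if node in permitted),
        key=lambda node: loads[node],
    )
    if len(candidates) < count:
        raise ValueError("not enough permitted nodes for the requested count")
    return candidates[:count]
```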

Alternatively, the status table 33 (FIG. 6) may store small tables 34 for all of the nodes #0 to #n. In that case the OS refers to the small tables 34, selects the designated number of nodes in ascending order of load, and sets a mask on the small tables 34 of the nodes that were not selected so that they cannot be referenced.

Next, the CPU 1 executes a loop that analyzes and processes the processing target data files. The loop is executed once for each file identifier (processing target data file) obtained in step S007. In this loop, the CPU 1 first picks one file (called the analysis target file) from the designated processing target data file group, that is, the files having the file identifiers obtained in step S007. The CPU 1 then starts the metadata analysis subroutine for this analysis target file (step S009) and passes the file identifier of the analysis target file to the subroutine.

FIG. 12 is a flowchart illustrating an example of the metadata analysis and acquisition subroutine. In FIG. 12, the CPU 1 first accepts the data file designation (step S101); that is, it receives the file identifier of the analysis target file.

Next, the CPU 1 determines whether the file identifier has the correct format (step S102). If it does not (S102; NO), the processing is treated as failed (NG) and the script and setting file creation processing ends. In this case, the system can be configured to perform error display processing and notify the user of the error.

If the file identifier has the correct format (S102; YES), the CPU 1 starts the keyword acquisition loop. In this loop, the CPU 1 first determines whether the file identifier contains a keyword representing metadata (S103).

For example, the CPU 1 extracts the directory name that follows the root directory in the file identifier, compares it against the keyword list in the metadata table 32 (FIG. 5), that is, the set of keywords stored in the metadata table 32, and searches for a keyword that matches the extracted directory name.

If no matching keyword is found, the CPU 1 extracts the next directory name and compares it against the keyword list. The CPU 1 repeats this extraction of a directory name or data file name and comparison against the keyword list until it finds a directory name or data file name that matches one of the keywords.

When a keyword matching the extracted directory name or data file name is found (S103; YES), the CPU 1 suspends the extraction and retrieves the metadata corresponding to the keyword from the metadata table 32 (step S104).

For example, suppose the above processing is applied to the file identifier "/data/experimentA/3D/statisticsA/variableB/timeXXX.000.000.dat" (FIG. 4) using the metadata table 32 with the contents shown in FIG. 5. When the directory name "3D" is extracted from the file identifier and compared against the keyword list, the metadata "meta01" corresponding to "3D" is retrieved from the metadata table 32.

After retrieving the metadata from the metadata table 32, the CPU 1 resumes the extraction of directory names or the data file name and the comparison against the keyword list for the same file identifier. As a result, for example, the directory name "statisticsA" that follows "3D" is used as a keyword, and the corresponding metadata "meta1" is retrieved from the metadata table 32.

The keyword acquisition loop ends (S104; NO) once the comparison for the data file name has finished (or, if a matching keyword was found, once the corresponding metadata has been retrieved). The subroutine shown in FIG. 12 (S009) then ends, and processing returns to step S010 of the main routine (FIG. 10).

In this way, when the user designates the file identifier of the processing target data, the computer Y automatically identifies (acquires) the metadata corresponding to that data using the property information (keywords) contained in the file identifier.
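The subroutine of FIG. 12 can be approximated by the sketch below. Several points are assumptions: the "correct format" check of step S102 is reduced to requiring a leading "/", the extension is assumed to be ".dat", exact matching is used, and the function name `collect_metadata` and the table contents in the usage example are hypothetical.

```python
def collect_metadata(file_identifier, metadata_table, extension=".dat"):
    """Return (keyword, metadata) pairs found in a file identifier.

    metadata_table maps keyword -> metadata, standing in for the metadata table 32.
    """
    # S102: reject identifiers that do not look like an absolute path (assumed rule).
    names = [name for name in file_identifier.split("/") if name]
    if not file_identifier.startswith("/") or not names:
        raise ValueError("file identifier has an invalid format")
    # The extension never carries a keyword, so drop it from the file name.
    if names[-1].endswith(extension):
        names[-1] = names[-1][: -len(extension)]
    # S103/S104: walk the names in path order and look each one up.
    found = []
    for name in names:
        if name in metadata_table:
            found.append((name, metadata_table[name]))
    return found
```

With a table such as `{"3D": "meta01", "statisticsA": "meta1"}`, the identifier from FIG. 4 yields the two pairs in path order, mirroring the example in the text.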

In step S010, the CPU 1 analyzes the metadata and determines whether the parallel processing of the processing target data file currently handled by the loop (the analysis target file) requires not only that file but also data related to it (related data files).

For example, when a fluid flow velocity calculation is executed in parallel, the X, Y, and Z components of the velocity are required. If the analysis target file examined in step S010 is the data file holding the X component of the velocity, the data files holding the Y and Z components are required as related data files.

Here, the file identifier can contain, in a directory name or the data file name, a character or character string giving component information that indicates whether the file holds the X, Y, or Z component. The file identifiers of the Y and Z component data files corresponding to the data file of a given component (for example, the X component) are created by changing, in a fixed manner, the component information character or character string in the file identifier of the X component data file. For example, replacing the component information character "X" in the file identifier with "Y" or "Z" yields the file identifier of the corresponding Y or Z component data file.

In step S010, if analysis of the metadata obtained in step S009 shows that the analysis target file is, for example, an X component data file, the CPU 1 determines that related data files are required (S010; YES) and advances to step S011. Otherwise (S010; NO), the CPU 1 advances to step S012.

In step S011, the CPU 1 generates the file identifiers of the related data files. As described above, such a file identifier can be generated, for example, by changing part of the file identifier of the analysis target file. The generated file identifiers are stored in the work area on the MM 2 together with the file identifier of the analysis target file, as one set.

The related data files are stored in the file DB 31 in such a way that each one actually resides at the file path indicated by its generated file identifier. Processing then advances to step S012.
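Step S011 can be illustrated with the sketch below. The source does not say where in the identifier the component information sits, so this example assumes it is the trailing character of one path element that the caller names explicitly (for instance a directory "velocityX"); `related_identifiers` and that directory name are hypothetical.

```python
def related_identifiers(file_identifier, component_name, components=("X", "Y", "Z")):
    """Derive sibling identifiers by swapping the component suffix of one path element."""
    names = file_identifier.split("/")
    if component_name not in names:
        raise ValueError("component element not found in identifier")
    # Determine which component the analysis target file represents.
    current = next((c for c in components if component_name.endswith(c)), None)
    if current is None:
        raise ValueError("element carries no component information")
    base = component_name[: -len(current)]
    index = names.index(component_name)
    related = []
    for component in components:
        if component == current:
            continue
        sibling = names.copy()
        sibling[index] = base + component  # e.g. velocityX -> velocityY
        related.append("/".join(sibling))
    return related
```

Restricting the replacement to one named element avoids touching unrelated characters, such as the "XXX" in a file name like "timeXXX.000.000.dat".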

In step S012, a subroutine is executed that decides the placement of the analysis target file (the designated data file), or of the analysis target file together with its related data files.

FIG. 13 is a flowchart illustrating an example of the placement decision subroutine (S012). In FIG. 13, when processing starts, the CPU 1 first estimates the size of the data files to be placed on a node and the computer resource A required for the processing (step S201).

Specifically, the CPU 1 obtains the size of the analysis target file (for example, from the metadata). It then estimates the size of the processing result file that would be created if the processing specified by the processing detail parameters obtained in step S006 (FIG. 9) were executed on the analysis target file according to the corresponding metadata. The CPU 1 calculates the sum of the size of the analysis target file and the size of the processing result file as the computer resource A.

For example, when the processing content specified by the processing detail parameters extracts part of the analysis target file according to a specified extraction range, the size of the processing result file is derived from that extraction range.

When related data files exist for the analysis target file, it is preferable for processing efficiency that the analysis target file and its related data files be processed on the same node. Therefore, if related data files exist in step S201, their sizes and the sizes of their processing result files are also included in the computer resource A. The size of a related data file and of its processing result file can be estimated, for example, from the size of the analysis target file and of its processing result file.
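The estimate in step S201 can be written out as a short calculation. This sketch assumes that the result size of an extraction is the input size scaled by the fraction of the data covered by the extraction range, and that every related data file has the same sizes as the analysis target file; both are simplifying assumptions, and `estimate_resource_a` is a hypothetical name.

```python
import math


def estimate_resource_a(input_bytes, extracted_fraction, related_count=0):
    """Estimate computer resource A in bytes for one analysis target file."""
    if not 0.0 <= extracted_fraction <= 1.0:
        raise ValueError("extracted_fraction must be between 0 and 1")
    # Result file size derived from the extraction range (rounded up).
    result_bytes = math.ceil(input_bytes * extracted_fraction)
    per_file = input_bytes + result_bytes
    # Related files are assumed to match the analysis target file in size.
    return per_file * (1 + related_count)
```

For a 100-byte file with a quarter of the data extracted and two related files (Y and Z components), the estimate is (100 + 25) × 3 = 375 bytes.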

Next, the CPU 1 refers to the status table 33 (FIG. 6) and searches for a node that can provide the user with capacity equal to the computer resource A and that is expected to have the lightest load under the current load distribution (step S202).

Specifically, the CPU 1 refers to the status table 33 and looks up the user's record in each small table 34. The user ID has already been entered into the computer Y by the user, for example when starting to use the simulation system, and the CPU 1 refers to the records corresponding to this user ID.

Next, the CPU 1 subtracts the load (the size currently in use) from the maximum size in each record to obtain the user's remaining usable size on each node. The CPU 1 then chooses the node with the largest usable size (the lightest load) as the node on which to place the analysis target file (and its related data files).

Next, the CPU 1 updates the status table 33 based on the computer resource A (step S203). That is, the CPU 1 adds the value of the computer resource A to the load value (size in use) in the small table 34 corresponding to the chosen node.

For example, in the example shown in FIG. 6, if it is decided to place the computer resource A of user A (assumed here to be 10 gigabytes) on node #0, the load value in the corresponding small table 34 is updated to "20Gbyte".

When the update of the status table 33 is complete, the CPU 1 ends the subroutine and passes the identifier of the node chosen as the file placement destination to the main routine.
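Steps S202 and S203 together amount to "find the node with the most free capacity for this user, then charge the resource to it". The sketch below assumes the status table is a nested dictionary with `max` and `load` values in gigabytes; the function `place`, the key names, and the capacities in the usage example are hypothetical.

```python
def place(status_table, user_id, resource_a):
    """Choose a node for the user's files and record the added load.

    status_table: node identifier -> {user ID -> {"max": int, "load": int}}
    """
    best_node, best_free = None, -1
    for node_id, small_table in status_table.items():
        record = small_table.get(user_id)
        if record is None:
            continue  # the user is not permitted on this node
        free = record["max"] - record["load"]
        # S202: the node must fit resource A; prefer the largest free size.
        if free >= resource_a and free > best_free:
            best_node, best_free = node_id, free
    if best_node is None:
        raise RuntimeError("no node can provide the requested capacity")
    # S203: add resource A to the load of the chosen node.
    status_table[best_node][user_id]["load"] += resource_a
    return best_node
```

With a table in which user A has 10 GB in use on node #0, placing a 10 GB resource there raises the recorded load to 20 GB, which matches the update described for FIG. 6.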

When processing advances to step S013 of the main routine, the CPU 1 outputs a statement concerning the placement of data on the node (called a "data placement statement").

Specifically, the CPU 1 reads a template of the data placement command statement (stored in advance in the external storage device 3). The template is a fixed-form command statement that is completed by writing the identifier of the file to be placed and a node identifier at predetermined positions. The CPU 1 writes the identifier of the analysis target file (and any related data file) and the node identifier obtained in step S012 at those positions. The completed data placement command statement becomes part of the parallel processing job script.
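The template completion of step S013 can be illustrated with a short Python sketch. The template text `place $file_id on $node_id` and the function name `data_placement_commands` are hypothetical; the description does not give the actual command syntax, so the sketch only shows the idea of writing two identifiers into fixed positions.

```python
from string import Template

# Hypothetical template text; the real command syntax is not given in the description.
DATA_PLACEMENT_TEMPLATE = Template("place $file_id on $node_id")


def data_placement_commands(file_ids, node_id):
    """Complete one command statement per file (analysis target and related files)."""
    return [
        DATA_PLACEMENT_TEMPLATE.substitute(file_id=file_id, node_id=node_id)
        for file_id in file_ids
    ]


# An analysis target file and its related data file go to the same node.
commands = data_placement_commands(
    ["/sim/ocean/temp_001.dat", "/sim/ocean/grid_001.dat"], "node0"
)
```

The processing result movement command statement of step S014 could be completed in the same way, with the storage location as the only placeholder.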

Next, the CPU 1 outputs a command statement that moves the processed data (the processing result file) to the storage location after the parallel processing has finished (referred to as a "processing result movement command statement") (step S014).

Specifically, the CPU 1 reads a template of the processing result movement command statement (stored in advance in the external storage device 3). The template is a fixed-form command statement that is completed by writing the storage location specified through the UI at a predetermined position. The CPU 1 writes, at that position, the storage location obtained in step S007 for the processing result file of the analysis target file. The completed processing result movement command statement becomes part of the parallel processing job script.

Next, the CPU 1 stores data placement information (step S015). That is, the CPU 1 stores the correspondence between the file identifier and the node identifier, as data placement information, in a predetermined storage area.

If, when step S015 ends, there remains a file identifier of a processing target data file that has not yet been handled as an analysis target file, processing returns to step S009 and steps S009 to S015 described above are executed again. When the file identifiers of all processing target data files have been processed, processing proceeds to step S016.

Through this loop, the placement destination of each processing target data file in the processing target data file group is determined so that the load in the parallel processing is minimized.

In step S016, the CPU 1 outputs a parallel processing program execution statement. That is, the CPU 1 reads the parallel processing program execution statement stored in advance in the external storage device 3 and sets it as part of the parallel processing job script. In this way, a parallel processing job script containing the header, the data placement command statements, the processing result movement command statements, and the parallel processing program execution statement is generated automatically.
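How the parts come together into one job script can be sketched as below. This is only an illustration under assumptions: the class `FilePlan`, the function `build_job_script`, the command texts, and the line-by-line ordering (header, then one placement and one movement statement per file as produced by the loop of steps S009 to S015, then the execution statement) are introduced for the example and are not prescribed by the description.

```python
from dataclasses import dataclass


@dataclass
class FilePlan:
    """Statements produced for one processing target data file (hypothetical type)."""
    placement_cmd: str     # completed data placement command statement (S013)
    result_move_cmd: str   # completed processing result movement statement (S014)


def build_job_script(header_lines, plans, exec_statement):
    """Concatenate header, per-file statements, and the execution statement (S016)."""
    lines = list(header_lines)
    for plan in plans:
        lines.append(plan.placement_cmd)
        lines.append(plan.result_move_cmd)
    lines.append(exec_statement)
    return "\n".join(lines) + "\n"


# All statement texts below are placeholders, not real scheduler commands.
script = build_job_script(
    ["# header: transfer setting file to each node"],
    [FilePlan("place temp_001.dat on node0", "move result of temp_001.dat to /results")],
    "run parallel_program",
)
```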

Next, the CPU 1 starts creating the setting file for the parallel processing program (step S017: FIG. 11). The CPU 1 begins a loop that creates the parallel processing program settings; the loop is executed once for each processing target data file.

When the loop starts, the CPU 1 creates the setting for a processing target data file based on the data placement information (the correspondence between file identifier and node identifier) (S018).

Specifically, the CPU 1 extracts from the data placement information obtained in step S015 the portion relating to one processing target data file and combines it with the processing parameters (acquired in step S006) corresponding to that file identifier. The CPU 1 writes the combined result in the predetermined format for the setting file.

The CPU 1 performs this processing for each processing target data file. When the processing of step S019 has finished for all processing target data files, the main routine ends.

FIG. 14 shows an example of a setting file for the parallel processing program. In the example of FIG. 14, the setting file consists of multiple lines, one for each processing target data file.

Each line contains, from left to right in FIG. 14, a node identifier, a processing designation ("PROC_A" in this example), the file identifier of the processing target data file, and the processing parameters. Each node refers to this setting file when it executes the parallel processing program.
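A setting file with the four fields listed above could be produced as in the following Python sketch. The whitespace separator, the function names `setting_line` and `build_setting_file`, and the sample identifiers and parameter values are assumptions; only the field order and the designation "PROC_A" come from the description.

```python
def setting_line(node_id, process, file_id, params):
    """One line: node identifier, processing designation, file identifier, parameters."""
    return " ".join([node_id, process, file_id, *params])


def build_setting_file(placement, process, params_by_file):
    """Combine data placement information (file -> node) with per-file parameters."""
    return [
        setting_line(node_id, process, file_id, params_by_file[file_id])
        for file_id, node_id in placement.items()
    ]


# Hypothetical placement information and processing parameters.
placement = {"/sim/ocean/temp_001.dat": "node0", "/sim/ocean/temp_002.dat": "node1"}
params = {"/sim/ocean/temp_001.dat": ["0.5"], "/sim/ocean/temp_002.dat": ["0.5"]}
lines = build_setting_file(placement, "PROC_A", params)
```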

<Script execution>
When the script and the setting file have been created, the CPU 1 starts executing the script. As the script runs, the computer Y transfers the setting file to each node of the parallel computer group X in accordance with the setting file transfer command statement in the header.

By executing the data placement command statements, the computer Y also transfers each processing target data file (the processing target data file group) stored in the file DB 31 to its destination node in accordance with the data placement information.

By executing the processing result movement command statements, the computer Y instructs each node to store the processing result file (processed data), created when the node processes its processing target data file, in the designated storage location (prepared, for example, in the file DB 31).

By executing the parallel processing program execution statement, the computer Y instructs each node to start executing the parallel processing program.

<Parallel processing>
Each node (FIG. 3) to which files of the processing target data file group are assigned receives the setting file and the processing target data files from the computer Y via the network. These are stored in the external storage device 14 of the node. Thereafter, when the CPU 11 of each node receives the instruction from the computer Y to execute the parallel processing program, it starts executing the program.

FIG. 14 is a flowchart showing the execution of the parallel processing program by the CPU 11. When the CPU 11 starts the processing shown in FIG. 14, it first performs initialization (step S301). When initialization is complete, the CPU 11 reads the setting file stored in the external storage device 14 into the MM 12 (step S302).

Next, the CPU 11 executes a loop that processes the processing target data files according to the setting file. In this loop, the CPU 11 takes one line of the setting file as the current line and processes the processing target data file according to the settings written on that line.

In the loop, the CPU 11 first refers to the node identifier on the current line of the setting file and determines whether it is equal to the identifier of its own node (step S303).

If the node identifiers are not equal (S303; NO), the next line of the setting file becomes the current line and step S303 is executed again.

If the node identifiers are equal (S303; YES), the CPU 11 acquires the metadata corresponding to the file identifier written on the current line (step S304).

The processing in step S304 is the same as the subroutine shown in FIG. 12. That is, the CPU 11 refers to the metadata table 32A stored in the external storage device 14 (its data structure is the same as that of the metadata table 32 (FIG. 5)) and searches for and acquires the corresponding metadata.

Next, the CPU 11 processes the processing target data file according to the processing type designation, the processing parameters, and the metadata (step S305). Specifically, the CPU 11 passes the processing type designation, processing parameters, file identifier, and metadata to the calculation processor 13. The calculation processor 13 then reads the processing target data file corresponding to the file identifier from the external storage device 14 into the MM 12 and, based on the metadata, performs the processing specified by the processing type designation and the processing parameters.

When the calculation processor 13 has finished, the CPU 11 outputs the resulting data (processed data) as a processing result file (step S306). The processing result file is, for example, transferred to the computer Y, which stores it in the storage location designated by the user (prepared, for example, in the file DB 31).

The above processing is performed with each line of the setting file in turn as the current line. When all lines have been processed, execution of the parallel processing program ends.
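The per-node loop of steps S303 to S306 can be sketched in Python as follows. The function `run_on_node`, the whitespace-separated line format, and the two callables standing in for the metadata lookup and for the calculation processor 13 are assumptions made for the illustration.

```python
def run_on_node(own_node_id, setting_lines, lookup_metadata, process):
    """Process only the setting file lines addressed to this node."""
    results = []
    for line in setting_lines:
        if not line.strip():
            continue  # ignore blank lines (a convenience of this sketch)
        node_id, proc_type, file_id, *params = line.split()
        if node_id != own_node_id:
            continue  # S303: NO, move on to the next line
        metadata = lookup_metadata(file_id)  # S304: fetch metadata for the file
        # S305 and S306: run the designated processing and keep its result
        results.append(process(proc_type, file_id, params, metadata))
    return results


# Hypothetical setting file contents and stand-in callables.
setting_lines = [
    "node0 PROC_A /sim/ocean/temp_001.dat 0.5",
    "node1 PROC_A /sim/ocean/temp_002.dat 0.5",
]
results = run_on_node(
    "node0",
    setting_lines,
    lookup_metadata=lambda file_id: {"grid": "A"},
    process=lambda proc, file_id, params, meta: (proc, file_id, tuple(params), meta["grid"]),
)
```

Because every node receives the same setting file and filters it by its own identifier, no per-node variant of the file has to be generated.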

<Modification>
The embodiment above describes the case where the computer Y and each node have a metadata table. Instead, a configuration may be adopted in which the metadata acquired by the computer Y is transferred to each node.

The embodiment also describes an example in which the storage area for the processing target data files (simulation data files) is provided on the external storage device 3 of the computer Y. The storage area may instead be held by each node, or may be provided on a file server independent of the computer Y and the parallel computer group X.

<Effects of Embodiment>
According to this embodiment, when the user specifies file identifiers, the number of nodes, the processing type, detailed processing parameters, and the storage location using the input environment (UI) for parallel processing designation information, the control program (script) for parallel processing of the processing target data file group and the setting file for running the parallel program are created automatically.

Conventionally, to run parallel processing the user had to write the script by hand and without error, including the data file transfer control; such a script sometimes runs to several hundred lines or more.

According to this embodiment, the desired script and setting file are created automatically once the user specifies or enters, through the UI, the items that make up the parallel processing designation information. This greatly reduces the user's workload. Because less time is spent writing the script, the time needed to obtain the parallel processing results is also shortened. In addition, the risk of having to redo parallel processing because of a user's scripting mistake can be eliminated.

Metadata for the processing target data is searched for and acquired automatically when the user specifies a file identifier. That is, when the user specifies a file identifier, keywords are extracted from it, and the metadata corresponding to those keywords is treated as the designated metadata. The user therefore does not need to enter a metadata designation for each processing target data file. This reduces the user's workload, shortens processing time, and prevents input errors.

To enable this automatic designation of metadata, the present embodiment uses, for each processing target data file, a file identifier that includes the storage location information (file path) of the data, and includes in it keywords that indicate the nature of the processing target data (keywords for metadata search).

In other words, the data that associates the processing target data with its metadata is embedded in the file identifier. This removes the need to manage the associating data separately from the processing target data and the metadata, so the storage area is used efficiently and the management burden is reduced. A file identifier can contain multiple keywords.
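Extracting metadata search keywords from a file identifier can be sketched as follows. The sketch assumes that keywords appear as whole path components separated by "/" and that the metadata table is a mapping from keyword to metadata; the function names, the sample path, and the metadata values are hypothetical.

```python
from pathlib import PurePosixPath


def extract_keywords(file_identifier, keyword_list):
    """Return the path components that match a keyword in the keyword list."""
    components = PurePosixPath(file_identifier).parts
    return [component for component in components if component in keyword_list]


def find_metadata(file_identifier, metadata_table):
    """Look up the metadata for every keyword found in the file identifier."""
    keywords = extract_keywords(file_identifier, metadata_table.keys())
    return [metadata_table[keyword] for keyword in keywords]


# Hypothetical metadata table keyed by keyword.
metadata_table = {
    "temperature": {"grid": "A"},
    "velocity": {"grid": "B"},
}
found = find_metadata("/sim/ocean/temperature/run01.dat", metadata_table)
```

Since the keyword travels inside the path, no separate table linking files to metadata has to be maintained, which is the point made in the paragraph above.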

Furthermore, when the user designates a processing target data file, the user does so by specifying a file identifier that includes the file path. Specifying the file identifier thus doubles as keyword input, which reduces the user's workload.

Furthermore, in this embodiment the metadata is stored in a storage area different from that of the processing target data files. This allows the processing target data files to be stored efficiently in their storage area. In addition, each node holds the metadata, so no metadata transfer is needed. This prevents the loss of efficiency that transferring metadata would cause.

The present invention is applicable, for example, to data processing in various numerical simulation systems.

Claims

1. A parallel processing support apparatus comprising:
accepting means for receiving parallel processing designation information including a processing target data file group, the number of computation nodes in a parallel computer group that perform parallel processing on the processing target data file group, and a designation of processing content for the processing target data file group;
storage means storing the usage and load status of each of a plurality of computation nodes included in the parallel computer group;
determining means for determining, based on the designated number of computation nodes and the usage and load status, the designated number of computation nodes that perform the parallel processing and the placement on these computation nodes of each processing target data file constituting the processing target data file group;
control program generation means for automatically generating a parallel processing job script including a data placement command for placing each processing target data file on the designated number of computation nodes according to the placement determined by the determining means, and a parallel processing execution statement for the processing target data file group directed to each computation node on which a processing target data file is placed; and
file generation means for automatically generating a setting file for parallel processing that each computation node on which a processing target data file is placed refers to when processing the processing target data files placed on itself, the setting file including, for each processing target data file, the file identifier of the processing target data file, the identifier of the computation node on which the processing target data file is placed, and a description of the designated processing content.

2. The parallel processing support apparatus according to claim 1, wherein, for each processing target data file constituting the processing target data file group, the determining means selects, from among the designated number of computation nodes included in the parallel computer group, the computation nodes having storage capacity sufficient to store the processing target data file and its corresponding processing result file, and
determines the computation node with the smallest current processing load among the selected computation nodes as the computation node on which the processing target data file is to be placed.

3. The parallel processing support apparatus according to claim 1, wherein the parallel processing designation information includes a designation of the storage location of a processing result file generated as a result of processing the processing target data file, and
the control program generation means generates the control program including a command statement instructing that the processing result file be transferred to the storage location.

4. A parallel processing support apparatus comprising:
accepting means for receiving parallel processing designation information including a processing target data file group, the number of computation nodes in a parallel computer group that perform parallel processing on the processing target data file group, and a designation of processing content for the processing target data file group;
storage means storing the usage and load status of each of a plurality of computation nodes included in the parallel computer group;
determining means for determining, based on the designated number of computation nodes and the usage and load status, the placement on the designated number of computation nodes of each processing target data file constituting the processing target data file group;
control program generation means for automatically generating a control program including a data placement command for placing each processing target data file on the designated number of computation nodes according to the placement determined by the determining means, and a parallel processing execution statement for the processing target data file group directed to each computation node on which a processing target data file is placed; and
file generation means for automatically generating a setting file for parallel processing that each computation node on which a processing target data file is placed refers to when processing the processing target data files placed on itself, the setting file including, for each processing target data file, the file identifier of the processing target data file, the identifier of the computation node on which the processing target data file is placed, and a description of the designated processing content,
wherein a processing target data file is designated by selecting a file identifier from a list displaying a plurality of file identifiers, each file identifier including the file path of a data file stored in one of a plurality of directories forming a directory structure,
the apparatus further comprising:
metadata storage means storing metadata of processing target data;
a keyword list having keywords associated with the metadata;
extraction means for extracting, from the file path portion of the designated file identifier, a keyword for searching for the metadata corresponding to the data file identified by that file identifier, by comparing character strings forming parts of the file path portion with the keyword list and extracting as a keyword a character string that matches at least one keyword in the keyword list;
search means for searching the metadata storage means for the metadata corresponding to the extracted keyword; and
judgment means for judging, for each processing target data file, whether a related data file associated with the processing target data file exists, based on the metadata found by the search means,
wherein, when the judgment means detects a processing target data file having a related data file, the determining means places the processing target data file and its related data file on the same computation node, the control program generation means generates the control program for the processing target data file group with the related data file included as one of the processing target data files, and the file generation means generates the setting file for the related data file.

5. A program for causing a computer to execute the steps of:
accepting parallel processing designation information including a processing target data file group, the number of computation nodes in a parallel computer group that perform parallel processing on the processing target data file group, and a designation of processing content for the processing target data file group;
determining, based on the designated number of computation nodes and on the usage and load status, stored in a storage unit, of each of a plurality of computation nodes included in the parallel computer group, the designated number of computation nodes that perform the parallel processing and the placement on these computation nodes of each processing target data file constituting the processing target data file group;
automatically generating and outputting a parallel processing job script including a data placement command for placing each processing target data file on the designated number of computation nodes according to the placement determined in the step of determining the placement, and a parallel processing execution statement for the processing target data file group directed to each computation node on which a processing target data file is placed; and
automatically generating and outputting a setting file for parallel processing that each computation node on which a processing target data file is placed refers to when processing the processing target data files placed on itself, the setting file including, for each processing target data file, the file identifier of the processing target data file, the identifier of the computation node on which the processing target data file is placed, and a description of the designated processing content.

6. The program according to claim 5, wherein, in the step of determining the placement, for each processing target data file constituting the processing target data file group, computation nodes having storage capacity sufficient to store the processing target data file and its corresponding processing result file are selected from among the designated number of computation nodes included in the parallel computer group, and
the computation node with the smallest current processing load among the selected computation nodes is determined as the computation node on which the processing target data file is to be placed.

7. The program according to claim 5 or 6, wherein the parallel processing designation information includes a designation of the storage location of a processing result file generated as a result of processing the processing target data file, and
in the step of generating the control program, the control program is generated including a command statement instructing that the processing result file be transferred to the storage location.

8. A program for causing a computer to execute the steps of:
accepting parallel processing designation information including a processing target data file group, the number of computation nodes in a parallel computer group that perform parallel processing on the processing target data file group, and a designation of processing content for the processing target data file group;
determining, based on the designated number of computation nodes and on the usage and load status, stored in a storage unit, of each of a plurality of computation nodes included in the parallel computer group, the placement on the designated number of computation nodes of each processing target data file constituting the processing target data file group;
automatically generating and outputting a control program including a data placement command for placing each processing target data file on the designated number of computation nodes according to the placement determined in the step of determining the placement, and a parallel processing execution statement for the processing target data file group directed to each computation node on which a processing target data file is placed; and
automatically generating and outputting a setting file for parallel processing that each computation node on which a processing target data file is placed refers to when processing the processing target data files placed on itself, the setting file including, for each processing target data file, the file identifier of the processing target data file, the identifier of the computation node on which the processing target data file is placed, and a description of the designated processing content,
wherein a processing target data file is designated by selecting a file identifier from a list displaying a plurality of file identifiers, each file identifier including the file path of a data file stored in one of a plurality of directories forming a directory structure,
the program further causing the computer to execute:
an extraction step of extracting, from the file path portion of the designated file identifier, a keyword for searching for the metadata corresponding to the data file identified by that file identifier, by comparing character strings forming parts of the file path portion with a keyword list having a group of keywords associated with metadata and extracting as a keyword a character string that matches at least one keyword in the keyword list;
a step of searching a metadata storage means, which stores metadata of processing target data, for the metadata corresponding to the extracted keyword; and
a judgment step of judging, for each processing target data file, whether a related data file associated with the processing target data file exists, based on the metadata found in the searching step,
wherein, when a processing target data file having a related data file is detected in the judgment step, the processing target data file and its related data file are placed on the same computation node in the step of determining the placement, the control program is generated for the processing target data file group with the related data file included as one of the processing target data files in the step of automatically generating and outputting the control program, and the setting file for the related data file is generated in the step of automatically generating and outputting the setting file.

9. A parallel processing support method comprising the steps of:
accepting parallel processing designation information including a processing target data file group, the number of computation nodes in a parallel computer group that perform parallel processing on the processing target data file group, and a designation of processing content for the processing target data file group;
determining, based on the designated number of computation nodes and on the usage and load status, stored in a storage unit, of each of a plurality of computation nodes included in the parallel computer group, the designated number of computation nodes that perform the parallel processing and the placement on these computation nodes of each processing target data file constituting the processing target data file group;
automatically generating and outputting a parallel processing job script including a data placement command for placing each processing target data file on the designated number of computation nodes according to the placement determined in the step of determining the placement, and a parallel processing execution statement for the processing target data file group directed to each computation node on which a processing target data file is placed; and
automatically generating and outputting a setting file for parallel processing that each computation node on which a processing target data file is placed refers to when processing the processing target data files placed on itself, the setting file including, for each processing target data file, the file identifier of the processing target data file, the identifier of the computation node on which the processing target data file is placed, and a description of the designated processing content.

Subject data file group, the number of the plurality of computing nodes in the parallel computer group for performing parallel processing for the processing object data file group, the step of accepting an parallelism designation information including designation of the processing content for the processing target data files And
A step of determining, based on the designated number of calculation nodes and on the use and load status of each of the plurality of calculation nodes included in the parallel computer group, which status is stored in a storage unit, the placement on the designated number of calculation nodes of each processing target data file constituting the processing target data file group;
A step of automatically generating and outputting a control program that includes a data placement command for placing each processing target data file on the designated number of calculation nodes in accordance with the placement determined in the step of determining the placement, and a parallel processing execution statement that causes each calculation node on which a processing target data file is placed to process the processing target data file group; and
A step of automatically generating and outputting a configuration file for the parallel processing, which each calculation node on which a processing target data file is placed refers to when processing the processing target data file placed on itself, the configuration file including, for each processing target data file, a file identifier of the processing target data file, an identifier of the calculation node on which the processing target data file is placed, and a description of the designated processing content,
Wherein the designation of the processing target data files is made by designating a file identifier selected from a list that displays a plurality of file identifiers of data files, each file identifier including the file path of a data file stored in one of a plurality of directories constituting a directory structure, and
The method further includes:
An extraction step of extracting, from the file path portion of the designated file identifier, a keyword for searching for metadata corresponding to the data file identified by the designated file identifier, by comparing character strings that form part of the file path portion with a keyword list having a keyword group associated with metadata, and extracting as the keyword a character string that matches at least one keyword in the keyword list;
A search step of searching a metadata storage means, which stores metadata of the processing target data, for metadata corresponding to the extracted keyword; and
A determination step of determining, for each processing target data file, whether a related data file related to that processing target data file exists, based on the metadata retrieved in the search step, and
When a processing target data file having a related data file is detected in the determination step: in the step of determining the placement, the processing target data file and its corresponding related data file are placed on the same calculation node; in the step of automatically generating and outputting the control program, the control program is generated for a processing target data file group that includes the related data file as one of the processing target data files; and in the step of automatically generating and outputting the configuration file, the configuration file for the related data file is also generated.
Parallel processing support method.
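
The extraction, search, and determination steps, together with the co-location of related data files, can be sketched as follows. This is an illustrative sketch only. It assumes that the file path is split on path separators, dots, underscores, and hyphens, that the metadata storage means can be modeled as a dictionary keyed by keyword, and that each metadata entry lists related files under a "related_files" key; none of these details are stated in the claim, and all names are hypothetical.

```python
import re


def extract_keywords(file_path, keyword_list):
    """Return the path components that match a keyword in the list."""
    tokens = [t for t in re.split(r"[/\\._\-]+", file_path) if t]
    return [t for t in tokens if t in keyword_list]


def find_related(file_path, keyword_list, metadata_store):
    """Look up metadata by extracted keyword and collect related files."""
    related = []
    for keyword in extract_keywords(file_path, keyword_list):
        metadata = metadata_store.get(keyword, {})
        for other in metadata.get("related_files", []):
            if other != file_path and other not in related:
                related.append(other)
    return related


def place_with_related(files, nodes, keyword_list, metadata_store):
    """Round-robin placement that keeps related files on the same node."""
    placement = {}
    next_node = 0
    for path in files:
        if path in placement:
            continue  # already placed as a related file of an earlier entry
        node = nodes[next_node % len(nodes)]
        next_node += 1
        placement[path] = node
        for other in find_related(path, keyword_list, metadata_store):
            placement.setdefault(other, node)
    return placement
```

The returned placement includes the related data files as additional entries, so a control program and configuration file generated from it would cover them as processing target data files, as the claim requires.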