JP5891842B2

JP5891842B2 - Storage system

Info

Publication number: JP5891842B2
Application number: JP2012038143A
Authority: JP
Inventors: スムドゥデマタピティヤ
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2012-02-24
Filing date: 2012-02-24
Publication date: 2016-03-23
Anticipated expiration: 2032-02-24
Also published as: JP2013174984A

Description

The present invention relates to a storage system, and more particularly to a storage system that distributes data and stores it across a plurality of storage devices.

In recent years, with the development and spread of computers, many kinds of information have been converted into digital data. Storage devices such as magnetic tapes and magnetic disks are used to keep such digital data. Because the data to be kept grows daily and becomes enormous, large-capacity storage systems are required. Reliability is also required while the cost spent on storage devices is reduced, and it must be possible to retrieve the data easily later. As a result, there is demand for a storage system that can automatically scale its capacity and performance, that eliminates duplicate storage to reduce storage cost, and that has high redundancy.

In response to this situation, content address storage systems have been developed in recent years, as shown in Patent Document 1. A content address storage system distributes data across a plurality of storage devices, and the location where the data is stored is identified by a unique content address determined from the content of that data.

Specifically, in a content address storage system, block data obtained by dividing given data is further divided into a plurality of fragment data, fragment data serving as redundant data (parity data) is added, and these fragments are distributed across a plurality of storage devices. Later, by designating the content address, the data stored at the location identified by that content address, that is, the fragment data, can be read, and the original data before division can be restored from the plurality of fragments.

The content address is generated so as to be unique to the content of the data. Consequently, for duplicate data, data with identical content can be obtained by referring to the data at the same storage location. Duplicate data therefore need not be stored separately, so duplicate recording is eliminated and the data volume is reduced.

Each piece of fragment data described above is stored in association with metadata that includes information on the block data from which the fragment was derived. For example, the metadata includes configuration information on the components that store the fragments and control information such as the parity setting, and all fragments belonging to the same block data carry identical metadata. So that fragment data and metadata can be stored on each storage node without regard to the state of the disks or nodes, n logical containers called components are prepared in advance. A component can span multiple disks, and each component stores only one of the fragments that make up a given block of data. The placement of components on nodes and on disks is performed autonomously by the system. When block data with the same content as already-stored block data is written, the duplicate is eliminated and the same fragment data is not written a second time.

In such a storage system, when a storage node that stores data fails and is disconnected from the system, the components on that storage node are regenerated on other storage nodes. That is, because the storage system divides the given data into a plurality of fragments and adds further fragments as redundant data, the data can be restored from the remaining fragments even if some fragments are lost. The same applies when a disk or a component within a storage node fails.

The process disclosed in Patent Document 2 for regenerating the data that was stored on a storage node when that node fails is described below with reference to FIGS. 1 and 2.

First, as shown in the upper part of FIG. 1, in a storage system 300 equipped with a plurality of storage nodes 401 to 404, fragment data obtained by dividing block data (the data to be stored) and adding redundant data is distributed across components 1 to 12 formed on the respective storage nodes 401 and so on. If a given storage node fails in this state, processing immediately begins to regenerate the lost fragment data from the fragment data held on the remaining storage nodes.

Specifically, in the regeneration process, as shown in the lower part of FIG. 1, components 10, 11, and 12, which stored data on the failed storage node 404, are first regenerated on the operating storage nodes 401 to 403. Then, as shown in the upper part of FIG. 2, fragment data 1 to 9 stored on the operating storage nodes 401 to 403 is read, the block data D from which those fragments were derived is regenerated from fragment data 1 to 9, and the lost fragment data is regenerated by dividing the data D again. After that, as shown in the lower part of FIG. 2, the regenerated fragment data is stored in the newly generated components 10, 11, and 12, that is, distributed across the operating storage nodes 401 to 403. Note that some data is inaccessible until this series of processing is complete.

Patent Document 1: JP 2005-235171 A
Patent Document 2: JP 2011-154428 A

On the other hand, in the storage system described above, the number of parities (redundant data) can be set freely for each file system. As a result, data with different numbers of parities is mixed within the same component, and therefore data with different parity counts is also stored on the same physical disk. When a node or disk fails in a storage system made up of multiple storage nodes, part of the fragment data is lost, and depending on the number of failed nodes or disks and the configured number of parities, the data may not be regenerable from the remaining fragments. For example, data stored with x parities can still be regenerated even if x nodes fail. However, if further node or disk failures occur before that regeneration completes and y more fragments become unreadable, data whose parity count is anywhere from x to (x + y - 1) can no longer be regenerated (data loss).
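The arithmetic in the example above can be sketched as follows. This is a minimal illustration that assumes a block can be rebuilt as long as the number of lost fragments does not exceed its parity count; the function names and the `max_parity` bound are hypothetical and do not appear in the source.

```python
def is_recoverable(parity_count: int, lost_fragments: int) -> bool:
    """A block survives while the lost fragments do not exceed its parity count."""
    return lost_fragments <= parity_count


def unrecoverable_parity_counts(x: int, y: int, max_parity: int) -> list[int]:
    """Parity settings (from x upward) that cannot be regenerated once
    x + y fragments of a block are unreadable."""
    lost = x + y
    return [p for p in range(x, max_parity + 1) if not is_recoverable(p, lost)]


# x = 1 failure is still being repaired when y = 2 more fragments become unreadable.
print(unrecoverable_parity_counts(x=1, y=2, max_parity=4))  # [1, 2]
```

With x = 1 and y = 2 the affected parity counts are 1 and 2, which matches the range x to (x + y - 1) stated in the text.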

As described above, in such a storage system, stored data may be lost because of node failures, which lowers reliability.

An object of the present invention is therefore to solve the problem described above, namely the reduced reliability of recorded data in a storage system that has a duplicate-elimination function.

A storage system according to one aspect of the present invention comprises:
a plurality of storage means;
distributed storage processing means for generating a plurality of fragment data, consisting of divided data obtained by dividing storage target data into a plurality of pieces and redundant data for restoring the storage target data, and for distributing and storing the plurality of fragment data in the plurality of storage means; and
data regeneration means for regenerating the fragment data constituting the storage target data that was stored in a storage means in which a failure has occurred, based on other fragment data constituting the storage target data that is stored in other storage means in which no failure has occurred,
wherein the data regeneration means regenerates, based on the other fragment data, the fragment data constituting the storage target data that was stored in the failed storage means, in a priority order based on the number of redundant data among the fragment data constituting the storage target data.
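One possible reading of the priority rule is sketched below. The text here only states that the order is based on the number of redundant data; the sketch assumes that blocks with fewer redundant fragments are regenerated first, since they tolerate fewer further failures. The `LostBlock` type and `regeneration_order` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class LostBlock:
    block_id: str
    parity_count: int  # number of redundant fragments configured for this block


def regeneration_order(blocks: list[LostBlock]) -> list[LostBlock]:
    # Assumption: fewer parity fragments means less tolerance for a further
    # failure, so those blocks are placed at the front of the queue.
    return sorted(blocks, key=lambda b: b.parity_count)


queue = regeneration_order([
    LostBlock("a", 3), LostBlock("b", 1), LostBlock("c", 2),
])
print([b.block_id for b in queue])  # ['b', 'c', 'a']
```

Because `sorted` is stable, blocks with the same parity count keep their original relative order.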

An information processing apparatus according to another aspect of the present invention
comprises at least one of a plurality of storage means in which a plurality of fragment data, consisting of divided data obtained by dividing storage target data into a plurality of pieces and redundant data for restoring the storage target data, is distributed and stored, and
comprises data regeneration means for regenerating the fragment data constituting the storage target data that was stored in a storage means in which a failure has occurred, based on other fragment data constituting the storage target data that is stored in other storage means in which no failure has occurred,
wherein the data regeneration means regenerates, based on the other fragment data, the fragment data constituting the storage target data that was stored in the failed storage means, in a priority order based on the number of redundant data among the fragment data constituting the storage target data.

A program according to another aspect of the present invention
causes an information processing apparatus, which comprises at least one of a plurality of storage means in which a plurality of fragment data, consisting of divided data obtained by dividing storage target data into a plurality of pieces and redundant data for restoring the storage target data, is distributed and stored,
to realize data regeneration means for regenerating the fragment data constituting the storage target data that was stored in a storage means in which a failure has occurred, based on other fragment data constituting the storage target data that is stored in other storage means in which no failure has occurred,
wherein the data regeneration means regenerates, based on the other fragment data, the fragment data constituting the storage target data that was stored in the failed storage means, in a priority order based on the number of redundant data among the fragment data constituting the storage target data.

An information processing method according to another aspect of the present invention comprises:
generating a plurality of fragment data, consisting of divided data obtained by dividing storage target data into a plurality of pieces and redundant data for restoring the storage target data, and distributing and storing the plurality of fragment data in a plurality of storage means; and
when regenerating the fragment data constituting the storage target data that was stored in a storage means in which a failure has occurred, based on other fragment data constituting the storage target data that is stored in other storage means in which no failure has occurred, regenerating that fragment data based on the other fragment data in a priority order based on the number of redundant data among the fragment data constituting the storage target data.

Configured as described above, the present invention can improve the reliability of recorded data in a storage system that has a duplicate-elimination function.

FIG. 1 is a diagram showing the operation of a storage system related to the present invention.
FIG. 2 is a diagram showing the operation of a storage system related to the present invention.
FIG. 3 is a block diagram showing the configuration of the entire system in Embodiment 1 of the present invention.
FIG. 4 is a block diagram showing an outline of the configuration of the storage system disclosed in FIG. 3.
FIG. 5 is a block diagram showing the configuration of the storage system disclosed in FIG. 4.
FIGS. 6 to 12 are explanatory diagrams for explaining the operation of the storage system disclosed in FIG. 5.
FIGS. 13 to 15 are flowcharts showing the operation of the storage system disclosed in FIG. 6.
FIG. 16 is a block diagram showing the configuration of the storage system in Supplementary Note 1 of the present invention.

<Embodiment 1>
A first embodiment of the present invention will be described with reference to FIGS. 3 to 15. FIG. 3 is a block diagram showing the configuration of the entire system. FIG. 4 is a block diagram showing an outline of the storage system, and FIG. 5 is a block diagram showing the detailed configuration of the storage system. FIGS. 6 to 12 are explanatory diagrams for explaining the operation of the storage system, and FIGS. 13 to 15 are flowcharts showing the operation of the storage system.

This embodiment describes the case in which the storage system is configured by connecting a plurality of server computers. However, the storage system of the present invention is not limited to a configuration of multiple computers and may be configured as a single computer (information processing apparatus).

As shown in FIG. 3, the storage system 10 of the present invention is connected via a network N to a backup system 11 that controls backup processing. The backup system 11 acquires the backup target data stored in a backup target device 12 connected via the network N and requests the storage system 10 to store it. The storage system 10 then stores the requested backup target data for backup purposes. Although the storage system 10 of this embodiment is described using backup target data as an example, this is only an example, and any data may be stored.

As shown in FIG. 4, the storage system 10 of this embodiment is configured by connecting a plurality of server computers. Specifically, the storage system 10 includes accelerator nodes 20, which are server computers that control the storage and reproduction operations of the storage system 10 itself, and storage nodes 30, which are server computers equipped with storage devices (storage means) that store data. The numbers of accelerator nodes 20 and storage nodes 30 are not limited to those shown in FIG. 3, and more nodes 20 and 30 may be connected.

Furthermore, the storage system 10 of this embodiment is a content address storage system: it divides data and makes it redundant, distributes and stores it across a plurality of storage devices, and identifies the location where the data is stored by a unique content address set according to the content of the stored data. This content address storage system is described in detail later.

FIG. 5 shows the configuration of the storage system 10. As shown in the figure, the accelerator node 20 of the storage system 10 includes a file system service 21, a block division processing unit 22, a deduplication processing unit 23, and a distributed processing unit 24, each realized by installing a program in an arithmetic device such as a CPU (Central Processing Unit). All or part of these functions may instead be provided in the storage nodes 30 of the storage system 10. Each component of the storage system 10 is described in detail below, and its operation is described with reference to the flowcharts of FIGS. 13 to 15.

The file system service 21 functions as a file system that receives the backup target data sent from the backup target device 12 and controls the operation of storing that data on the storage nodes 30. As described later with reference to FIG. 8, a plurality of file system services 21 exist according to, for example, the type of backup target device 12 (file systems 1, 2, 3, and so on), and each adds a different number of redundant data (parity data) to the data (block data (storage target data)).

The block division processing unit 22, the deduplication processing unit 23, and the distributed processing unit 24 perform the processing that distributes and stores the backup target data on the storage nodes 30 and that reads out the data stored on the storage nodes 30. An example of the distributed storage processing performed by the processing units 22, 23, and 24 is shown in FIGS. 6 and 7.

First, when the storage system 10 receives data A, which is backup target data (arrow Y1 in FIGS. 6 and 7, step S1 in FIG. 13), the block division processing unit 22 divides the data A into block data D (storage target data) of a predetermined size (for example, 64 KB), as indicated by arrow Y2 in FIG. 7 (step S2 in FIG. 13). Then, based on the content of the block data D, a unique hash value H representing that content is calculated (arrow Y3 in FIG. 7). For example, the hash value H is calculated from the content of the block data D using a preset hash function.
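The splitting and hashing step might look like the following minimal sketch. The 64 KB block size comes from the example in the text; the choice of SHA-256 is an assumption, since the text only refers to "a preset hash function", and the function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 64 * 1024  # the 64 KB example size from the text


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut the input into consecutive blocks; the last one may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def block_hash(block: bytes) -> bytes:
    # SHA-256 is an assumption; the source does not name the hash function.
    return hashlib.sha256(block).digest()


blocks = split_into_blocks(b"x" * (150 * 1024))
hashes = [block_hash(b) for b in blocks]
```

Here the first two blocks have identical content and therefore identical hash values, which is the property the deduplication step relies on.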

Next, in the deduplication processing unit 23, the storage system 10 performs deduplication using the hash value H of the block data D in order to eliminate duplicate recording of block data D with the same content (step S3 in FIG. 13). Specifically, as described later, block data D that has already been stored is registered with its hash value H associated with the content address CA representing its storage location. Therefore, if the calculated hash value H of the block data D already exists, it can be determined that block data D with the same content has already been stored (Yes in step S4 in FIG. 13). In this case, the content address CA associated with the registered hash value H that matches the hash value H of the block data D to be stored is acquired, and this content address CA is referenced as the content address CA of the block data D requested to be written. The already-stored data referenced by this content address CA is thus used as the block data D requested to be written, and the block data D of the write request need not be stored.
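The duplicate check can be sketched as a lookup table from hash value to content address. This is only an illustration under stated assumptions: the `DedupIndex` class, the in-memory dictionary, the SHA-256 hash, and the `ca-N` address format are all hypothetical and are not taken from the source.

```python
import hashlib


class DedupIndex:
    """Hypothetical in-memory map from block hash to content address."""

    def __init__(self) -> None:
        self._by_hash: dict[bytes, str] = {}
        self.stored_blocks = 0

    def write(self, block: bytes) -> str:
        h = hashlib.sha256(block).digest()  # assumed hash function
        existing = self._by_hash.get(h)
        if existing is not None:
            return existing  # duplicate: reuse the registered content address
        self.stored_blocks += 1
        address = f"ca-{self.stored_blocks}"  # placeholder address format
        self._by_hash[h] = address
        return address


index = DedupIndex()
first = index.write(b"block D")
second = index.write(b"block D")  # same content, so nothing new is stored
other = index.write(b"block E")
```

The second write returns the address produced by the first, so only two blocks are counted as stored.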

The write processing performed by referring to the content address CA of already-stored block data D, as described above, is carried out by the data storage processing unit 34 of the storage node 30. When the write processing by reference to the content address CA finishes, an "ACK" signal indicating that the write has completed is returned to the host device, such as the backup system 11 or the backup target device 12 (step S10 in FIG. 13). The subsequent processing by the data storage processing unit 34 of the storage node 30 is described later.

If it is determined that the block data D of the write request has not yet been stored, the storage system 10 stores the block data D using the distributed processing unit 24 (steps S5, S6, S7, and S8 in FIG. 13). Specifically, the block data D is first divided into a plurality of fragment data (divided data) of a predetermined size, for example into nine fragment data (divided data 41) as indicated by reference signs D1 to D9 in FIG. 6. The storage system 10 then generates redundant data so that the original block data D can be restored even if some of the divided fragments are missing, and adds it to the divided fragment data 41. For example, three fragment data (redundant data 42) are added, as indicated by reference signs D10 to D12 in FIG. 6. This produces a data set 40 of twelve fragment data made up of nine divided data 41 and three redundant data 42 (arrow Y4 in FIG. 7).
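The idea of dividing a block and adding redundancy can be illustrated with a deliberately simplified sketch. It splits a block into nine data fragments as in the text, but it adds only one XOR parity fragment rather than three redundant fragments; the source does not name the redundancy code, so XOR parity is a swapped-in technique chosen for brevity, and all function names are hypothetical.

```python
from functools import reduce
from typing import Optional

DATA_FRAGMENTS = 9  # D1..D9 in the text


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def make_fragments(block: bytes, k: int = DATA_FRAGMENTS) -> list[bytes]:
    """Split a block into k equal data fragments plus one XOR parity fragment."""
    size = -(-len(block) // k)  # ceiling division
    padded = block.ljust(size * k, b"\0")
    data = [padded[i * size:(i + 1) * size] for i in range(k)]
    return data + [reduce(xor_bytes, data)]


def rebuild_missing(fragments: list[Optional[bytes]]) -> list[bytes]:
    """Recover at most one missing fragment by XOR-ing the survivors."""
    survivors = [f for f in fragments if f is not None]
    if len(fragments) - len(survivors) > 1:
        raise ValueError("a single XOR parity tolerates only one lost fragment")
    return [f if f is not None else reduce(xor_bytes, survivors) for f in fragments]


block = bytes(range(90))
fragments = make_fragments(block)
damaged = fragments[:4] + [None] + fragments[5:]  # lose the fifth fragment
restored = rebuild_missing(damaged)
```

A scheme that tolerates three lost fragments, as in the 9 + 3 example, would need a stronger erasure code (a Reed-Solomon style code is one common choice); this sketch only shows the principle of recovering a lost fragment from the remaining ones.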

Each fragment generated in this way is distributed and stored, by the distributed processing unit 24 and the data storage processing unit 34 of the storage node 30 (described later), in the components C formed on the storage nodes 30. For example, as shown in FIG. 6, when twelve fragment data D1 to D12 are generated, the fragments D1 to D12 are stored one each in the components C, which are data storage areas formed on the storage nodes 30 (see arrow Y5 in FIG. 7). In this way, the processing units 22, 23, and 24 of the accelerator node 20 and the data storage processing unit 34 of the storage node cooperate to function as distributed storage processing means that generates, from the block data obtained by dividing the backup target data, a plurality of fragment data (including redundant data) by further division, and distributes and stores them across a plurality of storage devices.
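The one-fragment-per-component placement can be sketched as a simple mapping. The component labels `C1` to `C12` and the `place_fragments` function are hypothetical; the source only states that each of the twelve fragments goes into a different component.

```python
def place_fragments(fragment_ids: list[str], components: list[str]) -> dict[str, str]:
    """Map each component to exactly one fragment of the block."""
    if len(fragment_ids) > len(components):
        raise ValueError("each component may hold only one fragment per block")
    return dict(zip(components, fragment_ids))


fragments = [f"D{i}" for i in range(1, 13)]
components = [f"C{i}" for i in range(1, 13)]
placement = place_fragments(fragments, components)
```

Because no component receives two fragments of the same block, losing one component costs at most one fragment of that block.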

During the distributed storage processing described above, metadata containing information related to each fragment is also stored in each component C, in addition to the fragment data D1 to D12. This processing is explained together with the functions of the data storage processing unit 34 of the storage node 30.

When the fragment data has been stored as described above, the storage node 30 generates a content address CA that represents the storage location of the fragment data D1 to D12, that is, the storage location of the block data D restored from the fragment data D1 to D12. The content address CA is generated, for example, by combining part of the hash value H calculated from the content of the stored block data D (a short hash, for example the first 8 bytes of the hash value H) with information representing the logical storage location. In the accelerator node 20, which manages the file system within the storage system 10, the content address CA is associated with identification information such as the file name of the backup target data and managed by the file system.
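A content address of this shape could be represented as in the sketch below. The 8-byte short hash follows the example in the text; the SHA-256 hash, the integer encoding of the logical storage location, and the `ContentAddress` type are assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass

SHORT_HASH_BYTES = 8  # "first 8 bytes of the hash value H" in the text


@dataclass(frozen=True)
class ContentAddress:
    short_hash: bytes
    logical_location: int  # hypothetical encoding of the logical storage position


def make_content_address(block: bytes, logical_location: int) -> ContentAddress:
    digest = hashlib.sha256(block).digest()  # assumed hash function
    return ContentAddress(digest[:SHORT_HASH_BYTES], logical_location)


ca = make_content_address(b"block D", logical_location=42)
```

Making the dataclass frozen lets the address serve as a dictionary key, for example in a file-name-to-address table such as the one the file system is described as maintaining.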

When the storage system 10 receives a file read request, it identifies the storage location designated by the content address CA corresponding to the requested file, and can read the fragment data stored at that location as the requested data. As described above, the storage system 10 has functions for reading and writing data.

Next, the configuration of the storage node 30 will be described. In this embodiment a plurality of storage nodes 30 are provided, but since they all have the same configuration, the configuration of one storage node 30 is described.

As shown in FIG. 5, the storage node 30 includes a node/disk failure detector 31, a data regeneration controller 32, a regeneration processor 33, and a data storage processing unit 34, each of which is constructed by installing a program in an arithmetic device such as a CPU (Central Processing Unit) of the node. The storage node 30 also includes a plurality of disks 35, which are storage devices (storage means). Each of these is described in detail below.

As described above, the data storage processing unit 34 (distributed storage processing means) stores fragment data in a distributed manner in the components C configured on the plurality of disks 35 of the plurality of storage nodes 30. The fragment data are generated in the accelerator node 20 by further dividing the block data (storage target data), into which the backup target data has been divided, and adding redundant data. At this time, the data storage processing unit 34 associates each piece of fragment data with metadata containing information related to that fragment data and stores both in the same component C. As described above, when a write process that refers to a content address CA is completed, an "ACK" signal indicating completion of the write process is returned to a higher-level device such as the backup system 11 or the backup target device 12 (step S9 in FIG. 13).

The metadata first includes component configuration information representing the configuration of the components to which the block data underlying the fragment data belongs. For example, the metadata of each piece of fragment data constituting one data set 40 includes component configuration information identifying the components C in which the fragment data of that same data set 40 are stored. The metadata also includes a parity number representing the number of pieces of redundant data added when the fragment data were generated from the block data. For example, one piece of fragment data (shown hatched) is added to the fragment data set generated in file system 1 shown in FIG. 8, so the metadata of each of those pieces of fragment data contains a parity number of "1". Likewise, the metadata of the fragment data generated in file system 2 shown in FIG. 8 contains a parity number of "3", and the metadata of the fragment data generated in file system 3 shown in FIG. 8 contains a parity number of "11".

The metadata further includes a reference count representing the number of times the block data underlying the fragment data is referenced as other block data judged to have the same data content. In the example of FIG. 9, suppose that "file 1" consists of the block data "block 1" and "block 2", and that "file 2" consists of the block data "block 1" and "block 3". In this case, "file 1" is stored by referring to the content addresses "CA1" and "CA2", which point to the storage locations of "block 1" and "block 2", and "file 2" is stored by referring to the content addresses "CA1" and "CA3", which point to the storage locations of "block 1" and "block 3". The actually stored block data "block 1" is then referenced from both "file 1" and "file 2", so its reference count is "2". Accordingly, the metadata of each piece of fragment data constituting "block 1" contains a reference count of "2". Similarly, "block 2" and "block 3" each have a reference count of "1", and the metadata of each piece of fragment data constituting those blocks contains a reference count of "1".
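The reference counts in the FIG. 9 example can be reproduced with a short sketch. The dictionary layout and the function name `reference_counts` are illustrative assumptions; the sketch also assumes that every reference from a file is counted once.

```python
from collections import Counter

# FIG. 9 example: each file is stored as a list of content addresses.
files = {
    "file 1": ["CA1", "CA2"],
    "file 2": ["CA1", "CA3"],
}


def reference_counts(files):
    """Count how many times each content address is referenced by the files."""
    counts = Counter()
    for addresses in files.values():
        counts.update(addresses)
    return counts


counts = reference_counts(files)
print(dict(counts))
```

The block addressed by "CA1" is shared by both files and therefore ends up with a count of 2, while "CA2" and "CA3" each have a count of 1, matching the values stated in the text.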

As described above, the metadata associated with each piece of fragment data contains the component configuration information, the parity number, and the reference count of the block data constituted by that fragment data. The metadata associated with the pieces of fragment data generated from one block of data all have the same content.

The metadata described above is updated by the data storage processing unit 34 at an arbitrary timing. That is, because the reference count contained in the metadata may change each time block data is stored, the unit performs processing to update the reference count in the metadata stored in the component C. Specifically, as described above, when block data to be newly stored is already stored in the storage node 30, the data storage processing unit 34 performs the write process by having the new block data refer to the content address CA of the already stored block data. After that write process is completed (step S10 in FIG. 13), it updates the reference count in the metadata of the referenced block data asynchronously with the write process (step S11 in FIG. 13). The update of the reference count in the metadata is not limited to being performed asynchronously after completion of the write process and may be executed at any timing.
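A minimal sketch of decoupling the write acknowledgement from the reference-count update is given below. The class `BlockStore`, its queue of pending updates, and the convention that a count starts at 0 until the deferred update runs are all assumptions made for illustration; the description only states that the update happens asynchronously after the write completes.

```python
from collections import deque


class BlockStore:
    """Toy model: writes are acknowledged first, counts are updated later."""

    def __init__(self):
        self.reference_count = {}  # content address -> count held in metadata
        self.pending = deque()     # deferred metadata updates

    def write(self, content_address):
        # Register the block if it is new; a duplicate only adds a reference.
        self.reference_count.setdefault(content_address, 0)
        self.pending.append(content_address)
        return "ACK"  # returned before the metadata is touched

    def apply_pending_updates(self):
        # Run later, for example when the node has spare capacity.
        while self.pending:
            self.reference_count[self.pending.popleft()] += 1
```

Calling `write` twice for the same address returns "ACK" both times without changing the stored count; the count becomes 2 only after `apply_pending_updates` has run.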

The node/disk failure detector 31 (data regeneration means) constantly monitors the storage system 10 and detects failures of storage nodes 30 and disks 35 (step S21 in FIG. 14). On detecting such a failure, the node/disk failure detector 31 identifies the failed storage node 30 or disk 35, generates a list of the components presumed lost because of the failure, and passes it to the data regeneration controller 32. For example, as shown in FIG. 10, suppose that a failure occurs in disk 2 of the storage node 30 in which components C2, C8, and C11 are configured. In this case, the detector determines that component C11 formed on disk 2 has been lost and passes information on component C11 to the data regeneration controller 32. A lost component is identified, for example, by referring to information stored in advance in the storage node 30 or the storage system 10 that indicates on which disk of which storage node each component is formed. Alternatively, it may be identified by referring to the component configuration information contained in metadata stored in other storage nodes 30 in which no failure has occurred, or by other methods.
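Identifying lost components from stored placement information can be sketched as a simple lookup. The map below is hypothetical: only the fact that component C11 resides on disk 2 comes from the FIG. 10 example, while the node name and the disks assigned to C2 and C8 are assumptions.

```python
# Hypothetical placement map: component -> (storage node, disk).
# Only "C11 is on disk 2" is taken from the FIG. 10 example.
PLACEMENT = {
    "C2": ("node-1", "disk 1"),
    "C8": ("node-1", "disk 3"),
    "C11": ("node-1", "disk 2"),
}


def lost_components(placement, failed_node, failed_disk=None):
    """Return components on the failed disk, or on the whole node if no disk is given."""
    return sorted(
        component
        for component, (node, disk) in placement.items()
        if node == failed_node and (failed_disk is None or disk == failed_disk)
    )


print(lost_components(PLACEMENT, "node-1", "disk 2"))
```

Passing only a node name models a whole-node failure, in which case every component placed on that node is reported as lost.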

The data regeneration controller 32 (data regeneration means) calculates, from the information passed by the node/disk failure detector 31 and the current component placement, a list of components to be recovered and provisional candidates for the placement (disk) of those components (step S22 in FIG. 14). For example, it calculates a plan in which a component lost through the failure is recovered onto another disk 35 in the same storage node 30 as the failed disk 35. The data regeneration controller 32 also communicates with the data regeneration controllers 32 of the other storage nodes 30, determines from the provisional candidates the final configuration of the components to be recovered, and notifies the data regeneration controllers 32 of the other storage nodes 30 (step S23 in FIG. 14). For example, the final configuration is determined by majority vote among the provisional candidates calculated by the data regeneration controllers 32 of all storage nodes 30. In the example of FIG. 12, it is assumed that, when a failure occurs in component C2, it is decided to recover a new component C2' in the same storage node 30.
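The majority decision over provisional candidates might look like the sketch below. Representing a candidate as a hashable tuple, the function name `decide_final_placement`, and the tie-breaking rule (the candidate encountered first wins) are assumptions; the description does not say how ties are resolved.

```python
from collections import Counter


def decide_final_placement(candidates):
    """Pick the placement proposed by the most nodes.

    `candidates` holds one proposal per storage node. Ties fall to the
    proposal encountered first, which is an assumption of this sketch.
    """
    tally = Counter(candidates)
    placement, _votes = tally.most_common(1)[0]
    return placement


proposals = [
    ("C2'", "node-1", "disk 3"),
    ("C2'", "node-2", "disk 1"),
    ("C2'", "node-1", "disk 3"),
]
print(decide_final_placement(proposals))
```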

Further, when a plurality of pieces of fragment data were stored in the failed component, the data regeneration controller 32 determines the priority order of the fragment data to be recovered (step S24 in FIG. 14). It decides to perform recovery in ascending order of the parity number, that is, the number of pieces of redundant data, of the block data constituted by the fragment data that were stored in the failed component. For example, because the fragment data constituting block data stored in the failed component are also stored in other components that have not failed, the controller refers to the parity number in the metadata of the fragment data stored in those non-failed components to identify the parity number of the block data that was stored in the failed component. The data regeneration controller 32 then notifies the regeneration processor 33 of the determined component recovery configuration and the priority order based on the parity number (step S25 in FIG. 14).

More specifically, the data regeneration controller 32 determines the recovery priority according to the number of pieces of fragment data lost, out of the fragment data constituting the block data, because of the component failure. For example, when x pieces of fragment data have been lost through a component failure, the priority order is determined so that recovery proceeds preferentially in the order of parity numbers x, x+1, x+2, and so on.

Here, with reference to FIG. 11, the determination of the priority order of the block data to be recovered in ascending order of parity number is described in detail. For example, as shown in FIG. 11(A), suppose that one component C stores fragment data F (and the corresponding metadata M) generated from blocks of data with different parity numbers. Each piece of fragment data F constituting one block of data is distributed so that only one piece is stored per component C. Suppose also that the fragment data F denoted P1 has a parity number of "1", the fragment data F denoted P2 has a parity number of "2", and the fragment data F denoted P3 has a parity number of "3".

In this case, when a failure occurs in one component C, the fragment data F in that component C are lost. Block data constituted by fragment data F with a parity number of "1" then cannot be recovered if any further fragment data F is lost. In other words, if another component C fails while the fragment data F of the failed component C are being recovered, the data can no longer be regenerated. It is therefore desirable to recover data with a smaller parity number sooner, and for this reason the fragment data F are recovered in ascending order of parity number. FIG. 11(B) shows a case in which fragment data F (and the corresponding metadata) with different parity numbers are stored in one disk 35. Here too, when the disk 35 fails and fragment data F are lost, recovery is performed in ascending order of the parity number of the block data constituted by the fragment data F.
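The reasoning above can be made concrete with a small sketch: a block with parity number p that has already lost x fragments can survive p - x further losses, so the blocks with the smallest parity number are closest to becoming unrecoverable. The function names are illustrative, and skipping blocks whose parity number is below the number of lost fragments is an assumption (such blocks could no longer be rebuilt).

```python
def remaining_tolerance(parity_number: int, lost: int) -> int:
    """Further fragment losses a block can survive after `lost` fragments are gone."""
    return parity_number - lost


def recovery_order_by_parity(parity_numbers, lost=1):
    """Order parity numbers for recovery: x, x+1, x+2, ... for x lost fragments."""
    recoverable = [p for p in parity_numbers if remaining_tolerance(p, lost) >= 0]
    return sorted(recoverable)


# FIG. 11(A) example: fragments P1, P2, P3 with parity numbers 1, 2, 3.
print(recovery_order_by_parity([3, 1, 2]))
print(remaining_tolerance(1, 1))
```

With one fragment lost, the parity-1 block has a remaining tolerance of 0, which is why it is placed at the front of the recovery order.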

The data regenerator 33 (data regeneration means) receives from the data regeneration controller 32 the component recovery configuration and the priority order based on the parity number (step S31 in FIG. 15). The data regenerator 33 then regenerates the component C according to the received configuration and first recovers the metadata that was stored in that component C (step S32 in FIG. 15). For example, it regenerates the metadata in the regenerated component C by copying metadata stored in other components C that have not failed. In the example of FIG. 10, when disk 2 of the storage node 30 fails and the data in component C11 are lost, component C11 is first regenerated on another disk 3 in the same storage node 30, as indicated by arrow Y11. The metadata in the other storage nodes are then copied into the regenerated component C11, as indicated by arrow Y12.

Next, following the parity-number priority order notified by the data regeneration controller 32, the data regenerator 33 refers to the metadata in the regenerated component C and regenerates the fragment data lost through the failure from the fragment data in the other components C that have not failed. When there are several pieces of fragment data with the same parity number, they are recovered in descending order of the reference count of the block data they constitute (steps S33 to S37 in FIG. 15). For example, when there are several pieces of fragment data to be recovered with a parity number of "1", the reference count is extracted from the metadata of each of them. Among the fragment data to be recovered with a parity number of "1", the fragment data are selected and recovered in descending order of reference count (see arrow Y21 in FIG. 12). Once all fragment data to be recovered with a parity number of "1" have been recovered, the fragment data with a parity number of "2" are recovered in descending order of reference count (see arrow Y22 in FIG. 12).
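The two-level ordering described here (parity number ascending, then reference count descending) maps directly onto a compound sort key. The `FragmentMeta` record and its field names are illustrative assumptions standing in for the metadata held with each fragment.

```python
from dataclasses import dataclass


@dataclass
class FragmentMeta:
    fragment_id: str
    parity_number: int    # number of redundant fragments of the block
    reference_count: int  # how many times the block is referenced


def recovery_order(lost_fragments):
    """Sort by parity number ascending, then reference count descending."""
    return sorted(
        lost_fragments,
        key=lambda m: (m.parity_number, -m.reference_count),
    )


lost = [
    FragmentMeta("f1", parity_number=2, reference_count=5),
    FragmentMeta("f2", parity_number=1, reference_count=1),
    FragmentMeta("f3", parity_number=1, reference_count=4),
    FragmentMeta("f4", parity_number=3, reference_count=9),
]
print([m.fragment_id for m in recovery_order(lost)])
```

Note that a high reference count never lets a fragment overtake one with a smaller parity number: "f4" is recovered last despite having the largest count.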

Fragment data are recovered in descending order of reference count because block data with a larger reference count forms part of more files, so its loss has a greater impact. That is, if other fragment data are lost while the fragment data F in component C are being recovered and recovery becomes impossible, fragment data with a larger reference count affect the reliability of the storage system more severely. Data with a larger reference count is therefore recovered sooner.

The parity number and the reference count that determine the priority for recovering fragment data are, as described above, information held on the storage node 30 side as the metadata of the fragment data. The recovery process can therefore be performed by the storage nodes 30 alone, which limits the effect on the devices accessing the storage system 10.

The above describes updating the metadata asynchronously after the data write process is completed. Because the reference count of block data changes with each write process, this ensures that the reference count is set promptly so that more important fragment data are recovered with priority. For example, an area release process that checks the reference counts of block data and releases the storage areas of unreferenced block data may be executed at an arbitrary timing, but if that area release process is not executed for a long period, important fragment data may fail to be recovered with priority. Compared with such a case, the present invention is effective. On the other hand, updating the metadata synchronously with completion of the data write process would degrade the performance of the storage system as a whole, so in the present invention the metadata is updated after the write process is completed and asynchronously with it. For example, the metadata update may be executed after the write process when the load on the storage system as a whole or on the storage node has a preset margin; the conditions for asynchronous updating may be set arbitrarily.

<Supplementary Notes>
Part or all of the above embodiment can also be described as in the following supplementary notes. The configurations of the storage system (see FIG. 16), information processing apparatus, program, and information processing method according to the present invention are outlined below. However, the present invention is not limited to the following configurations.

(Supplementary Note 1)
A storage system 100 comprising:
a plurality of storage means 110;
distributed storage processing means 101 for generating a plurality of pieces of fragment data consisting of divided data obtained by dividing storage target data into a plurality of pieces and redundant data for restoring the storage target data, and storing the plurality of pieces of fragment data in a distributed manner in the plurality of storage means; and
data regeneration means 102 for regenerating the fragment data constituting the storage target data that was stored in a storage means in which a failure has occurred, based on other fragment data constituting the storage target data stored in other storage means in which no failure has occurred,
wherein the data regeneration means 102 regenerates, based on the other fragment data, the fragment data constituting the storage target data that was stored in the storage means in which the failure occurred, in a priority order based on the number of pieces of the redundant data among the fragment data constituting the storage target data.

(Supplementary Note 2)
The storage system according to Supplementary Note 1, wherein
the distributed storage processing means stores a parity number, representing the number of pieces of the redundant data among the fragment data constituting the storage target data, in association with the fragment data, and
the data regeneration means identifies the number of pieces of the redundant data among the fragment data constituting the storage target data based on the parity number associated with the other fragment data constituting the storage target data stored in the other storage means in which no failure has occurred, and regenerates, based on the other fragment data, the fragment data constituting the storage target data that was stored in the storage means in which the failure occurred, in a priority order based on that number of pieces of redundant data.

(Supplementary Note 3)
The storage system according to Supplementary Note 2, wherein
the data regeneration means regenerates, based on the other fragment data, the fragment data constituting the storage target data that was stored in the storage means in which the failure occurred, in ascending order of the number of pieces of the redundant data among the fragment data constituting the storage target data.

(Supplementary Note 4)
The storage system according to Supplementary Note 3, wherein
the distributed storage processing means, when storing in the plurality of storage means other storage target data whose data content is judged by a preset criterion to be identical to that of storage target data already stored in the plurality of storage means, causes the already stored storage target data to be referenced as the other storage target data, and
the data regeneration means, when the number of pieces of the redundant data among the fragment data constituting the storage target data to be regenerated is the same, regenerates, based on the other fragment data, the fragment data constituting the storage target data that was stored in the storage means in which the failure occurred, in a priority order based on the number of times the storage target data is referenced as the other storage target data.

(Supplementary Note 5)
The storage system according to Supplementary Note 4, wherein
the distributed storage processing means stores a reference count, representing the number of times the storage target data is referenced as the other storage target data, in association with the fragment data constituting the storage target data, and
the data regeneration means, when the number of pieces of the redundant data among the fragment data constituting the storage target data to be regenerated is the same, identifies the number of times the storage target data is referenced based on the reference count associated with the other fragment data constituting the storage target data stored in the other storage means in which no failure has occurred, and regenerates, based on the other fragment data, the fragment data constituting the storage target data that was stored in the storage means in which the failure occurred, in descending order of that number.

(Supplementary Note 6)
The storage system according to Supplementary Note 5, wherein
the distributed storage processing means updates the reference count associated with the already stored storage target data after the write process for the other storage target data, performed by referring to the already stored storage target data, has been completed.

(Supplementary Note 7)
An information processing apparatus comprising at least one of a plurality of storage means in which a plurality of pieces of fragment data, consisting of divided data obtained by dividing storage target data into a plurality of pieces and redundant data for restoring the storage target data, are stored in a distributed manner, the apparatus comprising:
data regeneration means for regenerating the fragment data constituting the storage target data that was stored in a storage means in which a failure has occurred, based on other fragment data constituting the storage target data stored in other storage means in which no failure has occurred,
wherein the data regeneration means regenerates, based on the other fragment data, the fragment data constituting the storage target data that was stored in the storage means in which the failure occurred, in a priority order based on the number of pieces of the redundant data among the fragment data constituting the storage target data.

(Supplementary Note 8)
The information processing apparatus according to Supplementary Note 7, wherein
the storage means stores a parity number, representing the number of pieces of the redundant data among the fragment data constituting the storage target data, in association with the fragment data, and
the data regeneration means identifies the number of pieces of the redundant data among the fragment data constituting the storage target data based on the parity number associated with the other fragment data constituting the storage target data stored in the other storage means in which no failure has occurred, and regenerates, based on the other fragment data, the fragment data constituting the storage target data that was stored in the storage means in which the failure occurred, in a priority order based on that number of pieces of redundant data.

(Supplementary Note 9)
A program for causing an information processing apparatus, which comprises at least one of a plurality of storage means in which a plurality of pieces of fragment data, consisting of divided data obtained by dividing storage target data into a plurality of pieces and redundant data for restoring the storage target data, are stored in a distributed manner, to realize:
data regeneration means for regenerating the fragment data constituting the storage target data that was stored in a storage means in which a failure has occurred, based on other fragment data constituting the storage target data stored in other storage means in which no failure has occurred,
wherein the data regeneration means regenerates, based on the other fragment data, the fragment data constituting the storage target data that was stored in the storage means in which the failure occurred, in a priority order based on the number of pieces of the redundant data among the fragment data constituting the storage target data.

(Appendix 10)
The program according to Appendix 9, wherein
the storage means stores a parity number, representing the number of redundant data among the fragment data constituting the storage target data, in association with the fragment data, and
the data regeneration means identifies the number of redundant data among the fragment data constituting the storage target data based on the parity number associated with the other fragment data constituting the storage target data stored in the other storage means in which no failure has occurred, and regenerates, based on the other fragment data, the fragment data constituting the storage target data that was stored in the storage means in which the failure has occurred, in a priority order based on that number of redundant data.

(Appendix 11)
An information processing method comprising:
generating a plurality of fragment data consisting of divided data obtained by dividing storage target data into a plurality of pieces and redundant data for restoring the storage target data, and storing the plurality of fragment data in a distributed manner in a plurality of storage means; and
when regenerating the fragment data constituting the storage target data that was stored in a storage means in which a failure has occurred, based on other fragment data constituting the storage target data stored in other storage means in which no failure has occurred, performing the regeneration based on the other fragment data in a priority order based on the number of redundant data among the fragment data constituting the storage target data.

(Appendix 12)
The information processing method according to Appendix 11, further comprising:
storing a parity number, representing the number of redundant data among the fragment data constituting the storage target data, in a distributed manner in the plurality of storage means in association with the fragment data; and
identifying the number of redundant data among the fragment data constituting the storage target data based on the parity number associated with the other fragment data constituting the storage target data stored in the other storage means in which no failure has occurred, and regenerating, based on the other fragment data, the fragment data constituting the storage target data that was stored in the storage means in which the failure has occurred, in a priority order based on that number of redundant data.

Note that the above-described program is stored in a storage device or recorded on a computer-readable recording medium. For example, the recording medium is a portable medium such as a flexible disk, an optical disk, a magneto-optical disk, or a semiconductor memory.

Although the present invention has been described above with reference to the embodiment and the like, the present invention is not limited to the embodiment described above. Various changes that can be understood by those skilled in the art may be made to the configuration and details of the present invention within its scope.

10 Storage system
11 Backup system
12 Backup target device
20 Accelerator node
21 File system service
22 Block division processing unit
23 Deduplication processing unit
24 Distribution processing unit
30 Storage node
31 Node/disk failure detection unit
32 Data regeneration controller
33 Regeneration processor
34 Data storage processing unit
35 Disk

Claims

A storage system comprising:
a plurality of storage means;
distributed storage processing means for generating a plurality of fragment data consisting of divided data obtained by dividing storage target data into a plurality of pieces and redundant data for restoring the storage target data, and for storing the plurality of fragment data in a distributed manner in the plurality of storage means; and
data regeneration means for regenerating the fragment data constituting the storage target data that was stored in a storage means in which a failure has occurred, based on other fragment data constituting the storage target data stored in other storage means in which no failure has occurred, wherein
the distributed storage processing means stores a parity number, representing the number of redundant data among the fragment data constituting the storage target data, in association with the fragment data, and, when storing in the plurality of storage means other storage target data whose data content is determined to be the same as that of storage target data already stored in the plurality of storage means, refers to the already stored storage target data as the other storage target data and stores a referenced number, representing the number of times the storage target data is referred to as other storage target data, in association with the fragment data constituting the storage target data, and
the data regeneration means identifies the number of redundant data among the fragment data constituting the storage target data based on the parity number associated with the other fragment data constituting the storage target data stored in the other storage means in which no failure has occurred, and regenerates, based on the other fragment data, the fragment data constituting the storage target data that was stored in the storage means in which the failure has occurred, in order from the smallest number of redundant data; and further, when the number of redundant data is the same among the storage target data to be regenerated, identifies the number of times the storage target data is referred to as other storage target data based on the referenced number associated with the other fragment data constituting the storage target data stored in the other storage means in which no failure has occurred, and regenerates, based on the other fragment data, the fragment data constituting the storage target data that was stored in the storage means in which the failure has occurred, in descending order of the referenced number.
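
The two-level priority recited above (fewest redundant data first, then the largest referenced number among equals) can be sketched as a single sort key. This is a minimal Python illustration under stated assumptions, not the implementation of the embodiment: `LostBlock` and `regeneration_order` are hypothetical names, and the parity number and referenced number are assumed to have already been read from surviving fragments.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class LostBlock:
    """Storage target data that lost a fragment in the failure (hypothetical type)."""
    block_id: str
    parity_number: int      # number of redundant fragments of the block
    referenced_number: int  # times the block is referred to by deduplicated writes


def regeneration_order(blocks: Iterable[LostBlock]) -> List[LostBlock]:
    """Order blocks for regeneration.

    Primary key: parity number, ascending (least protected blocks first).
    Tie-break: referenced number, descending (most shared blocks first).
    """
    return sorted(blocks, key=lambda b: (b.parity_number, -b.referenced_number))
```

For example, given blocks A (parity number 3, referenced number 10), B (1, 2), and C (1, 7), the function returns C, B, A: B and C have the fewest redundant data, and C is referred to more often than B.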

The storage system according to claim 1, wherein
the distributed storage processing means updates the referenced number associated with the already stored storage target data after completion of the write process of the other storage target data, the write process being performed by referring to the already stored storage target data as the other storage target data.
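
The timing constraint in this claim, that the referenced number changes only after the deduplicated write has completed, can be illustrated with the following Python sketch. Several details are assumptions made for the example and are not taken from this publication: the class `BlockIndex`, the `commit` callback standing in for the write process, the use of a SHA-256 digest as the sameness criterion, and the choice of 0 as the referenced number of a newly stored block.

```python
import hashlib
from typing import Callable, Dict


class BlockIndex:
    """Tracks the referenced number of each stored block (hypothetical helper)."""

    def __init__(self) -> None:
        self._referenced: Dict[str, int] = {}  # content key -> referenced number

    def referenced_number(self, key: str) -> int:
        return self._referenced[key]

    def store(self, data: bytes, commit: Callable[[str], None] = lambda key: None) -> str:
        """Store data, deduplicating against blocks that are already stored.

        `commit` represents the write process. The referenced number is touched
        only after it returns, so a failed write leaves the count unchanged.
        """
        key = hashlib.sha256(data).hexdigest()  # assumed sameness criterion
        is_duplicate = key in self._referenced
        commit(key)  # may raise; nothing has been updated yet
        if is_duplicate:
            self._referenced[key] += 1  # one more write refers to the stored block
        else:
            self._referenced[key] = 0   # newly stored block, not yet referred to
        return key
```

Deferring the update keeps the referenced number from counting writes that never completed, which matters here because the count feeds the regeneration priority.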

A program for causing an information processing apparatus, which comprises at least one of a plurality of storage means in which a plurality of fragment data are stored in a distributed manner, the fragment data consisting of divided data obtained by dividing storage target data into a plurality of pieces and redundant data for restoring the storage target data,
to realize data regeneration means for regenerating the fragment data constituting the storage target data that was stored in a storage means in which a failure has occurred, based on other fragment data constituting the storage target data stored in other storage means in which no failure has occurred, wherein
the storage means stores a parity number, representing the number of redundant data among the fragment data constituting the storage target data, in association with the fragment data, and further, when other storage target data whose data content is determined, based on a preset criterion, to be the same as that of storage target data already stored in the plurality of storage means is stored in the plurality of storage means, the already stored storage target data is referred to as the other storage target data, and a referenced number, representing the number of times the storage target data is referred to as other storage target data, is stored in association with the fragment data constituting the storage target data, and
the data regeneration means identifies the number of redundant data among the fragment data constituting the storage target data based on the parity number associated with the other fragment data constituting the storage target data stored in the other storage means in which no failure has occurred, and regenerates, based on the other fragment data, the fragment data constituting the storage target data that was stored in the storage means in which the failure has occurred, in order from the smallest number of redundant data; and further, when the number of redundant data is the same among the storage target data to be regenerated, identifies the number of times the storage target data is referred to as other storage target data based on the referenced number associated with the other fragment data constituting the storage target data stored in the other storage means in which no failure has occurred, and regenerates, based on the other fragment data, the fragment data constituting the storage target data that was stored in the storage means in which the failure has occurred, in descending order of the referenced number.

An information processing method comprising:
generating a plurality of fragment data consisting of divided data obtained by dividing storage target data into a plurality of pieces and redundant data for restoring the storage target data, and storing the plurality of fragment data in a distributed manner in a plurality of storage means;
storing a parity number, representing the number of redundant data among the fragment data constituting the storage target data, in association with the fragment data, and, when storing in the plurality of storage means other storage target data whose data content is determined, based on a preset criterion, to be the same as that of storage target data already stored in the plurality of storage means, referring to the already stored storage target data as the other storage target data and storing a referenced number, representing the number of times the storage target data is referred to as other storage target data, in association with the fragment data constituting the storage target data; and
when regenerating the fragment data constituting the storage target data that was stored in a storage means in which a failure has occurred, based on other fragment data constituting the storage target data stored in other storage means in which no failure has occurred, identifying the number of redundant data among the fragment data constituting the storage target data based on the parity number associated with the other fragment data constituting the storage target data stored in the other storage means in which no failure has occurred, and regenerating, based on the other fragment data, the fragment data constituting the storage target data that was stored in the storage means in which the failure has occurred, in order from the smallest number of redundant data; and further, when the number of redundant data is the same among the storage target data to be regenerated, identifying the number of times the storage target data is referred to as other storage target data based on the referenced number associated with the other fragment data constituting the storage target data stored in the other storage means in which no failure has occurred, and regenerating, based on the other fragment data, the fragment data constituting the storage target data that was stored in the storage means in which the failure has occurred, in descending order of the referenced number.