WO2016023230A1

WO2016023230A1 - Data migration method, controller and data migration device

Info

Publication number: WO2016023230A1
Application number: PCT/CN2014/084526
Authority: WO
Inventors: 蒲贵友
Original assignee: 华为技术有限公司
Priority date: 2014-08-15
Filing date: 2014-08-15
Publication date: 2016-02-18
Also published as: CN104583930A; CN104583930B

Abstract

Provided in an embodiment of the present invention are a data migration method, controller and data migration device. The method comprises the following steps: a controller counts frequency of access to a data block in a first level failure domain, the data block comprising i-th target data, i being an integer greater than or equal to zero and i<N (S201); when the frequency of access to the data block reaches a preset threshold, the controller determines in a second level disk set a second level failure domain in which to store the data block, the determined second level failure domain having not stored the target data (S202); and the controller migrates the data block to the determined second level failure domain (S203). The present invention prevents a plurality of target data from being migrated to the same failure domain when performing hierarchical storage operation, thus ensuring data reliability.

Description

Data migration method, controller and data migration device

The present invention relates to storage technologies, and in particular, to a data migration method, a controller, and a data migration device.

Background

When storage devices store data, for data of higher importance, data reliability is usually ensured by storing multiple copies of data at the same time. For example, multiple copies of data can be stored in different fault domains. When the storage space of one copy of the data is corrupted, other copies of the data are still available. A fault domain is a storage area of a certain range. Data corruption in this area does not affect data in other areas.

At the same time, tiered storage technology has been widely used in storage devices. Tiered storage means that, when the storage media in a storage device differ greatly in speed and price, frequently accessed data (also called hot data) is stored on a high-speed, high-price storage medium, and infrequently accessed data (also called cold data) is stored on a low-speed, low-price storage medium.

The following situation may occur when the above two technologies are applied to the same storage device. For example, two copies of the data are saved in the storage device: one copy is stored in one fault domain of the low-speed, low-price storage medium, and the other copy is stored in another fault domain of the low-speed, low-price storage medium. When the data block in which one of the copies is located becomes a hot data block, the data block needs to be migrated to the high-speed, high-price storage medium to improve its access efficiency. During that migration, the copy may end up in the same fault domain as the other copy of the data. If the storage space of that fault domain is then damaged, the data is lost, and reliability cannot be guaranteed.

Summary of the invention

Embodiments of the present invention provide a data migration method, a controller, and a data migration apparatus, which avoid migrating multiple pieces of target data to the same fault domain during a tiered storage operation, thereby ensuring the reliability of the data.

In a first aspect, an embodiment of the present invention provides a data migration method, where the method is performed by a storage device, the storage device includes at least a controller, a first hierarchical disk set, and a second hierarchical disk set, the first hierarchical disk set includes a first level fault domain, and the second hierarchical disk set includes at least two second level fault domains; the storage device stores N pieces of target data, N is a positive integer greater than 1, N is smaller than the number of first level fault domains, and N is smaller than the number of second level fault domains; the method includes:

The controller counts an access frequency of a data block in the first level fault domain, where the data block includes the i-th target data, i is an integer greater than or equal to zero, and i < N;

When the access frequency of the data block reaches a preset threshold, the controller determines, in the second hierarchical disk set, a second level fault domain in which the data block is to be stored, where the determined second level fault domain does not hold the target data;

The controller migrates the data block to the determined second level fault domain.

With reference to the first aspect, in a first possible implementation, the determining, by the controller, in the second hierarchical disk set, of the second level fault domain in which the data block is to be stored includes:

Determining, by the controller, the number of the second level fault domain according to i, N, and a preset association relationship, where the association relationship is that the number of the second level fault domain is equal to the sum obtained by adding i to the product of N and a random number, the random number is a positive integer greater than or equal to 1, and the random number does not exceed the quotient obtained by dividing the number of second level fault domains included in the storage device by N.

With reference to the first aspect or the first possible implementation of the first aspect, in a second possible implementation, the method further includes:

The controller saves a correspondence between the first level fault domain and the determined second level fault domain;

When the access frequency of the data block is lower than the preset threshold, the controller migrates the data block from the determined second level fault domain to the first level fault domain according to the correspondence.

With reference to the first aspect, or any one of the first to the second possible implementations of the first aspect, in a third possible implementation, the method further includes:

The controller receives the target data, and determines, according to a preset rule, that the number of copies of the target data in the storage device is N;

The controller copies the target data to generate N copies of the target data.

In a second aspect, an embodiment of the present invention provides a controller, where the controller includes a processor and a communication interface;

The communication interface is configured to communicate with a disk array, where the disk array includes a first hierarchical disk set and a second hierarchical disk set, the first hierarchical disk set includes a first level fault domain, and the second hierarchical disk set includes at least two second level fault domains; the disk array holds N pieces of target data, N is a positive integer greater than 1, N is smaller than the number of first level fault domains, and N is smaller than the number of second level fault domains;

The processor is configured to count an access frequency of a data block in the first level fault domain, where the data block includes the i-th target data, i is an integer greater than or equal to zero, and i < N;

When the access frequency of the data block reaches a preset threshold, the processor determines, in the second hierarchical disk set, a second level fault domain in which the data block is to be stored, where the determined second level fault domain does not hold the target data;

The processor migrates the data block to the determined second level fault domain.

With reference to the second aspect, in a first possible implementation, the processor is specifically configured to determine the number of the second level fault domain according to i, N, and a preset association relationship, where the association relationship is that the number of the second level fault domain is equal to the sum obtained by adding i to the product of N and a random number, the random number is a positive integer greater than or equal to 1, and the random number does not exceed the quotient obtained by dividing the number of second level fault domains included in the storage device by N.

With reference to the second aspect, or the first possible implementation of the second aspect, in a second possible implementation, the processor is further configured to save a correspondence between the first level fault domain and the determined second level fault domain;

And when the access frequency of the data block is lower than the preset threshold, the processor is further configured to migrate the data block from the determined second level fault domain to the first level fault domain.

With reference to the second aspect, or any one of the first to the second possible implementations of the second aspect, in a third possible implementation, the processor is further configured to receive the target data, determine, according to a preset rule, that the number of copies of the target data in the storage device is N, and copy the target data to generate N copies of the target data.

In a third aspect, an embodiment of the present invention provides a data migration apparatus, where the data migration apparatus is located in a controller of a storage device, the storage device includes at least the controller, a first hierarchical disk set, and a second hierarchical disk set, the first hierarchical disk set includes a first level fault domain, and the second hierarchical disk set includes at least two second level fault domains; the storage device stores N pieces of target data, N is a positive integer greater than 1, N is smaller than the number of first level fault domains, and N is smaller than the number of second level fault domains; the data migration apparatus includes:

a statistics module, configured to count an access frequency of the data block in the first level fault domain, where the data block includes the i-th target data, i is an integer greater than or equal to zero, and i < N;

a determining module, configured to determine, in the second hierarchical disk set, a second level fault domain in which the data block is to be stored when the access frequency of the data block reaches a preset threshold, where the determined second level fault domain does not hold the target data;

And a migration module, configured to migrate the data block into the determined second level fault domain.

With reference to the third aspect, in a first possible implementation, the determining module is specifically configured to determine the number of the second level fault domain according to i, N, and a preset association relationship, where the association relationship is that the number of the second level fault domain is equal to the sum obtained by adding i to the product of N and a random number, the random number is a positive integer greater than or equal to 1, and the random number does not exceed the quotient obtained by dividing the number of second level fault domains included in the storage device by N.

With reference to the third aspect or the first possible implementation of the third aspect, in a second possible implementation, the data migration apparatus further includes:

a saving module, configured to save a correspondence between the first level fault domain and the determined second level fault domain;

And the migrating module is further configured to migrate the data block from the determined second level fault domain to the first level fault domain according to the correspondence.

With reference to the third aspect, or any one of the first to the second possible implementations of the third aspect, in a third possible implementation, the data migration apparatus further includes: a receiving module, configured to receive the target data and determine, according to a preset rule, that the number of copies of the target data in the storage device is N; and a copying module, configured to copy the target data to generate N copies of the target data.

Embodiments of the present invention provide a data migration method, a controller, and a data migration apparatus, which count an access frequency of a data block in the first level fault domain, where the data block includes one of a plurality of pieces of target data. When the access frequency of the data block in the first level fault domain reaches a preset threshold, the data block is migrated to a second level fault domain that does not hold the target data. This avoids migrating multiple pieces of target data to the same fault domain when performing tiered storage operations, ensuring data reliability.

DRAWINGS

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show some embodiments of the present invention, and a person of ordinary skill in the art may obtain other drawings from these drawings without creative effort.

FIG. 1 is an application scenario diagram according to an embodiment of the present invention;

FIG. 2 is a schematic structural diagram of a storage device according to an embodiment of the present invention;

FIG. 3 is a diagram showing an example of a disk array according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of a fault domain distribution in a disk array according to an embodiment of the present invention;

FIG. 5 is a schematic flowchart of a data migration method according to an embodiment of the present invention;

FIG. 6 is a schematic structural diagram of a data migration apparatus according to an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are clearly and completely described in the following with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative efforts are within the scope of the present invention.

System architecture of an embodiment of the present invention

FIG. 1 is an application scenario diagram of an embodiment of the present invention. As shown in FIG. 1, in this application scenario, the storage system includes a host 100, a connection device 105, and a storage device 110.

The host 100 can include any computing device known in the art, such as a server, a desktop computer, or an application server. An operating system and other applications are installed in the host 100, and there may be multiple hosts 100.

Connection device 105 may include any interface between a storage device and a host known in the art, such as a fiber switch, or other existing switch.

The storage device 110 may include a storage device known in the prior art, such as a disk array, Just a Bunch Of Disks (JBOD), or one or more interconnected disk drives of a direct access storage device (DASD), where the direct access storage device may include a tape library or a tape storage device of one or more storage units.

FIG. 2 is a schematic structural diagram of a storage device 110 according to an embodiment of the present invention. The storage device shown in FIG. 2 is a disk array. As shown in FIG. 2, the storage device 110 may include a controller 115 and a disk array 125, where the disk array here refers to a Redundant Array of Independent Disks (RAID). There may be multiple disk arrays, and the disk array 125 is composed of a plurality of disks 130. The disk array 125 and the controller 115 can be communicatively connected via a communication protocol such as the Small Computer System Interface (SCSI) protocol, which is not limited herein.

The controller 115 is the "brain" of the storage device 110 and mainly includes a processor 118, a cache 120, a memory 122, a communication bus (abbreviated as bus) 126, and a communication interface 128. Processor 118, cache 120, memory 122, and communication interface 128 communicate with one another via communication bus 126.

The communication interface 128 is configured to communicate with the host 100 or the disk array 125.

The memory 122 is used to store the program 124. The memory 122 may include a high-speed RAM memory, and may also include a non-volatile memory, such as at least one disk memory. It can be understood that the memory 122 can be any non-transitory machine-readable medium that can store program code, such as a random access memory (RAM), a magnetic disk, a hard disk, an optical disk, a solid state disk (SSD), or a non-volatile memory.

Program 124 can include program code, the program code including computer operating instructions.

Cache 120 (Cache) is used to cache data received from host 100 or data read from disk array 125 to improve the performance and reliability of the array. The cache 120 may be a non-transitory machine readable medium that can store data, such as a RAM, a ROM, a flash memory, or a solid state disk (SSD), which is not limited herein.

The processor 118 may be a central processing unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. An operating system and other software programs are installed in the processor 118, and different software programs can be regarded as processing modules having different functions, for example, processing input/output (I/O) requests to the disk 130, performing other processing on the data in the disk, and so on. Thereby, the controller 115 can implement various data management functions such as I/O operations, data copy replication, and tiered storage. In the embodiment of the present invention, the processor 118 is configured to execute the program 124, and specifically may perform the related steps in the following method embodiments.

The disk array 125 is used to store data. In the embodiment of the present invention, the disk array 125 may include multiple types of disks 130, such as a Solid State Drive (SSD), a Serial Attached SCSI (SAS) hard disk drive (HDD), a Fibre Channel (FC) HDD, a Serial Advanced Technology Attachment (SATA) HDD, or a Near Line (NL) SAS HDD, where SCSI is the abbreviation of Small Computer System Interface; the disk type is not limited here. For convenience of description, the solid state drive is referred to as SSD, and the SAS HDD, FC HDD, and SATA HDD are simply referred to as HDD.

FIG. 3 is an example of a disk array 125. As shown in FIG. 3, the disk array 125 may include a plurality of disks 130. The plurality of disks 130 may be divided into a high-speed device layer and a low-speed device layer; the high-speed device layer may include multiple high-speed, high-price disks 130, such as SSDs, and the low-speed device layer may include multiple low-speed, low-price disks 130, such as HDDs. It can be understood that, in the embodiment of the present invention, the plurality of disks 130 can be divided into other levels according to the type of the disk. Here, only the two levels of the high-speed device layer and the low-speed device layer are used for illustration.

In addition, the storage space of the disk array 125 can also be divided into a plurality of fault domains. A fault domain is a logical storage area where faults are isolated. Data corruption in this area does not affect data in other areas. In the embodiment of the present invention, the fault domain is set by the storage device 110. A fault domain may be a RAID, a disk, or a storage area divided in another manner, which is not limited herein. For convenience of description, the embodiment of the present invention is described by taking a fault domain as a disk as an example, as shown in FIG. 4:

The disk array 125 includes a high-speed device layer and a low-speed device layer. The high-speed device layer may include multiple SSDs (the figure shows one SSD as an example, but the layer is not limited to one SSD), and the low-speed device layer may include multiple HDDs (the figure shows two HDDs as an example, but the layer is not limited to two HDDs). Each SSD is a fault domain, and each HDD is a fault domain.

In the embodiment of the present invention, the fault domains need to be numbered. The specific numbering rule may be that the fault domains included in each device layer are numbered separately. For example, if the high-speed device layer includes three fault domains, then in the high-speed device layer the fault domains are numbered fault domain 0, fault domain 1, and fault domain 2; if the low-speed device layer contains four fault domains, then in the low-speed device layer the fault domains are numbered fault domain 0, fault domain 1, fault domain 2, and fault domain 3. In the embodiment of the present invention, the number of the fault domain is simply referred to as the fault domain number, and the fault domain number refers to the number used to distinguish each fault domain, such as 0, 1, and the like. It can be understood that the fault domain number can also use other expressions such as a letter, which is not limited in the embodiment of the present invention.

When the storage device 110 stores data, for data of higher importance, such as metadata, the reliability of the data is usually ensured by simultaneously storing a plurality of copies of the data. For example, as shown in FIG. 4, two copies of the data, copy 0 and copy 1, are stored in the low-speed device layer, where copy 0 is stored in fault domain 0 and copy 1 is stored in fault domain 1. If a failure such as disk corruption occurs in fault domain 0, a copy of the data can still be obtained from fault domain 1 without data loss. It will be understood by those skilled in the art that both fault domain 0 and fault domain 1 can save other data in addition to the copies of the data described above. In the embodiment of the present invention, the copy number refers to a number used to distinguish each data copy, for example, 0 or 1. It is to be understood that the copy number may also take other forms such as a letter, which is not limited in the embodiment of the present invention.
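The placement described above can be sketched in a few lines. This is only an illustrative sketch: the function names `place_copies` and `surviving_copies`, and the rule of assigning copy k to the k-th listed fault domain, are assumptions introduced here and are not taken from the embodiment.

```python
def place_copies(num_copies, fault_domain_numbers):
    """Assign each copy to a distinct fault domain.

    Returns a dict mapping copy number -> fault domain number.
    Assumption: copy k simply goes to the k-th listed fault domain.
    """
    if num_copies > len(fault_domain_numbers):
        raise ValueError("need at least as many fault domains as copies")
    return {copy: fault_domain_numbers[copy] for copy in range(num_copies)}


def surviving_copies(placement, failed_domain):
    """Return the copy numbers that remain readable after one fault domain fails."""
    return [copy for copy, domain in placement.items() if domain != failed_domain]


# Two copies in a low-speed device layer with fault domains 0..3, as in FIG. 4.
placement = place_copies(2, [0, 1, 2, 3])
remaining = surviving_copies(placement, failed_domain=0)
```

Because the two copies never share a fault domain, the failure of fault domain 0 still leaves copy 1 available.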

With reference to FIG. 1, any one of the storage devices 110 can receive a data write request sent by one or more hosts 100 through the connection device 105, determine, according to a preset tiered storage policy, the device layer to which the data carried by the data write request is to be written (for example, the low-speed device layer), and determine, according to a preset data importance determination rule, the number of copies of the data that need to be stored. Each copy of the data is then written to a different fault domain in the low-speed device layer. Any one of the storage devices 110 may also receive a data read request sent by one or more hosts 100 through the connection device 105, and read data from an address carried by the data read request. In addition, any one of the storage devices 110 in the embodiments of the present invention may also support a tiered storage technology, which is described below in conjunction with FIG. 4. FIG. 4 is a schematic diagram of a fault domain distribution in a disk array according to an embodiment of the present invention.

The storage space of the disk array 125 can be divided into a plurality of data blocks (not shown in FIG. 4). A data block is a data unit whose access frequency is monitored as a whole and which is migrated as a whole. A data block may be located in one disk or in multiple disks. However, in the embodiment of the present invention, one data block must be located within a single fault domain. It can be understood that when a data block is located in multiple disks, the fault domain in which the data block is located is a RAID. The storage device 110 monitors the access frequency of each data block in the disk array 125. For example, when it is found that the access frequency of the data block where copy 0 is located is higher than a preset threshold, the data in the data block is determined to have become hot data. Therefore, the data block can be migrated to a fault domain of the high-speed device layer according to a preset association relationship, for example, to fault domain 0 of the high-speed device layer. It should be noted that the access frequency is counted in units of one data block, the capacity of a data block is usually larger than the size of one data copy, and the data block may contain other data in addition to copy 0. Therefore, the data block where copy 0 is located may have become hot because copy 0 is frequently accessed, or because data other than copy 0 in the data block is frequently accessed. When the data block where copy 0 is located becomes hot because copy 0 is frequently accessed, the data block where copy 1 is located will also become hot and needs to be migrated to a fault domain of the high-speed device layer; when the data block where copy 0 is located becomes hot because data other than copy 0 in the data block is frequently accessed, the data block in which copy 1 is located may not become hot and does not need to be migrated to the high-speed device layer. For the time being, only the case where copy 0 is migrated to fault domain 0 of the high-speed device layer is discussed.
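The per-block monitoring described above might look like the following minimal sketch. The class name `AccessMonitor`, the block identifiers, and the choice of treating a block as hot once its count reaches the threshold are all assumptions made for illustration; the embodiment does not prescribe a particular counting structure.

```python
from collections import Counter


class AccessMonitor:
    """Counts accesses per data block; the block, not the copy, is the unit."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.counts = Counter()

    def record_access(self, block_id):
        # Any access to data inside the block heats the whole block,
        # whether it touches copy 0 or unrelated data stored alongside it.
        self.counts[block_id] += 1

    def hot_blocks(self):
        """Blocks whose access count has reached the preset threshold."""
        return sorted(b for b, c in self.counts.items() if c >= self.threshold)

    def reset(self):
        """Start a new statistics period."""
        self.counts.clear()


monitor = AccessMonitor(threshold=3)
for _ in range(3):
    monitor.record_access("block_with_copy_0")
monitor.record_access("block_with_copy_1")
hot = monitor.hot_blocks()
```

In this run only the block holding copy 0 is reported as hot, which corresponds to the case discussed above in which the block holding copy 1 does not need to be migrated.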

The storage device 110 continues to monitor the access frequency of each data block in the disk array 125. When it is found that the data block that has been migrated to fault domain 0 of the high-speed device layer becomes cold again, the data block where copy 0 is located needs to be moved back to the low-speed device layer. In order to prevent copy 0 and copy 1 from being stored in the same fault domain, the fault domain of the low-speed device layer to which the data block where copy 0 is located is moved back is determined according to the preset association relationship.

Data migration method

The following describes the data migration method provided by the embodiment of the present invention. FIG. 5 is a flowchart of the data migration method provided by the embodiment of the present invention. The method can be applied to the storage system shown in FIG. 1 and the storage device shown in FIG. 2. The storage device includes at least a first hierarchical disk set and a second hierarchical disk set, the first hierarchical disk set includes a first level fault domain, and the second hierarchical disk set includes at least two second level fault domains, where a first level fault domain refers to a fault-isolated logical storage area in the first hierarchical disk set, and a second level fault domain refers to a fault-isolated logical storage area in the second hierarchical disk set; the storage device stores N pieces of target data, N is a positive integer greater than 1, N is smaller than the number of first level fault domains, and N is smaller than the number of second level fault domains. It should be noted that the first hierarchical disk set in this embodiment is the low-speed device layer shown in FIG. 3, and the second hierarchical disk set in this embodiment is the high-speed device layer shown in FIG. 3. Accordingly, a first level fault domain is a fault domain contained in the low-speed device layer, which may be an HDD, a RAID, or another fault-isolated logical storage area; a second level fault domain is a fault domain contained in the high-speed device layer, which may be an SSD, a RAID, or another fault-isolated logical storage area.

The method includes the following steps:

Step S201: The processor 118 counts the access frequency of the data block in the first hierarchical fault domain, where the data block includes the i-th target data, where i is an integer greater than or equal to zero, and i<N.

In this embodiment, a total of N pieces of target data, that is, the copies of the data described above, are stored in the storage device. The N pieces of target data need to be numbered, for example, the 0th target data (copy 0), the 1st target data (copy 1), the 2nd target data (copy 2), ..., the (N-1)-th target data (copy N-1). The i-th target data is stored in one data block in the first level fault domain. The processor 118 can periodically count the access frequency of each data block in the first level fault domain. When it is found that the access frequency of the data block where the i-th target data is located reaches a preset threshold, step S202 is performed. It should be noted that, in this embodiment, the counting of the access frequency of each data block may be triggered by a timer, may be triggered manually, or may be triggered in other manners.

Step S202: When the access frequency of the data block reaches a preset threshold, the processor 118 determines, in the second hierarchical disk set, a second level fault domain in which the data block is to be stored, where the determined second level fault domain does not hold any of the N pieces of target data.

When the access frequency of the data block counted by the processor 118 in step S201 reaches the preset threshold, the data included in the data block has become hot data, and in order to improve the access efficiency, the data block may be migrated to the second hierarchical disk set (for example, multiple SSDs).

Further, since the second hierarchical disk set contains a plurality of second level fault domains, the processor 118 needs to determine to which of the plurality of second level fault domains the data block is to be migrated.

One implementation is to check the second level fault domains in the second hierarchical disk set one by one to determine whether each holds target data. If a second level fault domain does not hold target data, it is determined as the second level fault domain in which the data block is to be stored; if it does, the next second level fault domain is checked, until a second level fault domain that does not hold the target data is found. It should be noted that, after the second level fault domain in which the data block is to be stored is determined, the correspondence between that second level fault domain and the first level fault domain from which the data block is migrated may be saved. When the access frequency of the data block falls below the preset threshold, the data block has become cold, and the data block may be moved back to the original first level fault domain according to the saved correspondence.
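The sequential search and the saved correspondence can be sketched as follows. The data structures are assumptions for illustration only: each second level fault domain is modelled as a set of identifiers of the data it holds, all copies of the target data share one identifier, and the names `find_free_domain`, `migrate_up`, and `migrate_back` are hypothetical.

```python
def find_free_domain(second_level_domains, target_id):
    """Check the domains in number order and return the first one that
    holds no copy of target_id, or None if every domain already holds one."""
    for number in sorted(second_level_domains):
        if target_id not in second_level_domains[number]:
            return number
    return None


def migrate_up(block_id, target_id, source_domain, second_level_domains, correspondence):
    """Move a hot block up and remember where it came from."""
    dest = find_free_domain(second_level_domains, target_id)
    if dest is None:
        return None  # no safe destination; leave the block where it is
    second_level_domains[dest].add(target_id)
    correspondence[block_id] = (source_domain, dest)
    return dest


def migrate_back(block_id, target_id, second_level_domains, correspondence):
    """Move a cooled block back to its original first level fault domain."""
    source_domain, dest = correspondence.pop(block_id)
    second_level_domains[dest].discard(target_id)
    return source_domain


# Second level fault domain 0 already holds one copy of "metadata".
ssd_domains = {0: {"metadata"}, 1: set(), 2: set()}
correspondence = {}
chosen = migrate_up("block_7", "metadata", 3, ssd_domains, correspondence)
restored = migrate_back("block_7", "metadata", ssd_domains, correspondence)
```

Here the block is placed in second level fault domain 1, because domain 0 already holds a copy, and it later returns to first level fault domain 3, the domain recorded in the correspondence.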

In another implementation, the second level fault domain in which the data block is to be stored is determined by using a preset association relationship. The preset association relationship is the relationship among the number of copies "N" of the target data stored in the storage device 110, the number "i" of the target data held in the data block, and the number of the second level fault domain. Specifically, the number of the second level fault domain is equal to the sum obtained by adding i to the product of N and a random number, where the random number does not exceed the quotient obtained by dividing the number of second level fault domains included in the storage device by N. The relationship is exemplified below:
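Written as a formula, the association relationship reads as follows. The symbols d (the second level fault domain number), r (the random number), and M (the number of second level fault domains) are introduced here only for illustration and do not appear in the embodiment.

```latex
d = N \cdot r + i, \qquad 0 \le i < N, \qquad r \le \left\lfloor \frac{M}{N} \right\rfloor
```

Since r is a non-negative integer, d mod N = i. Two data blocks that hold target data with different numbers i therefore can never be assigned the same fault domain number, whatever random number each of them draws.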

Assume N=3 and i=0, that is, three pieces of target data are saved in the storage device 110, numbered 0, 1, and 2 respectively, and the target data saved in the data block is the 0th target data. Then the number of the second level fault domain in which the data block is to be stored is equal to the sum obtained by adding 0 to the product of 3 and the random number. When the random number is 0, the number of the second level fault domain is 0; when the random number is 1, the number is 3; when the random number is 2, the number is 6, and so on. However, the random number cannot exceed the quotient obtained by dividing the number of second level fault domains by N. Assuming that the number of second level fault domains is 7, the quotient of 7 divided by 3 is 2, so the random number must be less than or equal to 2.

Therefore, the 0th target data can be migrated to the second level fault domain numbered 0, 3, or 6.
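The worked example above can be reproduced with a short sketch. Two points are assumptions rather than statements of the original text: the fault domains are taken to be numbered from 0 to one less than their count, so any computed number beyond that range is discarded, and the random number is allowed to start at 0, as in the example. The function names are hypothetical.

```python
import random
from typing import List


def candidate_domains(n_copies: int, i: int, domain_count: int) -> List[int]:
    """List the fault domain numbers N*r + i allowed for the i-th target data."""
    if n_copies <= 1 or not 0 <= i < n_copies:
        raise ValueError("expected N > 1 and 0 <= i < N")
    max_r = domain_count // n_copies  # quotient that bounds the random number
    numbers = [n_copies * r + i for r in range(max_r + 1)]
    # Assumption: domains are numbered 0 .. domain_count - 1; drop the rest.
    return [d for d in numbers if d < domain_count]


def pick_domain(n_copies: int, i: int, domain_count: int) -> int:
    """Choose one of the allowed fault domains at random."""
    return random.choice(candidate_domains(n_copies, i, domain_count))


if __name__ == "__main__":
    for copy_number in range(3):
        print(copy_number, candidate_domains(3, copy_number, 7))
```

Because every candidate number leaves remainder i when divided by N, the candidate sets for different values of i never overlap, which is why two copies of the same target data cannot land in the same fault domain under this relationship.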

Optionally, for this implementation, after the second level fault domain in which the data block is to be stored is determined, the correspondence between the second level fault domain and the first level fault domain from which the data block is migrated need not be saved. When the access frequency of the data block falls below the preset threshold (the data block becomes cold), the data block may be moved back to a first level fault domain of the first hierarchical disk set according to an association relationship similar to the foregoing one. In this case, the number of the first level fault domain is equal to the sum obtained by adding i to the product of N and a random number, where the random number does not exceed the quotient obtained by dividing the number of first level fault domains included in the storage device by N. For example, the number of the target data is unchanged and it is still the 0th target data, and the number of copies of the target data stored in the storage device 110 is still 3, that is, i=0 and N=3. The first level fault domain obtained according to the association relationship can then be numbered 0, 3, or 6 (assuming that the first hierarchical disk set also contains 7 first level fault domains). In other words, when the data block needs to be moved back to the first hierarchical disk set, it can be migrated to the first level fault domain numbered 0, 3, or 6. Alternatively, after the second level fault domain is determined, the correspondence between the second level fault domain and the first level fault domain from which the data block is migrated may be saved. When the access frequency of the data block falls below the preset threshold, the data block may be moved back to the original first level fault domain according to the saved correspondence.
It can be understood that, according to the saved correspondence, the data block can only be moved back to the original first level fault domain, which is a narrower range than the first level fault domains re-determined by the association relationship.
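The two ways of moving a cooled data block back can be contrasted in a minimal sketch. The function name, the block identifier and the dictionary of saved correspondences are hypothetical, and the range filter assumes that fault domains are numbered from 0.

```python
from typing import Dict, List, Optional


def move_back_candidates(block_id: str, i: int, n_copies: int,
                         first_level_count: int,
                         saved: Optional[Dict[str, int]] = None) -> List[int]:
    """Return the first level fault domains a cooled block may return to."""
    if saved is not None and block_id in saved:
        # A saved correspondence pins the block to its original domain.
        return [saved[block_id]]
    # Otherwise recompute N*r + i over the first level fault domains.
    max_r = first_level_count // n_copies
    numbers = [n_copies * r + i for r in range(max_r + 1)]
    return [d for d in numbers if d < first_level_count]
```

With a saved entry the result is a single domain; without one, the 0th target data with N=3 and 7 first level fault domains may return to domain 0, 3, or 6, matching the example in the text.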

Step S203: The processor 118 migrates the data block to the determined second level fault domain.

According to the example of step S202, the processor 118 may migrate the data block to a second level fault domain numbered 0 or 3 or 6.

When the fault domain is a disk, the second level fault domain numbered 0, 3, or 6 may be the disk numbered 0, 3, or 6, respectively; when the fault domain is a RAID, the second level fault domain numbered 0, 3, or 6 may be the RAID numbered 0, 3, or 6, respectively. In the latter case, migrating the data block to the determined second level fault domain may specifically be striping the data block and writing it to the RAID.

The embodiment of the present invention counts the access frequency of a data block in the first level fault domain, where the data block includes one of the plurality of target data. When the access frequency of the data block reaches the preset threshold, the data block is migrated to a second level fault domain that does not hold the target data. This avoids migrating multiple target data to the same fault domain when performing a tiered storage operation, and thus guarantees the reliability of the data.
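Steps S201 to S203 can be put together in a small sketch. It is an illustration only: the class name MigrationSketch, its attributes, and the use of a simple access counter as the "access frequency" are assumptions, and the domain selection shown is the sequential check from the first embodiment.

```python
from collections import Counter
from typing import Dict, Optional, Set


class MigrationSketch:
    """Toy model of steps S201-S203 (not the patented implementation)."""

    def __init__(self, threshold: int, second_level: Dict[int, Set[str]]):
        self.threshold = threshold
        self.second_level = second_level      # domain number -> target data held
        self.access_count: Counter = Counter()
        self.migrated: Dict[str, int] = {}    # block id -> second level domain

    def record_access(self, block_id: str, target_id: str) -> Optional[int]:
        """Count one access; return the domain number if the block migrates."""
        self.access_count[block_id] += 1                      # S201
        if self.access_count[block_id] < self.threshold:
            return None
        if block_id in self.migrated:
            return None                                       # already hot
        for number in sorted(self.second_level):              # S202
            if target_id not in self.second_level[number]:
                self.second_level[number].add(target_id)      # S203
                self.migrated[block_id] = number
                return number
        return None


ctrl = MigrationSketch(threshold=3, second_level={0: {"T"}, 1: set()})
results = [ctrl.record_access("block-17", "T") for _ in range(4)]
```

In this run the block migrates on the third access and lands in domain 1, because domain 0 already holds a copy of the same target data.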

In the above embodiment, the method may further include:

Step S204: The processor 118 receives the target data, determines, according to a preset rule, that the number of copies of the target data to be saved in the storage device is N, and copies the target data to generate N copies of the target data. The processor 118 can receive a data write request sent by the host 100, where the data write request carries the target data. The preset rule here refers to a correspondence between the importance of the target data and the number of copies of the target data to be saved. For example, if the target data is business data, its importance is low, and it may be saved in the storage device 110; if the target data is metadata of the business data, its importance is high, and the storage device 110 may hold 3 copies; and so on.

The storage device 110 may copy the target data to generate N pieces of target data.
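The preset rule of step S204 can be sketched as a lookup from the kind of data to the number of copies. Only the value 3 for metadata comes from the text; the value 2 for business data, the table name COPIES_BY_KIND and the function replicate are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Preset rule: importance class -> number of copies N.
# "metadata": 3 follows the text; "business": 2 is an assumed value.
COPIES_BY_KIND: Dict[str, int] = {"metadata": 3, "business": 2}


def replicate(target_data: bytes, kind: str) -> List[Tuple[int, bytes]]:
    """Return N numbered copies (i, data) of the target data, with 0 <= i < N."""
    n_copies = COPIES_BY_KIND[kind]
    return [(i, bytes(target_data)) for i in range(n_copies)]


copies = replicate(b"inode-table", "metadata")
```

The number i attached to each copy is the same i that later selects the permitted fault domains for the data block holding that copy.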

Device of embodiment of the invention

An embodiment of the present invention provides a data migration apparatus located in a controller of a storage device, where the storage device includes at least the controller, a first hierarchical disk set, and a second hierarchical disk set. The first hierarchical disk set includes a first level fault domain, and the second hierarchical disk set includes at least two second level fault domains. The storage device stores N pieces of target data, where N is a positive integer greater than 1, N is smaller than the number of first level fault domains, and N is smaller than the number of second level fault domains. As shown in FIG. 6, the data migration apparatus includes:

The statistics module 401 is configured to count the access frequency of a data block in the first level fault domain, where the data block includes the i-th target data, i is an integer greater than or equal to zero, and i<N;

a determining module 402, configured to determine, in the second hierarchical disk set, a second level fault domain in which the data block is to be stored when the access frequency of the data block reaches a preset threshold, where the determined second level fault domain does not hold the target data;

The migration module 403 is configured to migrate the data block to the determined second level fault domain.

Optionally, the determining module 402 is specifically configured to:

Determine the number of the second level fault domain according to i, N, and a preset association relationship, where the association relationship is that the number of the second level fault domain is equal to the sum obtained by adding i to the product of N and a random number, the random number is a positive integer greater than or equal to 1, and the random number does not exceed the quotient obtained by dividing the number of second level fault domains included in the storage device by N.

Optionally, the data migration device further includes:

a saving module 404, configured to save a correspondence between the first level fault domain and the determined second level fault domain;

Correspondingly, the migration module 403 is further configured to migrate the data block from the determined second level fault domain to the first level fault domain according to the correspondence.

Optionally, the data migration device further includes:

The receiving module 405 is configured to receive the target data, and determine, according to a preset rule, that the number of copies of the target data in the storage device is N;

The copying module 406 is configured to copy the target data to generate N pieces of the target data.

The device provided by the embodiment of the present invention may be configured in the controller described in the foregoing embodiment, and is used to perform the data storage method described in the foregoing embodiments. For a detailed description of the function of each module, refer to the description in the method embodiment. Details are not repeated here.

Those of ordinary skill in the art will appreciate that various aspects of the present invention, or possible implementations of various aspects, can be embodied as a system, method, or computer program product. Thus, aspects of the invention, or possible implementations of various aspects, may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, etc.), or an embodiment combining software and hardware aspects, all of which are collectively referred to herein as "circuits," "modules," or "systems." Furthermore, various aspects of the invention, or possible implementations of various aspects, may take the form of a computer program product, which refers to computer readable program code stored in a computer readable medium.

The computer readable medium can be a computer readable signal medium or a computer readable storage medium. Computer readable storage media include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing, such as random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, and portable read only memory (CD-ROM).

The processor in the computer reads the computer readable program code stored in the computer readable medium, such that the processor can perform the functional actions specified in each step, or combination of steps, in the flowchart, and so that a device is produced that implements the functional actions specified in each block, or combination of blocks, of the block diagram.

The computer readable program code can be executed entirely on the user's computer, partly on the user's computer, as a separate software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. It should also be noted that in some alternative implementations, the functions noted in the various steps of the flowchart, or in the blocks of the block diagrams, may not occur in the order noted. For example, depending on the functionality involved, two steps or two blocks shown in succession may actually be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order.

Claims

1. A data migration method, performed by a storage device, where the storage device includes at least a controller, a first hierarchical disk set, and a second hierarchical disk set, the first hierarchical disk set includes a first level fault domain, and the second hierarchical disk set includes at least two second level fault domains; the storage device stores N pieces of target data, N is a positive integer greater than 1, N is smaller than the number of first level fault domains, and N is smaller than the number of second level fault domains; the method includes:

The controller migrates the data block to the determined second level fault domain.

2. The method according to claim 1, wherein the controller determining, in the second hierarchical disk set, the second level fault domain in which the data block is to be stored comprises:

Determining, by the controller, the number of the second level fault domain according to i, N, and a preset association relationship, where the association relationship is that the number of the second level fault domain is equal to the sum obtained by adding i to the product of N and a random number, the random number is a positive integer greater than or equal to 1, and the random number does not exceed the quotient obtained by dividing the number of second level fault domains included in the storage device by N.

3. The method according to claim 1 or 2, further comprising:

When the access frequency of the data block is lower than the preset threshold, the controller migrates the data block from the determined second level fault domain to the first level fault domain according to the correspondence.

4. The method according to any one of claims 1 to 3, further comprising: the controller receiving the target data and determining, according to a preset rule, that the number of copies of the target data to be saved in the storage device is N;

The controller copies the target data to generate N copies of the target data.

5. A controller, comprising a processor and a communication interface, where the communication interface is configured to communicate with a disk array; the disk array comprises a first hierarchical disk set and a second hierarchical disk set, the first hierarchical disk set includes a first level fault domain, and the second hierarchical disk set includes at least two second level fault domains; the disk array holds N pieces of target data, N is a positive integer greater than 1, N is smaller than the number of first level fault domains, and N is smaller than the number of second level fault domains;

The processor migrates the data block to the determined second level fault domain.

6. The controller according to claim 5, wherein the processor is specifically configured to determine the number of the second level fault domain according to i, N, and a preset association relationship, where the association relationship is that the number of the second level fault domain is equal to the sum obtained by adding i to the product of N and a random number, the random number is a positive integer greater than or equal to 1, and the random number does not exceed the quotient obtained by dividing the number of second level fault domains included in the storage device by N.

7. A controller according to claim 5 or claim 6 wherein:

The processor is further configured to save a correspondence between the first level fault domain and the determined second level fault domain; when the access frequency of the data block is lower than the preset threshold, the processor is further configured to migrate the data block from the determined second level fault domain to the first level fault domain according to the correspondence.

8. The controller according to any one of claims 5-7, wherein the processor is further configured to receive the target data and determine, according to a preset rule, that the number of copies of the target data to be saved in the storage device is N; the controller copies the target data to generate N copies of the target data.

9. A data migration device, wherein the data migration device is located in a controller of a storage device, the storage device includes at least the controller, a first hierarchical disk set, and a second hierarchical disk set, the first hierarchical disk set includes a first level fault domain, and the second hierarchical disk set includes at least two second level fault domains; the storage device stores N pieces of target data, N is a positive integer greater than 1, N is smaller than the number of first level fault domains, and N is smaller than the number of second level fault domains; the data migration device includes:

a statistics module, configured to count the access frequency of a data block in the first level fault domain, where the data block includes the i-th target data, i is an integer greater than or equal to zero, and i<N; a determining module, configured to determine, in the second hierarchical disk set, a second level fault domain in which the data block is to be stored when the access frequency of the data block reaches a preset threshold, where the determined second level fault domain does not hold the target data;

10. The device according to claim 9, wherein the determining module is specifically configured to determine the number of the second level fault domain according to i, N, and a preset association relationship, where the association relationship is that the number of the second level fault domain is equal to the sum obtained by adding i to the product of N and a random number, the random number is a positive integer greater than or equal to 1, and the random number does not exceed the quotient obtained by dividing the number of second level fault domains included in the storage device by N.

11. The device according to claim 9, further comprising: a saving module, configured to save a correspondence between the first level fault domain and the determined second level fault domain;

12. Apparatus according to any of claims 9-11, further comprising:

a receiving module, configured to receive the target data, and determine, according to a preset rule, that the number of copies of the target data in the storage device is N;

And a copy module, configured to copy the target data to generate N pieces of the target data.