
CN110209350B - Dynamic scheduling method for application I/O (input/output) request in HPC (high performance computing) system of hybrid storage architecture - Google Patents


Info

Publication number
CN110209350B
CN110209350B (application CN201910386909.8A)
Authority
CN
China
Prior art keywords
application
hdd
requests
ssd
random
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910386909.8A
Other languages
Chinese (zh)
Other versions
CN110209350A (en)
Inventor
石宣化
金海
杨莹
姜焰
花昱圣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN201910386909.8A priority Critical patent/CN110209350B/en
Publication of CN110209350A publication Critical patent/CN110209350A/en
Application granted granted Critical
Publication of CN110209350B publication Critical patent/CN110209350B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0655Vertical data movement, i.e. input-output transfer; data movement between one or more hosts and one or more storage devices
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0683Plurality of storage devices
    • G06F3/0685Hybrid storage combining heterogeneous device types, e.g. hierarchical storage, hybrid arrays

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a dynamic scheduling method for application I/O requests in a hybrid storage architecture HPC system, belonging to the technical field of high-performance computing. The invention uses a degree of randomness to characterize an application's access pattern and dynamically schedules I/O requests: applications with a larger randomness degree are written to the SSD, which is insensitive to randomness, and applications with a smaller randomness degree to the HDD, which is sensitive to it, so that the HDD handles sequential-pattern requests and the SSD handles random-pattern requests as far as possible, reducing the I/O interference problem. Based on the observed phenomenon that the bandwidth distribution is relatively stable during I/O resource competition, a calculation method for bandwidth allocation is proposed, which can be used to predict the dynamic load of a storage device. By combining two parameters, the application's randomness degree and the bandwidth obtainable on each storage device, I/O requests are scheduled reasonably according to the applications' access patterns and the load characteristics of the different storage devices, guaranteeing that each application finishes before its run-end time while improving system performance, thereby guaranteeing the applications' quality of service.

Description

Dynamic scheduling method for application I/O (input/output) request in HPC (high performance computing) system of hybrid storage architecture
Technical Field
The invention belongs to the technical field of high-performance computing, and particularly relates to a dynamic scheduling method for application I/O (input/output) requests in an HPC (high performance computing) system with a hybrid storage architecture.
Background
Multi-tenant, multi-load scenarios are increasingly common in cloud HPC (High Performance Computing) systems, which means the storage resources in such a system are shared among more and more different users. Under limited physical resources, it is therefore important to provide quality-of-service guarantees for applications and tenants. A quality-of-service guarantee means that an application's performance is kept within an agreed threshold range, and that one application's service performance is not affected no matter how the access request arrival rate, the randomness or sequentiality of requests, or the read/write ratio of other applications changes. The scale of computing also keeps growing, requiring more storage resources, while the limited storage resources must serve more applications. When multiple applications access the storage service simultaneously, they compete for I/O resources, which severely reduces the aggregate I/O bandwidth. I/O requests with different access patterns are mixed together; the current primary storage medium, the hard disk drive (HDD), can efficiently handle sequential-pattern requests, but when handling random-pattern requests it suffers I/O interference from the severe head-seek overhead caused by frequent positioning.
To address the severe performance degradation under random requests, many solutions have been proposed, among which adopting a new storage medium is a good choice. A solid state drive (SSD) offers excellent performance, is insensitive to random requests, and costs far less per GB than DRAM used as resident storage, so it has become the most common high-speed storage device. In HPC systems, however, completely replacing HDDs with SSDs is impractical given the enormous storage requirements and the relatively high cost of SSDs. A hybrid storage architecture combining SSDs and HDDs is therefore the better choice.
However, in current hybrid storage architectures the scheduling of application I/O is still primitive: I/O interference still occurs when multiple applications compete for I/O resources, so application quality of service cannot be guaranteed and storage system performance remains low.
Disclosure of Invention
Aiming at the defects of the prior art, the invention aims to solve the technical problems that the service quality of the I/O scheduling application of the hybrid storage architecture in the prior art cannot be guaranteed and the performance of a storage system is low.
In order to achieve the above object, in a first aspect, an embodiment of the present invention provides a method for dynamically scheduling application I/O requests in a hybrid storage architecture HPC system, where the method includes the following steps:
s1, grouping all I/O requests reaching a file system layer of an HPC system, wherein the requests from the same application are grouped into the same group;
s2, judging whether N is equal to 1, if so, directly writing the request corresponding to the application into the HDD; otherwise, judge 1<N≤N0If so, calculating the random degree of each application and sequencing according to the random degree, writing the request corresponding to the application with the minimum random degree into the HDD, and writing the requests corresponding to other applications into the SSD; otherwise, calculating the random degree of each application and sequencing according to the random degree, writing the requests corresponding to the M applications with larger random degree into the SSD, and writing the requests corresponding to the rest N-M applications into the HDD;
s3, if a new application is added to operate, calculating and comparing B of the applicationi-SAnd Bi-HDIf B isi-SSD≤Bi-HDDWriting the request corresponding to the application into HDD, if Bi-SSD>Bi-HDDFurther comparing the application with the maximum random degree running in the HDD, writing the request corresponding to the larger random degree in the two applications into the SSD, and writing the request corresponding to the smaller random degree in the two applications into the HDD;
wherein B_i-SSD and B_i-HDD are respectively the bandwidths an application can expect to be allocated when running on the SSD and on the HDD, M is the critical value at which B_i-SSD first becomes less than B_i-HDD, N is the number of parallel applications, and N0 is a preset threshold on the number of parallel applications.
Specifically, the calculation method of the degree of randomness applied is as follows: ordering the offsets of the I/O requests; if the distance between the two requests after the I/O requests are subjected to offset sorting is equal to the size of the requests, the two requests are considered to be continuous, and the value of the random factor is 0, otherwise, the two requests are considered to be random requests, and the value of the random factor is 1; the degree of randomness R is calculated using the following formula:
R = (Σ_{k=1}^{K} R_Factor_k) / K

wherein R_Factor_k is the random factor of the k-th distance and K is the total number of distances among all the requests corresponding to the application.
In particular, the preset threshold N0 on the number of parallel applications takes values in the range [3, 10].
In particular, the bandwidth B_i that a concurrent application can expect to be allocated is calculated as:

B_i = (P_i · Q_i / Σ_{j=1}^{n} P_j · Q_j) · B

wherein B_i is the bandwidth the i-th application running on the storage device can expect to be allocated, B is the aggregate bandwidth of the storage device, P_i is the scaling factor of the i-th application's request size, Q_i is the i-th application's request size, and n is the number of parallel applications.
In particular, the aggregate bandwidth B_HDD under the HDD is calculated as:

B_HDD = B_peak · (1 - R_avg)

wherein B_peak is the peak bandwidth of the HDD without I/O interference and R_avg is the average randomness degree R of the applications running on the HDD.
In particular, the aggregate bandwidth B_SSD under the SSD is the peak bandwidth of the solid state disk.
In a second aspect, an embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method for dynamically scheduling application I/O requests in the hybrid storage architecture HPC system according to the first aspect.
Generally, compared with the prior art, the above technical solution conceived by the present invention has the following beneficial effects:
1. The invention uses a degree of randomness to characterize an application's access pattern, making it convenient to schedule I/O requests dynamically in combination with the characteristics of the storage devices. Applications with a larger randomness degree are written to the SSD, which is insensitive to randomness, and applications with a smaller randomness degree to the HDD, which is sensitive to it; the HDD handles sequential-pattern requests and the SSD handles random-pattern requests as far as possible, alleviating the I/O interference problem.
2. Based on the observed phenomenon that the bandwidth distribution is relatively stable during I/O resource competition, the invention provides a calculation method for bandwidth allocation, which can predict the dynamic load of a storage device, sense the load dynamically, and help select the appropriate storage device for scheduling an I/O request.
3. The invention combines two parameters, the application's randomness degree and the bandwidth obtainable on each storage device, and schedules I/O requests reasonably according to the applications' access patterns and the load characteristics of the different storage devices, guaranteeing that each application finishes before its run-end time while improving system performance, thereby guaranteeing the applications' quality of service.
Drawings
FIG. 1 is a flowchart of a method for dynamically scheduling application I/O requests in a hybrid storage architecture HPC system according to an embodiment of the present invention;
fig. 2 is a schematic diagram of dynamic scheduling according to an embodiment of the present invention;
fig. 3 is a schematic diagram of dynamic scheduling according to a second embodiment of the present invention;
fig. 4 is a schematic diagram of dynamic scheduling provided in the third embodiment of the present invention;
FIG. 5(a) is a schematic diagram of the experimental results of IOR1 provided by an embodiment of the present invention;
fig. 5(b) is a schematic diagram of an experimental result of the IOR2 provided in the embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, a method for dynamically scheduling application I/O requests in a hybrid storage architecture HPC system includes the following steps:
s1, grouping all I/O requests reaching a file system layer of an HPC system, wherein the requests from the same application are grouped into the same group;
s2, judging whether N is equal to 1, if so, directly writing the request corresponding to the application into the HDD; otherwise, judge 1<N≤N0If so, calculating the random degree of each application and sequencing according to the random degree, writing the request corresponding to the application with the minimum random degree into the HDD, and writing the requests corresponding to other applications into the SSD; otherwise, calculating the random degree of each application and sequencing according to the random degree, writing the requests corresponding to the M applications with larger random degree into the SSD, and writing the requests corresponding to the rest N-M applications into the HDD;
s3, if a new application is added to operate, calculating and comparing B of the applicationi-SSDAnd Bi-HDIf B isi-SSD≤Bi-HDWriting the request corresponding to the application into HDD, if Bi-SSD>Bi-HDDFurther comparing the application with the maximum random degree running in the HDD, writing the request corresponding to the larger random degree in the two applications into the SSD, and writing the request corresponding to the smaller random degree in the two applications into the HDD;
wherein, Bi-SSD、Bi-HDDRespectively allocating the obtained bandwidth for the application running on the SSD and the obtained bandwidth for the application running on the HDD, wherein M is Bi-SSDIs initially less than Bi-HN is the number of applications in parallel, N0A threshold is preset for the number of parallel applications.
Step S1. group all I/O requests that reach the file system layer of the HPC system, with requests from the same application grouped into the same group.
To ensure that requests from the same application are written to the same storage device, all I/O requests arriving at the file system layer of the HPC system are grouped, with requests from the same application grouped into the same group.
Step S2, judging whether N is equal to 1; if so, the requests of that application are written directly into the HDD. Otherwise, judging whether 1 < N ≤ N0; if so, the randomness degree of each application is calculated and the applications are sorted by it, the requests of the application with the minimum randomness degree are written into the HDD, and the requests of the other applications into the SSD. Otherwise (N > N0), the randomness degrees are calculated and sorted, the requests of the M applications with the largest randomness degrees are written into the SSD, and the requests of the remaining N-M applications into the HDD.
An HDD is very sensitive to I/O requests with a random access pattern, which cause frequent head seeks and degrade performance. An SSD is insensitive to random patterns and handles such requests efficiently. Therefore the application's access pattern is characterized by a degree of randomness, so that the appropriate storage device can be selected for writing each group of requests.
The dynamic scheduling of I/O requests is divided into three cases according to the number N of parallel applications at the initial time.
Case 1: and when N is 1, directly writing the request corresponding to the application into the HDD.
Case 2: 1 < N ≤ N0. The randomness degree of each application is calculated and the applications are sorted by it; the requests of the application with the minimum randomness degree are written into the HDD, and the requests of the other applications into the SSD.
Case 3: n is a radical of>N0Calculating the random degree of each application and sequencing according to the random degree, writing the requests corresponding to M applications with larger random degree into the SSD, and writing the requests corresponding to the rest N-M applications into the HDD, wherein M is Bi-SSDIs initially less than Bi-HDDCritical value of (A), Bi-SSD、Bi-HDDThe obtained bandwidth may be allocated for the application running on the SSD and the obtained bandwidth may be allocated for the application running on the HDD, respectively.
If M applications are running in SSD, N-M applications are running in HDD, Bi-SSDJust beginning to be less than Bi-HDDM is a threshold value. N is a radical of0Has a value range of [3,10 ]]And preferably 5.
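To make the three cases concrete, the following is a minimal Python sketch of the initial placement logic of step S2. The helpers `bw_ssd` and `bw_hdd`, which return the per-application bandwidths B_i-SSD and B_i-HDD for a given device population, are hypothetical stand-ins for the patent's bandwidth model; the function and parameter names are not from the patent.

```python
def initial_placement(apps, n0, bw_ssd, bw_hdd):
    """Initial HDD/SSD placement of step S2 (a sketch).

    apps:   list of (app_id, randomness_degree) pairs.
    n0:     the preset threshold N0 on the number of parallel apps.
    bw_ssd: bw_ssd(m) -> per-app bandwidth B_i-SSD with m apps on the SSD.
    bw_hdd: bw_hdd(k) -> per-app bandwidth B_i-HDD with k apps on the HDD.
    Returns a dict mapping app_id -> "HDD" or "SSD".
    """
    n = len(apps)
    if n == 1:                                 # case 1: a single application
        return {apps[0][0]: "HDD"}
    ranked = sorted(apps, key=lambda a: a[1])  # ascending randomness degree
    if n <= n0:                                # case 2: 1 < N <= N0
        placement = {ranked[0][0]: "HDD"}      # least random -> HDD
        placement.update({a: "SSD" for a, _ in ranked[1:]})
        return placement
    # Case 3: N > N0. M is the first SSD population at which the per-app
    # SSD bandwidth drops below the per-app HDD bandwidth.
    m = next((m for m in range(1, n) if bw_ssd(m) < bw_hdd(n - m)), n - 1)
    placement = {a: "SSD" for a, _ in ranked[-m:]}    # M most random -> SSD
    placement.update({a: "HDD" for a, _ in ranked[:-m]})
    return placement
```

For example, with six applications, N0 = 3, and bandwidth curves that cross at M = 5, the five most random applications are placed on the SSD and the least random one on the HDD.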
The applied randomness calculation method is as follows: first, the offsets of the I/O requests are ordered. The Random Factor (Random Factor) depends on the distance between two ordered requests for the I/O request offset. If the distance is equal to the request size, the two requests are considered to be consecutive, and the random factor takes a value of 0. Otherwise, the two requests are considered to be random requests, and the random factor is 1. Finally, the degree of randomness can be calculated using the following formula.
R = (Σ_{k=1}^{K} R_Factor_k) / K

Wherein R_Factor_k is the random factor of the k-th distance and K is the total number of distances among all the requests corresponding to the application.
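The randomness-degree computation can be sketched in Python as follows. The request size is assumed uniform across the application's requests, which the distance-equals-size test implicitly presumes; the function name is illustrative, not from the patent.

```python
def randomness_degree(offsets, request_size):
    """Degree of randomness R of one application's I/O requests (a sketch).

    offsets:      byte offsets of the application's requests, in any order.
    request_size: request size in bytes (assumed uniform here).
    """
    ordered = sorted(offsets)
    # One random factor per distance between consecutive sorted requests:
    # 0 if the distance equals the request size (sequential), else 1 (random).
    factors = [0 if (b - a) == request_size else 1
               for a, b in zip(ordered, ordered[1:])]
    k = len(factors)  # K: total number of distances
    return sum(factors) / k if k else 0.0
```

Four perfectly sequential 64-byte requests give R = 0; any gaps between sorted offsets push R toward 1.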
S3, if a new application joins, its B_i-SSD and B_i-HDD are calculated and compared. If B_i-SSD ≤ B_i-HDD, the application's requests are written into the HDD; if B_i-SSD > B_i-HDD, it is further compared with the application of maximum randomness degree running on the HDD, the requests of the more random of the two applications are written into the SSD, and the requests of the less random one into the HDD.
A large number of experiments were run, combining applications with different access patterns and request sizes on different storage devices, and the results were analyzed. Observations of the bandwidth distribution when multiple applications run concurrently show that, as I/O requests pass through the file system layer, the service they receive follows a relatively fixed distribution, with the bandwidth allocated in a nearly equally divided, regular way. This means that the processing capacity of a storage device in a competitive environment, i.e. how many specific I/O requests it processes in a given duration, can be estimated.
Taking advantage of this phenomenon, the invention proposes a method for calculating the bandwidth B_i that a concurrent application can be allocated, used to characterize the load condition of the storage device during operation:

B_i = (P_i · Q_i / Σ_{j=1}^{n} P_j · Q_j) · B

wherein B_i is the bandwidth the i-th application running on the storage device can expect to be allocated, B is the aggregate bandwidth of the storage device, P_i is the scaling factor of the i-th application's request size, Q_i is the i-th application's request size, and n is the number of parallel applications. According to the nearly equal division of bandwidth, P_i takes the value 1. Q_i takes values such as 64K, 128K, or 256K.
For example, if two applications run simultaneously, one with 64K requests and one with 128K requests, then the 64K application is allocated the bandwidth B_1 = 64K/(64K + 128K) · B = (1/3)B, and the 128K application is allocated B_2 = 128K/(64K + 128K) · B = (2/3)B.
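Under the P_i = 1 assumption, the allocation formula and the 64K/128K example can be sketched as follows; names are illustrative.

```python
def allocated_bandwidth(request_sizes, aggregate_bw, scale_factors=None):
    """Per-application bandwidth B_i under competition (a sketch).

    Implements B_i = (P_i * Q_i / sum_j P_j * Q_j) * B; P_i defaults to 1,
    matching the nearly equal division observed in the experiments.
    """
    if scale_factors is None:
        scale_factors = [1.0] * len(request_sizes)
    weights = [p * q for p, q in zip(scale_factors, request_sizes)]
    total = sum(weights)
    return [aggregate_bw * w / total for w in weights]
```

With request sizes 64K and 128K this reproduces the (1/3)B and (2/3)B split of the worked example.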
According to experiments, the I/O bandwidth on the HDD is approximately linear in the randomness degree of the application's requests, decreasing gradually as the randomness degree increases. Therefore, the aggregate bandwidth B_HDD obtained by the applications running on the mechanical disk under the hybrid storage architecture can be calculated from the peak bandwidth of the HDD without I/O interference and the randomness degrees of the applications' requests, and from it the bandwidth B_i-HDD that an application running on the HDD at the same time can expect to be allocated.

The HDD aggregate bandwidth is calculated as follows:

B_HDD = B_peak · (1 - R_avg)

wherein B_peak is the peak bandwidth of the HDD without I/O interference and R_avg is the average randomness degree of the applications running on the HDD.
Because the solid state disk is insensitive to random requests (its access latency is almost zero), the aggregate bandwidth B_SSD of the solid state disk under the hybrid storage architecture is taken to be its peak bandwidth, from which the bandwidth B_i-SSD that an application running on the SSD at the same time can expect to be allocated can be further calculated.
As shown in FIG. 2, in the first embodiment only one application runs alone in the hybrid storage architecture HPC system, with no other applications running simultaneously. The application suffers no I/O interference, and writing to the HDD is enough to meet its run-end time, so when a single application runs its requests are written directly into the HDD.
As shown in fig. 3, in the second embodiment, two applications run simultaneously, the randomness of the two applications is first calculated, and then the randomness of the two applications is compared. The application with the smaller degree of randomness is written to the HDD, and the other is written to the SSD.
In the third embodiment, N applications start to run simultaneously. First, the bandwidths B_i-SSD and B_i-HDD that an application can expect to be allocated when running on the SSD and on the HDD are calculated. Then the threshold M is found: if M applications run on the SSD and N-M on the HDD, B_i-SSD just begins to be less than B_i-HDD. Next, the randomness degrees of the N applications are calculated and sorted; the M most random applications run on the SSD and the remaining N-M on the HDD. As shown in FIG. 4, while the original N applications are running, new applications keep joining. The bandwidths B_i-SSD and B_i-HDD that the (N+1)-th application could be allocated on the SSD and on the HDD are calculated. If B_i-SSD is less than B_i-HDD, it is not suitable to run the (N+1)-th application on the SSD: choosing the SSD would not only fail to obtain considerable bandwidth but would also drag down the applications already running there, so the (N+1)-th application is written directly into the HDD. If B_i-SSD is greater than B_i-HDD, the randomness degree of the (N+1)-th application is calculated and compared with the maximum randomness degree among the applications on the HDD. If the randomness degree of the (N+1)-th application is less than or equal to that maximum, the most random application originally running on the HDD is rescheduled to the SSD and the (N+1)-th application is written into the HDD; otherwise, the (N+1)-th application is written into the SSD.
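The arrival-time decision of step S3 can be sketched as follows. The inputs `b_ssd` and `b_hdd` are the precomputed bandwidths B_i-SSD and B_i-HDD for the newcomer; all names here are illustrative, not from the patent.

```python
def place_new_app(new_app, hdd_apps, b_ssd, b_hdd):
    """Step S3 placement of a newly arriving application (a sketch).

    new_app:      (app_id, randomness_degree) of the newcomer.
    hdd_apps:     dict app_id -> randomness_degree of apps now on the HDD.
    b_ssd, b_hdd: bandwidths B_i-SSD, B_i-HDD the newcomer would get.
    Returns (target_device, app_migrated_to_ssd_or_None).
    """
    _, r_new = new_app
    if b_ssd <= b_hdd:
        return "HDD", None          # SSD offers no bandwidth advantage
    # SSD offers more bandwidth: compare with the most random HDD app.
    worst = max(hdd_apps, key=hdd_apps.get)
    if r_new > hdd_apps[worst]:
        return "SSD", None          # newcomer is the more random of the two
    # The most random HDD app migrates to the SSD; newcomer takes the HDD.
    return "HDD", worst
```

The more random of the newcomer and the worst HDD resident always ends up on the SSD, so the HDD keeps the most sequential workload mix.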
In the cloud HPC environment, limited physical resources are shared by more and more users. The invention does not blindly pursue the highest performance; instead, like on-demand service in cloud computing, it allocates I/O resources according to application requirements to guarantee quality of service. The invention defines an application's requirement using its run-end time, the running time of the application when it runs alone without interference. Each application has a run-end time, and when execution exceeds it the application's service quality is hard to guarantee; in the best case every application finishes before its own run-end time. The run-end time and the throughput of an application thus serve as criteria for judging whether service quality is guaranteed. Therefore, the present invention proposes a dynamic scheduling strategy that exploits the characteristics of different storage devices to ensure each application meets its run-end time, improving bandwidth utilization and providing high I/O performance.
The cluster used for the verification experiments has 10 nodes, of which 8 serve as compute nodes and 2 as I/O nodes. Each compute node has 64 GB of memory and is configured with a 300 GB mechanical disk; each I/O node has 8 GB of memory and is configured with a 300 GB solid state disk. The nodes are connected by an InfiniBand network, and the remaining configuration is identical.
Comparing FIG. 5(a) and FIG. 5(b), it can be seen that when the time interval is 0 s, the performance of IOR1 under DDL-QoS is improved 2.5 times and that of IOR2 3.79 times, because at a time interval of 0 the I/O interference between the two IOR instances is largest, causing severe performance degradation.
In the experiments of the present invention, when the time interval is 36 seconds the performance of IOR1 under OrangeFS reaches its maximum, the same as under DDL-QoS, because IOR1 then runs entirely free of I/O interference. For IOR2, the larger the time interval, the smaller the I/O interference between IOR1 and IOR2, so the performance of IOR2 under OrangeFS improves as the interval increases. Under DDL-QoS, however, the performance of IOR2 decreases as the interval increases, because writes are redirected to reduce the extent of I/O interference, and as the interfering portion shrinks, the amount of data written to the SSD instead of the HDD eventually decreases.
The above description is only for the preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (6)

1. A method for dynamically scheduling application I/O requests in a hybrid storage architecture HPC (high performance computing) system, the method comprising the steps of:
s1, grouping all I/O requests reaching a file system layer of an HPC system, wherein the requests from the same application are grouped into the same group;
s2, judging whether N is equal to 1, if so, directly writing the request corresponding to the application into the HDD; otherwise, judging that N is more than 1 and less than or equal to N0If so, calculating the random degree of each application and sequencing according to the random degree, writing the request corresponding to the application with the minimum random degree into the HDD, and writing the requests corresponding to other applications into the SSD; otherwise, calculating the random degree of each application and sequencing according to the random degree, writing the requests corresponding to the M applications with larger random degree into the SSD, and writing the requests corresponding to the rest N-M applications into the HDD;
s3, if a new application is added to operate, calculating and comparing B of the applicationi-SSDAnd Bi-HDDIf B isi-SSD≤Bi-HDDWriting the request corresponding to the application into HDD, if Bi-SSD>Bi-HDDFurther comparing the application with the maximum random degree running in the HDD, writing the request corresponding to the larger random degree in the two applications into the SSD, and writing the request corresponding to the smaller random degree in the two applications into the HDD;
wherein B_i-SSD and B_i-HDD are the bandwidths an application may be allocated when running on the SSD and on the HDD, respectively; M is the number of applications whose B_i-SSD is initially less than B_i-HDD; N is the number of parallel applications; and N_0 is the preset threshold on the number of parallel applications;
the randomness degree of an application is calculated as follows: sort the application's I/O requests by offset; if the distance between two adjacent requests after the offset sorting equals the request size, the two requests are considered continuous and the random factor takes the value 0; otherwise, they are considered random requests and the random factor takes the value 1; the randomness degree R is calculated by the following formula:

R = (R_Factor_1 + R_Factor_2 + ... + R_Factor_K) / K

wherein R_Factor_k is the random factor of the k-th inter-request distance, and K is the total number of such distances among all requests corresponding to the application.
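Steps S1-S2 and the randomness-degree formula above can be sketched as follows. This is an illustrative reconstruction only; every function and variable name here (randomness_degree, place_applications, n0, m) is an assumption, not part of the patent:

```python
def randomness_degree(requests):
    """Randomness degree R of one application's I/O requests, given as
    (offset, size) pairs: mean random factor over adjacent offset-sorted pairs."""
    ordered = sorted(requests, key=lambda r: r[0])          # sort by offset
    factors = [0 if b[0] - a[0] == a[1] else 1              # contiguous -> 0, random -> 1
               for a, b in zip(ordered, ordered[1:])]
    return sum(factors) / len(factors) if factors else 0.0

def place_applications(groups, n0, m):
    """Step S2: map each application to 'HDD' or 'SSD'.
    groups: dict app_id -> list of (offset, size) requests;
    n0: preset threshold N_0; m: number of applications routed
    to the SSD when N > n0 (the M of the claim)."""
    n = len(groups)
    if n == 1:
        return {app: "HDD" for app in groups}               # single app goes to the HDD
    ranked = sorted(groups, key=lambda app: randomness_degree(groups[app]))
    if n <= n0:
        # least random application to the HDD, the rest to the SSD
        return {app: ("HDD" if i == 0 else "SSD") for i, app in enumerate(ranked)}
    # m most random applications to the SSD, remaining n - m to the HDD
    return {app: ("SSD" if i >= n - m else "HDD") for i, app in enumerate(ranked)}
```

A sequential workload such as offsets 0, 4, 8 with 4-byte requests yields R = 0, while widely scattered offsets yield R = 1, matching the continuous/random factor definition in the claim.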
2. The dynamic scheduling method of claim 1, wherein the preset threshold N_0 on the number of parallel applications takes a value in the range [3, 10].
3. The dynamic scheduling method of any one of claims 1 to 2, wherein the bandwidth B_i that a concurrent application may be allocated is calculated by the following formula:
B_i = B · (P_i · Q_i) / (P_1 · Q_1 + P_2 · Q_2 + ... + P_n · Q_n)
wherein B_i represents the bandwidth the i-th application may be allocated when running on the storage device, B represents the aggregate bandwidth of that storage device, P_i represents a scaling factor for the size of the i-th application's requests, Q_i represents the request size of the i-th application, and n represents the number of parallel applications.
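As a sketch only: assuming the per-application bandwidth of claim 3 is the P_i·Q_i-weighted share of the aggregate bandwidth B (a reconstruction from the variable definitions, since the source shows the formula only as an image), the allocation could be computed as:

```python
def allocated_bandwidths(B, p, q):
    """Per-application bandwidth shares B_i under aggregate bandwidth B,
    assuming B_i is the P_i*Q_i-weighted fraction of B.
    p: scaling factors P_i; q: request sizes Q_i (same length n)."""
    weights = [pi * qi for pi, qi in zip(p, q)]   # P_i * Q_i per application
    total = sum(weights)                          # sum over all n applications
    return [B * w / total for w in weights]
```

By construction the shares sum to B, so the allocation never exceeds the device's aggregate bandwidth.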
4. The dynamic scheduling method of claim 3, wherein the aggregate bandwidth B_HDD under the HDD is calculated by the following formula:
Figure FDA0002411084090000023
wherein B_peak is the peak bandwidth of the HDD without I/O interference, and the quantity shown in Figure FDA0002411084090000024 is the average of the randomness degree R over the applications running on the HDD.
5. The dynamic scheduling method of claim 3, wherein the aggregate bandwidth B under the SSD, B_SSD, is the peak bandwidth of the solid-state disk.
6. A computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing the method for dynamically scheduling application I/O requests in an HPC system with a hybrid storage architecture according to any one of claims 1 to 5.
CN201910386909.8A 2019-05-10 2019-05-10 Dynamic scheduling method for application I/O (input/output) request in HPC (high performance computing) system of hybrid storage architecture Active CN110209350B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910386909.8A CN110209350B (en) 2019-05-10 2019-05-10 Dynamic scheduling method for application I/O (input/output) request in HPC (high performance computing) system of hybrid storage architecture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910386909.8A CN110209350B (en) 2019-05-10 2019-05-10 Dynamic scheduling method for application I/O (input/output) request in HPC (high performance computing) system of hybrid storage architecture

Publications (2)

Publication Number Publication Date
CN110209350A CN110209350A (en) 2019-09-06
CN110209350B true CN110209350B (en) 2020-07-10

Family

ID=67785638

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910386909.8A Active CN110209350B (en) 2019-05-10 2019-05-10 Dynamic scheduling method for application I/O (input/output) request in HPC (high performance computing) system of hybrid storage architecture

Country Status (1)

Country Link
CN (1) CN110209350B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111679813B (en) * 2020-08-11 2020-11-06 南京云联数科科技有限公司 Method for information processing, electronic device, and storage medium
CN114691698B (en) * 2022-04-24 2022-11-08 山西中汇数智科技有限公司 Data processing system and method for computer system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104216988A (en) * 2014-09-04 2014-12-17 天津大学 SSD (Solid State Disk) and HDD (Hard Disk Drive) hybrid storage method for distributed big data
CN106469193A (en) * 2016-08-30 2017-03-01 北京航空航天大学 Multi load metadata I/O service quality performance support method and system
CN107193487A (en) * 2017-04-26 2017-09-22 华中科技大学 A kind of random access recognition methods and system
CN107844269A (en) * 2017-10-17 2018-03-27 华中科技大学 A kind of layering mixing storage system and method based on uniformity Hash
CN109697033A (en) * 2018-12-19 2019-04-30 中国人民解放军国防科技大学 Tile record disk sensing storage caching method and system

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130070178A (en) * 2011-12-19 2013-06-27 한국전자통신연구원 Hybrid storage device and operating method thereof
CN104679661B (en) * 2013-11-27 2019-12-10 阿里巴巴集团控股有限公司 hybrid storage control method and hybrid storage system
CN104376094A (en) * 2014-11-24 2015-02-25 浪潮电子信息产业股份有限公司 File hierarchical storage method and system considering access randomness
US10552329B2 (en) * 2014-12-23 2020-02-04 Prophetstor Data Services, Inc. SSD caching system for hybrid storage
US9575664B2 (en) * 2015-04-08 2017-02-21 Prophetstor Data Services, Inc. Workload-aware I/O scheduler in software-defined hybrid storage system
US9823875B2 (en) * 2015-08-31 2017-11-21 LinkedIn Corporation Transparent hybrid data storage
CN107015763A (en) * 2017-03-03 2017-08-04 北京中存超为科技有限公司 Mix SSD management methods and device in storage system
CN109144411A (en) * 2018-07-24 2019-01-04 中国电子科技集团公司第三十八研究所 Data center's hybrid magnetic disc array and its data dynamic migration strategy
CN109240611A (en) * 2018-08-28 2019-01-18 郑州云海信息技术有限公司 The cold and hot data hierarchy method of small documents, small documents data access method and its device


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Ming Li et al.; "SSDUP: An Efficient SSD Write Buffer Using Pipeline"; 2016 IEEE International Conference on Cluster Computing; 2016-12-08; pp. 166-167 *

Also Published As

Publication number Publication date
CN110209350A (en) 2019-09-06

Similar Documents

Publication Publication Date Title
US9652379B1 (en) System and method for reducing contentions in solid-state memory access
CN104090847B (en) Address distribution method of solid-state storage device
EP2564321B1 (en) Memory usage scanning
US8799913B2 (en) Computing system, method and computer-readable medium for managing a processing of tasks
KR101553649B1 (en) Multicore apparatus and job scheduling method thereof
US20130212594A1 (en) Method of optimizing performance of hierarchical multi-core processor and multi-core processor system for performing the method
US8924754B2 (en) Quality of service targets in multicore processors
US10884667B2 (en) Storage controller and IO request processing method
CN113342615B (en) Command monitoring method, device, controller, system, equipment and storage medium
KR101356033B1 (en) Hybrid Main Memory System and Task Scheduling Method therefor
US20110161965A1 (en) Job allocation method and apparatus for a multi-core processor
CN110209350B (en) Dynamic scheduling method for application I/O (input/output) request in HPC (high performance computing) system of hybrid storage architecture
US10965610B1 (en) Systems and methods for allocating shared resources in multi-tenant environments
US20120036512A1 (en) Enhanced shortest-job-first memory request scheduling
WO2016008338A1 (en) I/o request processing method and storage system
CN108932112B (en) Data read-write method, device, equipment and medium for solid particles
EP3440547A1 (en) QoS-CLASS BASED SERVICING OF REQUESTS FOR A SHARED RESOURCE
US8607245B2 (en) Dynamic processor-set management
EP3550421B1 (en) Storage controller and io request processing method
CN113467926B (en) Packet processing method and system for storage server with storage device
CN114924848A (en) IO (input/output) scheduling method, device and equipment
KR102014725B1 (en) Manycore based core partitioning apparatus and method, storage media storing the same
CN104063327B (en) A kind of storage method and read-write storage device applied to wireless telecommunications
US20150186053A1 (en) Method and apparatus for on-the-fly learning traffic control scheme
CN112214305A (en) Method for distributing I/O forwarding nodes for operation under I/O forwarding architecture

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant