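WO2024212614A1 - Hybrid elastic scaling method based on multi-dimensional resource prediction - Google Patents
Hybrid elastic scaling method based on multi-dimensional resource prediction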
- Publication number
- WO2024212614A1 (application PCT/CN2023/143167)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- elastic
- expansion
- scaling
- model
- business
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5072—Grid computing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
- G06F9/5016—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/505—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/0442—Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/14—Network analysis or design
- H04L41/147—Network analysis or design for predicting network behaviour
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L41/00—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
- H04L41/16—Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks using machine learning or artificial intelligence
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
- H04L67/1004—Server selection for load balancing
- H04L67/1008—Server selection for load balancing based on parameters of servers, e.g. available memory or workload
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- the present invention relates to the technical field of cloud computing, and in particular to a hybrid elastic scaling method based on multi-dimensional resource prediction.
- cloud computing technology has gradually matured. It uses technologies such as distributed computing and virtual resource management to centralize scattered resources to form a resource pool and dynamically provide elastic services to users.
- Elastic scaling addresses the dynamic computing demands of cloud-based business systems by automatically adjusting computing power to business needs. When business demand increases, elastic scaling automatically moves instances into the scaling group; when business demand decreases, it automatically moves instances out, ensuring service quality while balancing operating costs.
- the present invention proposes a hybrid elastic scaling method based on multi-dimensional resource prediction, comprising:
- when the historical data set is preprocessed, the preprocessing includes:
- $x_{t,j}$ is the fill value for the missing j-th feature at time t; its value is the average of the j-th feature at the moments immediately before and after time t;
- $x_{i,j}$ is the original data
- $x'_{i,j}$ is the value of $x_{i,j}$ after standardization
- $\mu_j$ is the mean of the j-th feature in the data set X
- $\sigma_j$ is the variance of the j-th feature in the data set X.
- the collected historical data set is divided into a training set D train and a test set D test using timestamps as indexes;
- the LSTM network is divided into an input layer, a hidden layer, and an output layer.
- the input layer is the business traffic data over a period of time.
- the number of neurons is consistent with the dimension of the input data.
- the LSTM model includes:
- the optimizer uses the Adam algorithm for gradient control, and the loss function uses the mean square error MSE for model evaluation, as shown in Formula 3:
- $y_{actual}$ is the actual value
- $y_{predict}$ is the predicted value
- model weights w and b are saved to the configuration file for prediction
- the training data set of the SVM classification algorithm consists of the service traffic prediction value $Predict_i$ at each moment, as predicted by the LSTM model, and the actual scaling strategy $Label_{SVM}$ adopted at that moment.
- the SVM loss function is shown in Formula 4:
- $y_i$ is the actual value of the business traffic at time i
- $w x_i - b$ is the predicted value of the business traffic at time i
- $\varepsilon$ is the error the model tolerates between the predicted value and the true value
- $w$ is the model weight.
- the elastic expansion is performed after the current time t. If the elastic expansion has been performed within the rated time T before the current time t, the current expansion is canceled. If the elastic expansion has not been performed, a hybrid elastic expansion plan is generated and executed after the current time t.
- elastic contraction will be performed after the current moment t. If elastic contraction has been performed within the rated time T before the current moment t, this contraction will be canceled. If elastic contraction has not been performed, a mixed elastic contraction plan will be generated and executed after the current moment t.
- a hybrid elastic expansion solution including:
- the hybrid elastic expansion solution is generated based on user costs and expected resource volume; wherein,
- An expansion plan that meets business needs and has the lowest user cost is generated based on the user activation cost, the expected amount of resources for expansion, and the amount of resources that can be sold in the current resource pool, and an expansion model is established.
- the expansion model is shown in Formula 1:
- $N_i$ is the number of activated instances of salable specification i; each salable specification $P_i$ is characterized by its number of CPU cores and its memory size.
- $CPU_{epx}$ and $MEM_{epx}$ are the expected total number of CPU cores and the expected total memory size, respectively.
- the expansion plan includes the instance specifications and quantities that need to be moved into the scaling group.
- the specifications are arranged in descending order by CPU core number and memory size.
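The expansion model above can be illustrated with a minimal sketch. The formula itself is not reproduced in this text, so the code below only follows the prose: choose instance counts per salable specification so that total CPU cores and memory reach the expected values at the lowest activation cost, without exceeding what the resource pool can sell. The `Spec` fields, the cost unit, and the exhaustive search are assumptions made for illustration, not the method's actual solver.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Spec:
    name: str
    cpu: int    # CPU cores per instance
    mem: int    # memory per instance (GiB)
    cost: int   # activation cost per instance (hypothetical unit)
    stock: int  # instances currently salable in the resource pool


def cheapest_expansion(specs, cpu_exp, mem_exp):
    """Return ({spec name: count}, total cost) for the cheapest feasible plan, or None."""
    best_counts, best_cost = None, None
    # Enumerate every combination of counts 0..stock for each specification.
    for counts in product(*(range(s.stock + 1) for s in specs)):
        cpu = sum(n * s.cpu for n, s in zip(counts, specs))
        mem = sum(n * s.mem for n, s in zip(counts, specs))
        if cpu < cpu_exp or mem < mem_exp:
            continue  # plan does not cover the expected resources
        cost = sum(n * s.cost for n, s in zip(counts, specs))
        if best_cost is None or cost < best_cost:
            best_counts, best_cost = counts, cost
    if best_counts is None:
        return None
    # List specifications in descending order of CPU cores, then memory size.
    ordered = sorted(zip(specs, best_counts),
                     key=lambda pair: (pair[0].cpu, pair[0].mem), reverse=True)
    return {s.name: n for s, n in ordered if n > 0}, best_cost


catalogue = [
    Spec("small", cpu=2, mem=4, cost=10, stock=5),
    Spec("medium", cpu=4, mem=8, cost=18, stock=3),
    Spec("large", cpu=8, mem=16, cost=34, stock=1),
]
plan, cost = cheapest_expansion(catalogue, cpu_exp=12, mem_exp=24)
```

Exhaustive enumeration is only practical for a handful of specifications; a real deployment would more likely use an integer programming solver.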
- generating a mixed elastic shrinkage scheme includes:
- the hybrid elastic contraction scheme is generated based on the least load priority principle and the expected resource quantity;
- first k instances are taken for mixed elastic contraction
- some instances are scaled vertically and some instances are scaled horizontally, including:
- the beneficial effect of the present invention is that the business traffic of the application system on the cloud platform is monitored across multiple dimensions, and the business traffic of the system is analyzed and predicted in real time based on the LSTM model.
- the method of the present invention can realize the intelligent and automated hybrid elastic scaling process of the business system, improve the sensitivity of the cloud platform to monitoring changes in business system traffic, reduce the demand for business experience of cloud platform engineers, and is conducive to improving the utilization of cloud resources by the business system and reducing business operating costs.
- FIG. 1 is a flow chart of a hybrid elastic scaling method based on multi-dimensional resource prediction provided by an embodiment of the present invention.
- FIG. 2 is a block diagram of LSTM-based future business traffic prediction provided by an embodiment of the present invention.
- FIG. 3 is a schematic diagram of a hybrid elastic scaling solution process provided by an embodiment of the present invention.
- FIG. 4 is a schematic diagram of a system architecture provided by an embodiment of the present invention.
- the embodiment of the present invention proposes a hybrid elastic scaling method based on multi-dimensional resource prediction, comprising: collecting and preprocessing business traffic data based on multi-dimensional indicators; predicting future business traffic and the expected resource volume through an LSTM model; and generating a hybrid elastic scaling solution based on user cost and expected quota.
- business performance indicators need to be considered in order to more comprehensively describe real-time business flow.
- the demand of different business systems changes over time with certain regularities. These regularities in business traffic can be mined and learned in real time using artificial intelligence algorithms, and accurate decisions can then be made from what is learned to elastically scale the computing power of the business system.
- the method of this embodiment can realize the intelligence and automation of the hybrid elastic scaling process of the business system, improve the cloud platform's sensitivity to monitoring changes in business system traffic, reduce the cloud platform engineers' demand for business experience, and help improve the business system's utilization of cloud resources and reduce business operating costs.
- this embodiment provides a hybrid elastic scaling method based on multi-dimensional resource prediction, comprising the following steps:
- Step S100 Collect business flow data of multi-dimensional indicators of the cloud platform, build a historical data set based on the collected business flow data, and pre-process the historical data set.
- building a historical data set based on the collected business traffic data includes:
- the historical data collected in real time by the cloud platform monitoring service is persisted to the database after collection.
- the collection cycle is set to 1 minute, that is, the data is collected once a minute.
- $x_{t,j}$ is the fill value for the missing j-th feature at time t; its value is the average of the j-th feature at the moments immediately before and after time t.
- $x_{i,j}$ is the original data
- $x'_{i,j}$ is the value of $x_{i,j}$ after standardization
- $\mu_j$ is the mean of the j-th feature in the data set X
- $\sigma_j$ is the variance of the j-th feature in the data set X.
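The two preprocessing steps described above (missing-value filling and standardization) can be sketched for a single feature column as follows. The formulas are not reproduced in this text, so this is an illustration under assumptions: missing values are represented as `None`, gaps at either end or next to another gap are left unfilled, and standardization uses the conventional z-score that divides by the population standard deviation, even though the text describes the scale term as the variance.

```python
from statistics import mean, pstdev


def fill_missing(series):
    """Replace a None at time t with the average of the values at t-1 and t+1."""
    filled = list(series)
    for t in range(1, len(filled) - 1):
        if filled[t] is None:
            before, after = filled[t - 1], filled[t + 1]
            # Only fill when both neighbours are known (assumption for this sketch).
            if before is not None and after is not None:
                filled[t] = (before + after) / 2
    return filled


def standardize(series):
    """Z-score a complete series: subtract the mean, divide by the standard deviation."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]  # constant feature: nothing to scale
    return [(x - mu) / sigma for x in series]


cpu_util = fill_missing([10, None, 14])
z_scores = standardize([2, 4, 4, 4, 5, 5, 7, 9])
```

In a multi-dimensional data set the same two functions would be applied to each feature column independently.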
- the method of this embodiment also includes: Step S200: constructing an LSTM model, inputting the preprocessed historical data set into the LSTM model for model training and model tuning, and predicting the business traffic forecast value of the cloud platform in a preset time period in the future through the LSTM model.
- the method of this embodiment uses the Long Short Term Memory (LSTM) model to predict future business traffic and expected resource volume.
- LSTM is a recurrent neural network algorithm that can fully exploit time-dependent information and both short-term and long-term contextual information in time series for prediction.
- the collected historical data set is preprocessed and then divided into a training set D train and a test set D test using timestamps as indexes, which are used for subsequent model training and model tuning, respectively.
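A timestamp-indexed split keeps the test set strictly later than the training set, which avoids leaking future traffic into training. A minimal sketch, assuming each record is a `(timestamp, features)` tuple and using a hypothetical `train_ratio` parameter that the text does not specify:

```python
def split_by_time(records, train_ratio=0.75):
    """Sort records by timestamp and cut once: earlier part trains, later part tests."""
    ordered = sorted(records, key=lambda record: record[0])
    cut = int(len(ordered) * train_ratio)
    return ordered[:cut], ordered[cut:]


samples = [(3, "c"), (1, "a"), (4, "d"), (2, "b")]
d_train, d_test = split_by_time(samples, train_ratio=0.75)
```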
- when constructing the LSTM network, it is divided into an input layer, a hidden layer, and an output layer.
- the input layer receives the business traffic data over a period of time, and its number of neurons matches the dimension of the input data.
- the number of hidden layers and the number of neurons in each layer must be configured manually, generally based on the amount of business traffic data and the training results. The more layers and neurons there are, the more complex the model is and the longer training takes; depending on actual conditions, they can be set to 2 layers and 50 neurons.
- the output layer is a fully connected layer whose number of neurons matches the feature dimension; it generates a set of predicted values for future business traffic.
- the LSTM model is trained and predicted.
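- the optimizer uses the Adam algorithm for gradient control, and the loss function uses the mean square error (MSE) for model evaluation, as shown in Formula 3:
- $MSE = \frac{1}{n}\sum_{i=1}^{n}\left(y_{actual,i} - y_{predict,i}\right)^2$
- where $y_{actual}$ is the actual value, $y_{predict}$ is the predicted value, and n is the number of samples.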
- the model weights w and b are saved to the configuration file for prediction.
- multiple LSTM networks are constructed and trained with the same training process, and the average of their prediction results is taken as the final prediction value of the LSTM model.
- in this embodiment, four LSTM networks are constructed and trained in the same way, and the average of the four prediction results is used as the final prediction value of the model.
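The averaging step can be sketched independently of any deep learning library. In the sketch below each trained network is represented by a plain Python callable that maps an input window to a list of predicted feature values; those stand-in callables and the name `ensemble_predict` are assumptions for illustration only.

```python
def ensemble_predict(models, window):
    """Average, feature by feature, the predictions of independently trained models."""
    predictions = [model(window) for model in models]
    n_features = len(predictions[0])
    return [
        sum(prediction[j] for prediction in predictions) / len(predictions)
        for j in range(n_features)
    ]


# Four stand-in "networks": each returns the last observation plus a fixed offset.
stand_ins = [lambda window, offset=offset: [window[-1] + offset] for offset in (0, 1, 2, 3)]
final_prediction = ensemble_predict(stand_ins, [10.0, 12.0])
```

Averaging several networks trained the same way reduces the effect of any single network's random initialization on the final prediction.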
- at prediction time, the parameters w and b are imported from the configuration file, and the m business traffic observations from the period before the current time t are input into the model.
- the output Predict is the predicted business traffic for the future period.
- the training data set of the SVM classification algorithm consists of the service traffic prediction value $Predict_i$ at each moment, as predicted by the LSTM model, and the actual scaling strategy $Label_{SVM}$ adopted at that moment.
- the SVM loss function is shown in Formula 4:
- $y_i$ is the actual value of the business traffic at time i
- $w x_i - b$ is the predicted value of the business traffic at time i
- $\varepsilon$ is the error the model tolerates between the predicted value and the true value
- $w$ is the model weight.
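Formula 4 is not reproduced in this text. The symbols listed above (an actual value, a linear prediction $w x_i - b$, and a tolerated error) match the conventional epsilon-insensitive loss used with support vector machines, so the sketch below evaluates that conventional loss. Treat it as an assumption about the general shape of the formula, not as the patent's exact expression; any regularization term on w is omitted.

```python
def epsilon_insensitive_loss(w, b, xs, ys, epsilon):
    """Sum of max(0, |y_i - (w . x_i - b)| - epsilon) over all samples."""
    total = 0.0
    for x, y in zip(xs, ys):
        predicted = sum(w_j * x_j for w_j, x_j in zip(w, x)) - b
        # Errors inside the tolerance band contribute nothing to the loss.
        total += max(0.0, abs(y - predicted) - epsilon)
    return total


loss = epsilon_insensitive_loss(
    w=[2.0], b=1.0,
    xs=[[1.0], [2.0], [3.0]],
    ys=[1.2, 3.0, 6.0],
    epsilon=0.5,
)
```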
- This embodiment takes both resource performance and business performance into account through LSTM-based multi-dimensional resource prediction, and uses LSTM to predict future business traffic, which improves the accuracy of business traffic prediction and the sensitivity to business traffic changes.
- the method of this embodiment also includes: Step S300: After the future business traffic prediction value is predicted through the LSTM model, the SVM classification algorithm decides, based on that prediction value, whether to perform elastic expansion or elastic contraction.
- if the prediction result of the algorithm is elastic expansion, elastic expansion is triggered after the current time t. If elastic expansion has already been performed within the rated time T before the current time t, this expansion is canceled; otherwise, a hybrid elastic expansion plan is generated and executed after the current time t.
- if the prediction result of the algorithm is elastic contraction, elastic contraction is triggered after the current time t. If elastic contraction has already been performed within the rated time T before the current time t, this contraction is canceled; otherwise, a hybrid elastic contraction plan is generated and executed after the current time t.
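The rated-time check described above works like a cooldown per action type. A minimal sketch, assuming timestamps in seconds, decisions encoded as the strings `"expand"`, `"contract"` and `"none"`, and that an action performed exactly T seconds ago counts as outside the window (the text does not say how the boundary is treated):

```python
def should_execute(decision, now, last_executed, rated_time):
    """Return True if the decided action may run; record its time when it does.

    last_executed maps an action name to the timestamp of its last execution.
    """
    if decision not in ("expand", "contract"):
        return False  # no scaling required
    last = last_executed.get(decision)
    if last is not None and now - last < rated_time:
        return False  # same action already ran within the rated time T: cancel
    last_executed[decision] = now
    return True


history = {}
first = should_execute("expand", 1000, history, rated_time=600)
repeat = should_execute("expand", 1300, history, rated_time=600)
other = should_execute("contract", 1300, history, rated_time=600)
later = should_execute("expand", 1600, history, rated_time=600)
```

Tracking expansion and contraction separately follows the text, which checks each action only against earlier actions of the same kind.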
- the hybrid elastic expansion plan is generated based on user costs and expected resource quantities
- $N_i$ is the number of activated instances of available specification i; each available specification $P_i$ is characterized by its number of CPU cores and its memory size.
- $CPU_{epx}$ and $MEM_{epx}$ are the expected total number of CPU cores and the expected total memory size, respectively.
- the expansion plan includes the instance specifications and quantity that need to be moved into the scaling group.
- the specifications are arranged in descending order by CPU core number and memory size.
- the expansion plan generated based on Formula 6 includes the instance specifications and quantities that need to be moved into the scaling group, arranged in descending order by number of CPU cores and memory size. The specifications in the plan are traversed in descending order and compared in turn with the specifications of the existing instances in the scaling group:
- if the scaling group has no instance of a specification, instances of that specification are moved in by horizontal scaling;
- if the scaling group has instances of a specification but too few, the missing instances of that specification are moved in by horizontal scaling;
- after the traversal, if an original instance in the scaling group has a specification that is not in the plan, it is vertically scaled, so that the total number of CPU cores and total memory size in the scaling group meet the expected values.
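The traversal described above amounts to comparing the plan with the current scaling group, specification by specification. A minimal sketch, assuming both are dictionaries from specification name to instance count and that the plan dictionary is already ordered from largest to smallest specification; the function name and the specification labels are illustrative. The sample values are the ones used in the worked example of this embodiment (a group of 2 instances of specification 2 and 1 of specification 1, and a plan of 1, 3 and 1 instances of specifications 3, 2 and 1).

```python
def plan_expansion_actions(plan, group):
    """Return (instances to move in per spec, existing specs needing vertical scaling)."""
    move_in = {}
    for spec, wanted in plan.items():  # plan is assumed to be in descending order
        have = group.get(spec, 0)
        if have < wanted:
            # Covers both cases: spec absent from the group, or present but too few.
            move_in[spec] = wanted - have
    # Existing instances whose specification is not part of the plan.
    vertical = [spec for spec in group if spec not in plan]
    return move_in, vertical


plan = {"spec3": 1, "spec2": 3, "spec1": 1}
group = {"spec2": 2, "spec1": 1}
move_in, vertical = plan_expansion_actions(plan, group)
```

Here one instance of specification 3 and one of specification 2 are moved in horizontally and nothing needs vertical scaling, which agrees with the final execution described in Step 5.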
- this embodiment also expands and contracts storage and network capabilities to some extent: when an elastic expansion increases the number and specifications of instances in the scaling group, the storage disk and network capabilities are expanded at the same time.
- some instances are scaled vertically and some instances are scaled horizontally, including: among the first k instances with the lowest load that meet the expected resource volume, small-specification and low-load instances are downgraded and then vertically scaled, and large-specification and low-load instances are scaled horizontally.
- the hybrid elastic shrinking plan is generated based on the principle of least load priority and expected resource volume.
- the CPU utilization and memory utilization of all instances in the scaling group are sorted in ascending order, and the first k instances are selected for hybrid elastic shrinking.
- Some instances are vertically scaled, that is, the specifications are downgraded, and some instances are horizontally scaled, that is, moved out of the scaling group, so that the total number of CPU cores and total memory size in the scaling group meet the expected values.
- the generated shrinking plan also shrinks the storage and network capabilities to a certain extent while appropriately shrinking the computing power.
- the hybrid elastic shrinking plan is executed after the current time t, so that the scaling group can perform scaling activities more accurately, achieving accurate and reasonable resource allocation and improving overall resource utilization.
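The contraction procedure above can be sketched as a greedy pass over the k least-loaded instances. The rule used here (move an instance out if the group still meets the expected totals, otherwise try to downgrade it one specification) is an assumption chosen for illustration; the text only states that some instances are scaled vertically and some horizontally. The specification sizes in `SPECS` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical catalogue: spec name -> (CPU cores, memory GiB, next smaller spec or None)
SPECS = {"spec3": (8, 16, "spec2"), "spec2": (4, 8, "spec1"), "spec1": (2, 4, None)}


@dataclass
class Instance:
    name: str
    spec: str
    cpu_util: float
    mem_util: float


def plan_contraction(instances, k, cpu_exp, mem_exp):
    """Return (names moved out horizontally, {name: downgraded spec})."""
    cpu = sum(SPECS[i.spec][0] for i in instances)
    mem = sum(SPECS[i.spec][1] for i in instances)
    # Least-load priority: ascending CPU utilization, then memory utilization.
    candidates = sorted(instances, key=lambda i: (i.cpu_util, i.mem_util))[:k]
    removed, downgraded = [], {}
    for inst in candidates:
        c, m, smaller = SPECS[inst.spec]
        if cpu - c >= cpu_exp and mem - m >= mem_exp:
            cpu, mem = cpu - c, mem - m           # horizontal: move out of the group
            removed.append(inst.name)
        elif smaller is not None:
            sc, sm, _ = SPECS[smaller]
            if cpu - c + sc >= cpu_exp and mem - m + sm >= mem_exp:
                cpu, mem = cpu - c + sc, mem - m + sm  # vertical: downgrade one spec
                downgraded[inst.name] = smaller
    return removed, downgraded


group = [
    Instance("a", "spec3", 0.60, 0.55),
    Instance("b", "spec2", 0.10, 0.12),
    Instance("c", "spec2", 0.11, 0.10),
    Instance("d", "spec2", 0.12, 0.15),
    Instance("e", "spec1", 0.50, 0.40),
]
removed, downgraded = plan_contraction(group, k=3, cpu_exp=12, mem_exp=24)
```

With these assumed sizes, two instances of specification 2 are moved out and the third is downgraded to specification 1.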
- the above embodiment uses a machine learning algorithm to comprehensively analyze application scenarios from multiple dimensions of resources and business through an elastic scaling decision method based on SVM and multi-dimensional resource prediction, and determines whether elastic scaling is needed, thereby improving the decision-making ability of elastic scaling in complex business scenarios.
- the hybrid elastic scaling generation method based on user cost and expected resource volume comprehensively considers the user cost and the reasonable allocation of currently available resources, and performs horizontal and vertical scaling of the scaling group to achieve flexible configuration, which can not only meet business needs, but also improve overall resource utilization.
- FIG. 4 shows the system architecture required to execute this embodiment. It mainly includes the underlying database and configuration files; the core components, namely the monitoring service, the business traffic prediction service, and the hybrid scaling service; and the business scaling group that is the object of elastic scaling.
- Step 1 The instances in the business scaling group are distributed as follows: there are 2 instances of specification 2 and 1 instance of specification 1.
- the monitoring service periodically collects the business traffic indicator data of the scaling group, including the scaling group name, timestamp, CPU utilization, memory utilization, GPU utilization, disk IO utilization, network maximum throughput, user request rate, request response rate, request success rate, number of concurrent requests, and the total number of CPU cores and total memory size in the scaling group at the current timestamp.
- the collected data is persisted to the database.
- Step 2 The business traffic prediction service imports the continuously collected data for a certain period of time (such as 30 days) from the database. First, perform data preprocessing, including data cleaning and data standardization. Then build a 4-layer LSTM network with 2 hidden layers and 50 neurons, and train the model based on the Adam optimizer. During the training process, the model parameters are continuously tuned. After the training is completed, save the LSTM model parameters to the configuration file.
- Step 3: Input the historical business traffic data into the LSTM model to obtain the prediction value for each moment; these prediction values, together with the scaling label for the corresponding moment, form the input used to train the SVM model.
- the penalty factor, kernel function and other hyperparameters are continuously tuned.
- the penalty factor of the SVM is 0.125
- the kernel function is the radial basis function RBF.
- Step 4: After the trained models are deployed to the environment, the monitoring service collects data at a certain time $t_1$.
- This moment is just before the business peak, and its main business traffic features are "high CPU and memory utilization, increased concurrency and disk utilization".
- the business traffic at this moment is input into the LSTM model, which outputs the prediction result $Predict_{t+1}$ for the next moment; $Predict_{t+1}$ is then input into the SVM model, and the final elastic scaling decision is "elastic expansion".
- Step 5: After the hybrid elastic scaling service receives the "elastic expansion" instruction, it determines whether elastic scaling has been performed within the 10 minutes before time $t_1$. If it has, no expansion is performed this time, to avoid frequent expansion; otherwise, generation of an elastic expansion plan is triggered. The generated expansion plan contains 1 instance of specification 3, 3 instances of specification 2, and 1 instance of specification 1. The final execution is: 1 instance of specification 3 and 1 instance of specification 2 are moved into the scaling group by horizontal scaling.
- Step 6: The monitoring service collects business traffic at a certain time $t_2$.
- the business traffic characteristics at this time are: "the CPU and memory utilization of the scaling group is stable and the number of concurrent users remains unchanged, but the request response decreases." After the traffic passes through the LSTM model and the SVM model, the prediction result is that no scaling is required.
- Step 7: The monitoring service collects the business traffic at a certain time $t_3$.
- the business traffic characteristics at this time are "low CPU and memory utilization of the scaling group, and decreased concurrency".
- the generated instruction is "elastic contraction", and the generated plan is: 1 instance of specification 3 and 2 instances of specification 1.
- the final execution is: 2 instances of specification 2 are moved out of the scaling group by horizontal scaling, and the remaining instance of specification 2 is vertically scaled down to specification 1.
- the above is the implementation process of the hybrid elastic scaling method based on multi-dimensional resource prediction.
- When the model is first built and trained, the relevant parameters must be configured and the model tuned manually.
- When the model is trained, a larger amount of historical data generally lets the model better discover the traffic patterns and characteristics of a specific business. Before the model makes predictions, only the indicator threshold, rated time, and rated number of times in the generation plan need to be configured.
- the above-mentioned embodiments can realize the intelligent and automated hybrid elastic scaling process of the cloud platform business system, improve the cloud platform's sensitivity to monitoring changes in business system traffic, reduce the cloud platform engineers' demand for business experience, and help improve the business system's utilization of cloud resources and reduce business operating costs.
- Those skilled in the art should understand that the embodiments of the present application may be provided as methods, systems, or computer program products. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
- These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce a manufactured product including an instruction device that implements the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.
- These computer program instructions may also be loaded onto a computer or other programmable data processing device so that a series of operational steps are executed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- Evolutionary Biology (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Computer Hardware Design (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
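The present invention provides a hybrid elastic scaling method based on multi-dimensional resource prediction, comprising: collecting business traffic data for multi-dimensional metrics of a cloud platform, constructing a historical dataset from the collected business traffic data, and preprocessing the historical dataset; constructing an LSTM model, inputting the preprocessed historical dataset into the LSTM model for model training and model tuning, and using the LSTM model to predict the cloud platform's business traffic over a preset future period; and, after the LSTM model has predicted the future business traffic, making an elastic expansion or elastic contraction decision on the predicted business traffic based on an SVM classification algorithm. The method makes the hybrid elastic scaling process of a business system intelligent and automated, improves the cloud platform's sensitivity in monitoring changes in business system traffic, and helps improve the business system's utilization of cloud resources and reduce business operating costs.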
Description
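The present invention relates to the field of cloud computing technology, and in particular to a hybrid elastic scaling method based on multi-dimensional resource prediction.
In recent years, cloud computing technology has gradually matured. It uses technologies such as distributed computing and virtual resource management to pool dispersed resources into a resource pool and dynamically provide elastic services to users. Elastic scaling addresses the dynamic computing demands of business systems migrated to the cloud by automatically adjusting computing capacity according to business demand. When business demand grows, elastic scaling automatically moves instances in; when business demand falls, it automatically removes instances, thereby guaranteeing service quality and balancing operating costs.
Traditional elastic scaling is generally based on basic monitoring items such as CPU utilization and memory utilization, and moves instances in or out through manual triggers, scheduled rules, or alarm rules in order to handle traffic bursts or save costs. The two metrics of CPU utilization and memory utilization alone, however, can hardly reflect real-time business traffic sensitively. In addition, triggering elastic scaling often requires operations engineers to understand both the business and the cloud system in some depth before they can formulate suitable trigger rules that meet business needs. Yet because the business scenarios of different business systems are complex and diverse, the trigger rules that experienced operations engineers formulate for particular businesses and complex scenarios are hard to port.
Therefore, how to make the hybrid elastic scaling process of a cloud platform's business systems intelligent and automated, improve the cloud platform's sensitivity in monitoring changes in business system traffic, and reduce the business experience required of cloud platform engineers has become a problem in urgent need of a solution.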
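Summary of the Invention
In view of this, the present invention proposes a hybrid elastic scaling method based on multi-dimensional resource prediction, which aims to solve the problem of how to make the hybrid elastic scaling process of a cloud platform's business systems intelligent and automated, improve the cloud platform's sensitivity in monitoring changes in business system traffic, and reduce the business experience required of cloud platform engineers.
In one aspect, the present invention proposes a hybrid elastic scaling method based on multi-dimensional resource prediction, comprising:
collecting business traffic data for multi-dimensional metrics of a cloud platform, constructing a historical dataset from the collected business traffic data, and preprocessing the historical dataset;
constructing an LSTM model, inputting the preprocessed historical dataset into the LSTM model for model training and model tuning, and using the LSTM model to predict the cloud platform's business traffic over a preset future period;
after the LSTM model has predicted the future business traffic, making an elastic expansion or elastic contraction decision on the predicted business traffic based on an SVM classification algorithm.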
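Further, constructing the historical dataset from the collected business traffic data comprises:
collecting historical data in real time through the cloud platform monitoring service and persisting it to a database after collection, where a single business traffic record consists of multiple features, defined as [scaling group name, timestamp, CPU utilization, memory utilization, GPU utilization, disk I/O utilization, maximum network throughput, user request rate, request response rate, request success rate, concurrency, total CPU cores, total memory size] and denoted X = [x_1 … x_n]^T, where x_1 is the scaling group name, ..., x_n is the total memory size, and n is the feature dimension;
taking the m records collected before the current time as the raw dataset of m rows by n columns, where the label for each time t is the metric value at the next time t+1.
Further, preprocessing the historical dataset comprises:
performing data cleaning and data standardization on the historical dataset, wherein
missing values of the features in the business traffic data X are filled according to formula 1:
$x_{t,j} = \frac{x_{t-1,j} + x_{t+1,j}}{2}$ (1)
where x_{t,j} is the fill value for the missing j-th feature at time t, taken as the mean of the j-th feature at the previous time and the next time;
the features in the business traffic data X are standardized according to formula 2:
$x'_{i,j} = \frac{x_{i,j} - \mu_j}{\sigma_j}$ (2)
where x_{i,j} is the raw value, x'_{i,j} is the value of x_{i,j} after standardization, μ_j is the mean of feature j in dataset X, and σ_j is the standard deviation of feature j in dataset X.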
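Further, constructing the LSTM model comprises:
after preprocessing the collected historical dataset, splitting it, indexed by timestamp, into a training set D_train and a test set D_test;
building the LSTM network, which consists of an input layer, hidden layers, and an output layer, where the input layer takes the business traffic data over a period of time and its number of neurons matches the dimension of the input data.
Further, after the LSTM model is constructed, the method comprises:
training the LSTM model and predicting with it, wherein
when the LSTM model is trained, the Adam algorithm is chosen as the optimizer for gradient control, and the mean squared error (MSE) is chosen as the loss function for model evaluation, as shown in formula 3:
$MSE = \frac{1}{N}\sum_{i=1}^{N}\left(y_{actual,i} - y_{predict,i}\right)^2$ (3)
where y_actual is the actual value, y_predict is the predicted value, and N is the number of samples;
after training is complete, the model's weights w and b are saved to a configuration file for use in prediction;
multiple LSTM networks are built and trained by the same process, and the mean of their predictions is taken as the final prediction of the LSTM model.
Further, establishing the SVM classification algorithm comprises:
the training dataset of the SVM classification algorithm consists of the business traffic prediction Predict_i output by the LSTM model for each time and the scaling policy Label_SVM actually adopted at that time; the SVM loss function is given by formula 4 and the optimization objective by formula 5,
where y_i is the actual business traffic value at time i, w x_i − b is the predicted business traffic value at time i, ε is the tolerated error between the model's predicted value and the true value, and w is the model weight.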
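Further, making the elastic expansion or elastic contraction decision on the predicted business traffic based on the SVM classification algorithm comprises:
when the decision is elastic expansion, performing elastic expansion after the current time t: if elastic expansion was performed within the rated time T before the current time t, this expansion is cancelled; otherwise a hybrid elastic expansion plan is generated and executed after the current time t;
when the decision is elastic contraction, performing elastic contraction after the current time t: if elastic contraction was performed within the rated time T before the current time t, this contraction is cancelled; otherwise a hybrid elastic contraction plan is generated and executed after the current time t.
Further, generating the hybrid elastic expansion plan when no elastic expansion has been performed comprises:
generating the hybrid elastic expansion plan based on user cost and the expected resource amount, wherein
an expansion plan that meets business needs at the lowest user cost is generated from the user's provisioning cost, the expected resource amount after expansion, and the resources currently sellable in the resource pool, and an expansion model is established as shown in formula 6,
where P_i is the price of sellable specification i, N_i is the number of instances of sellable specification i to provision, the per-specification terms are the CPU core count and memory size of sellable specification i, and CPU_epx and MEM_epx are the expected total CPU cores and total memory size respectively.
Further, after the expansion plan that meets business needs at the lowest user cost has been generated from the user's provisioning cost, the expected resource amount after expansion, and the resources currently sellable in the resource pool, the method comprises:
the expansion plan contains the instance specifications and quantities to be moved into the scaling group, with specifications sorted in descending order of CPU cores and memory size;
traversing the specifications in the expansion plan in descending order and comparing each with the specifications of the instances already in the scaling group:
when the scaling group contains no instance of a specification in the expansion plan, moving instances of that specification in by vertical scaling;
when the scaling group contains instances of a specification in the expansion plan but too few of them, moving in the missing instances of that specification by horizontal scaling;
after all specifications in the expansion plan have been traversed, if an instance originally in the scaling group has a specification not in the expansion plan, performing vertical scaling so that the total CPU cores and total memory size of the scaling group meet the expected values.
Further, generating the hybrid elastic contraction plan when no elastic contraction has been performed comprises:
generating the hybrid elastic contraction plan based on the least-load-first principle and the expected resource amount, wherein
all instances in the scaling group are sorted in ascending order of CPU utilization and memory utilization, and the first k instances undergo hybrid elastic contraction, some by vertical scaling and some by horizontal scaling, so that the total CPU cores and total memory size of the scaling group meet the expected values.
Further, when the first k instances undergo hybrid elastic contraction, some by vertical scaling and some by horizontal scaling, the method comprises:
among the k least-loaded instances that satisfy the expected resource amount, downgrading the small-specification low-load instances by vertical scaling and removing the large-specification low-load instances by horizontal scaling.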
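Compared with the prior art, the beneficial effects of the present invention are as follows. The invention monitors the business traffic of the cloud platform's application systems along multiple dimensions and analyzes and predicts system business traffic in real time with an LSTM model, which improves sensitivity to changes in business traffic. It makes elastic scaling decisions with an SVM model and generates a suitable elastic scaling plan based on user cost and the expected resource amount. Within that plan, the instances in the scaling group are scaled in a hybrid manner, combining vertical elastic scaling (upgrading or downgrading instance specifications) with horizontal elastic scaling (increasing or decreasing the number of instances), so that the resources in the scaling group are allocated dynamically, resource utilization is improved, and service quality is guaranteed. The method makes the hybrid elastic scaling process of a business system intelligent and automated, improves the cloud platform's sensitivity in monitoring changes in business system traffic, reduces the business experience required of cloud platform engineers, and helps improve the business system's utilization of cloud resources and reduce business operating costs.
Various other advantages and benefits will become clear to those of ordinary skill in the art from the detailed description of the preferred embodiments below. The drawings serve only to illustrate the preferred embodiments and are not to be regarded as limiting the present invention. Throughout the drawings, the same reference symbols denote the same components. In the drawings:
Fig. 1 is a flowchart of the hybrid elastic scaling method based on multi-dimensional resource prediction provided by an embodiment of the present invention;
Fig. 2 is a block diagram of LSTM-based future business traffic prediction provided by an embodiment of the present invention;
Fig. 3 is a schematic flow diagram of the hybrid elastic scaling plan provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of the system architecture provided by an embodiment of the present invention.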
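Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the disclosure can be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope conveyed fully to those skilled in the art. It should be noted that, where no conflict arises, the embodiments of the present invention and the features in them may be combined with one another. The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments.
An embodiment of the present invention proposes a hybrid elastic scaling method based on multi-dimensional resource prediction, which comprises: collecting and preprocessing business traffic data for multi-dimensional metrics; predicting future business traffic and the expected resource amount with an LSTM model; and generating a hybrid elastic scaling plan based on user cost and the expected quota. To address the shortcomings of the prior art, the method of this embodiment describes real-time business traffic more completely by considering business performance metrics in addition to basic resource metrics. Moreover, the demand of different business systems shows certain regularities over time; these traffic patterns can be mined and learned in real time with artificial intelligence algorithms, and precise decisions can be made on the learned results, to achieve elastic scaling of the business system's computing capacity.
It can be understood that the method of this embodiment makes the hybrid elastic scaling process of a business system intelligent and automated, improves the cloud platform's sensitivity in monitoring changes in business system traffic, reduces the business experience required of cloud platform engineers, and helps improve the business system's utilization of cloud resources and reduce business operating costs.
Referring to Fig. 1, the hybrid elastic scaling method based on multi-dimensional resource prediction provided by this embodiment comprises the following steps: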
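Step S100: collect business traffic data for multi-dimensional metrics of the cloud platform, construct a historical dataset from the collected business traffic data, and preprocess the historical dataset.
Specifically, constructing the historical dataset from the collected business traffic data comprises:
Historical data is collected in real time by the cloud platform monitoring service and persisted to a database after collection. A single business traffic record consists of multiple features, defined as [scaling group name, timestamp, CPU utilization, memory utilization, GPU utilization, disk I/O utilization, maximum network throughput, user request rate, request response rate, request success rate, concurrency, total CPU cores, total memory size] and denoted X = [x_1 … x_n]^T, where x_1 is the scaling group name, x_2 is the timestamp, x_3 is the CPU utilization, ..., x_n is the total memory size, and n is the feature dimension. The collection cycle Cycle is configured as 1 minute, that is, data is collected once per minute.
The m records collected before the current time are taken as the raw dataset of m rows by n columns.
Correspondingly, the label for each time t is the metric value at the next time t+1.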
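The pairing of each record with the next record as its label can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `build_supervised_pairs`, the three numeric features, and the sample values are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def build_supervised_pairs(
    records: Sequence[Sequence[float]], m: int
) -> Tuple[List[List[float]], List[List[float]]]:
    """Return (inputs, labels) built from the m most recent records.

    The label of the record at time t is the record at time t+1, so the
    last of the m records has no label yet and is left out of the inputs.
    """
    if m < 2 or m > len(records):
        raise ValueError("need 2 <= m <= len(records)")
    window = [list(r) for r in records[-m:]]  # the m x n raw dataset
    inputs = window[:-1]   # records at times t
    labels = window[1:]    # records at times t+1
    return inputs, labels


# Hypothetical records: [CPU %, memory %, concurrency], one per minute.
history = [
    [40.0, 55.0, 120.0],
    [45.0, 57.0, 150.0],
    [60.0, 63.0, 210.0],
    [72.0, 70.0, 260.0],
]
inputs, labels = build_supervised_pairs(history, m=3)
```

With m = 3 the window holds the last three records, which yields two (input, label) pairs.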
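Specifically, preprocessing the historical dataset comprises:
performing data cleaning and data standardization on the historical dataset, wherein
missing values of the features in the business traffic data X are filled according to formula 1:
$x_{t,j} = \frac{x_{t-1,j} + x_{t+1,j}}{2}$ (1)
where x_{t,j} is the fill value for the missing j-th feature at time t, taken as the mean of the j-th feature at the previous time and the next time.
In practice, data may be lost or left blank while the monitoring service is collecting it, for reasons such as collection capacity or network transmission. To handle possible missing values in the collected business traffic data X, a statistics-based feature value filling method is used to fill them.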
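Formula 1 can be sketched for a single feature series as below. The helper name `fill_missing` is hypothetical, and the handling of gaps at the ends of the series or next to another gap is an assumption, since the text does not say how to treat them.

```python
from typing import List, Optional


def fill_missing(series: List[Optional[float]]) -> List[Optional[float]]:
    """Fill each isolated gap with the mean of its two neighbours (formula 1).

    Gaps at either end, or adjacent to another gap, lack two known
    neighbours and are left as None (an assumption of this sketch).
    """
    filled = list(series)
    for t in range(1, len(series) - 1):
        prev_v, next_v = series[t - 1], series[t + 1]
        if series[t] is None and prev_v is not None and next_v is not None:
            filled[t] = (prev_v + next_v) / 2
    return filled


# Hypothetical CPU utilization samples with two dropped readings.
cpu_util = [40.0, None, 60.0, 62.0, None]
filled_cpu = fill_missing(cpu_util)
```

The reading at index 1 becomes the mean of 40.0 and 60.0; the trailing gap has no later neighbour and stays unfilled.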
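In addition, different types of feature metrics differ in scale, so the features need to be standardized when the model is trained, which speeds up model convergence and makes combined analysis easier. Features of non-negative decimal type are standardized with the z-score.
Specifically, the features in the business traffic data X are standardized according to formula 2:
$x'_{i,j} = \frac{x_{i,j} - \mu_j}{\sigma_j}$ (2)
where x_{i,j} is the raw value, x'_{i,j} is the value of x_{i,j} after standardization, μ_j is the mean of feature j in dataset X, and σ_j is the standard deviation of feature j in dataset X.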
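A column-wise z-score can be sketched with the standard library as follows. The function name `zscore_columns` is hypothetical; using the population standard deviation and mapping a constant column to 0.0 are assumptions of this sketch, not details given in the text.

```python
from statistics import fmean, pstdev
from typing import List


def zscore_columns(rows: List[List[float]]) -> List[List[float]]:
    """Standardize each feature column to zero mean and unit deviation."""
    columns = list(zip(*rows))
    means = [fmean(c) for c in columns]
    stds = [pstdev(c) for c in columns]  # population standard deviation
    return [
        # A constant column has zero deviation; map it to 0.0 instead of dividing by zero.
        [(x - mu) / sd if sd > 0 else 0.0 for x, mu, sd in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical samples: [CPU %, total memory in GB]; the second column is constant.
samples = [[40.0, 100.0], [50.0, 100.0], [60.0, 100.0]]
scaled = zscore_columns(samples)
```

After scaling, the first column is centred on zero, so metrics with very different units become comparable.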
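The method of this embodiment further comprises step S200: construct an LSTM model, input the preprocessed historical dataset into the LSTM model for model training and model tuning, and use the LSTM model to predict the cloud platform's business traffic over a preset future period.
The method of this embodiment uses a long short-term memory (LSTM) model to predict future business traffic and the expected resource amount. LSTM is a recurrent neural network algorithm that can fully exploit both short-term and long-term temporal dependencies and contextual information in a time series for prediction.
Specifically, when the LSTM model is constructed, the collected historical dataset is preprocessed and then split, indexed by timestamp, into a training set D_train and a test set D_test, which are used for subsequent model training and model tuning respectively.
The LSTM network is then built. It consists of an input layer, hidden layers, and an output layer; the input layer takes the business traffic data over a period of time, and its number of neurons matches the dimension of the input data.
In detail, the LSTM network mainly consists of an input layer, hidden layers, and an output layer. The input layer takes the business traffic data over a period of time, and its number of neurons matches the dimension of the input data. The number of hidden layers and the number of neurons per layer must be configured manually, generally according to the volume of business traffic data and the training results. The guiding principle is that more layers and more neurons make the model more complex and training longer; depending on the actual situation, the number of layers and the number of neurons can be set to 2 and 50 respectively. The output layer is a fully connected layer whose number of neurons matches the feature dimension, and it produces a set of predicted values of future business traffic.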
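Specifically, after the LSTM model is constructed, it is trained and used for prediction. When the LSTM model is trained, the Adam algorithm is chosen as the optimizer for gradient control, and the mean squared error (MSE) is chosen as the loss function for model evaluation, as shown in formula 3:
$MSE = \frac{1}{N}\sum_{i=1}^{N}\left(y_{actual,i} - y_{predict,i}\right)^2$ (3)
where y_actual is the actual value, y_predict is the predicted value, and N is the number of samples.
After training is complete, the model's weights w and b are saved to a configuration file for use in prediction.
With reference to Fig. 2, multiple LSTM networks are built and trained by the same process, and the mean of their predictions is taken as the final prediction of the LSTM model. Specifically, to reduce randomness, 4 LSTM networks are built and trained by the same process, and the mean of the 4 networks' predictions is taken as the model's final prediction.
At prediction time, the parameters w and b are loaded from the configuration file, the m business traffic records from the period before the current time t are input into the model, and the resulting Predict is the predicted business traffic for the coming period.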
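The averaging of several networks' outputs and the MSE of formula 3 can be sketched without any deep learning library. The network outputs below are made-up numbers standing in for the 4 trained LSTM networks, and the function names `ensemble_mean` and `mse` are hypothetical.

```python
from typing import List, Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error between actual and predicted values (formula 3)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def ensemble_mean(predictions: Sequence[Sequence[float]]) -> List[float]:
    """Average, feature by feature, the predictions of several networks."""
    return [sum(vals) / len(vals) for vals in zip(*predictions)]


# Hypothetical outputs of 4 separately trained networks for [CPU %, memory %].
network_outputs = [[70.0, 60.0], [74.0, 62.0], [72.0, 64.0], [76.0, 62.0]]
predict_next = ensemble_mean(network_outputs)

# Compare the averaged prediction with a hypothetical observed value.
error = mse([72.0, 63.0], predict_next)
```

Averaging smooths out differences that come only from random initialization, which is the stated reason for training 4 networks.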
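After the LSTM algorithm has predicted the future traffic, it must be determined whether the predicted business traffic calls for elastic expansion or elastic contraction. Traditional elastic scaling makes this determination only by comparing CPU and memory utilization against rated thresholds, and that strategy alone cannot cope with diverse and complex business scenarios. This embodiment therefore makes elastic scaling decisions with a classification algorithm based on the support vector machine (SVM).
Specifically, when the SVM classification algorithm is established, its training dataset consists of the business traffic prediction Predict_i output by the LSTM model for each time and the scaling policy Label_SVM actually adopted at that time. The SVM loss function is given by formula 4 and the optimization objective by formula 5,
where y_i is the actual business traffic value at time i, w x_i − b is the predicted business traffic value at time i, ε is the tolerated error between the model's predicted value and the true value, and w is the model weight.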
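Formulas 4 and 5 themselves are not reproduced in this text. One form that is consistent with the symbols defined above is the standard ε-insensitive loss with a regularized objective, sketched below. This is an assumption rather than the patent's confirmed formulas; the penalty factor C is not among the symbols defined here, although a penalty factor is tuned in the worked example.

```latex
% Assumed form of formula 4: epsilon-insensitive loss
L_\varepsilon(y_i, x_i) = \max\bigl(0,\ \lvert y_i - (w x_i - b) \rvert - \varepsilon\bigr)

% Assumed form of formula 5: regularized objective with penalty factor C
\min_{w,\,b}\ \frac{1}{2}\lVert w \rVert^2 + C \sum_{i} L_\varepsilon(y_i, x_i)
```

Under this form, errors smaller than ε cost nothing, and C trades off a small weight norm against the total error beyond ε.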
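Through LSTM-based multi-dimensional resource prediction, this embodiment takes both resource performance and business performance into account and uses LSTM to predict future business traffic, which helps improve the accuracy of traffic prediction and the sensitivity to changes in business traffic.
The method of this embodiment further comprises step S300: after the LSTM model has predicted the future business traffic, make an elastic expansion or elastic contraction decision on the predicted business traffic based on the SVM classification algorithm.
With reference to Fig. 3, making the elastic expansion or elastic contraction decision on the predicted business traffic based on the SVM classification algorithm specifically comprises:
when the decision is elastic expansion, performing elastic expansion after the current time t: if elastic expansion was performed within the rated time T before the current time t, this expansion is cancelled; otherwise a hybrid elastic expansion plan is generated and executed after the current time t;
when the decision is elastic contraction, performing elastic contraction after the current time t: if elastic contraction was performed within the rated time T before the current time t, this contraction is cancelled; otherwise a hybrid elastic contraction plan is generated and executed after the current time t.
In detail, if the prediction is elastic expansion, elastic expansion after the current time t is triggered. If elastic expansion has already been performed within the rated time T before the current time t, this expansion is cancelled; otherwise a hybrid elastic expansion plan is generated and executed after the current time t. If the algorithm's prediction is elastic contraction, elastic contraction after the current time t is triggered. If elastic contraction has already been performed within the rated time T before the current time t, this contraction is cancelled; otherwise a hybrid elastic contraction plan is generated and executed after the current time t.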
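The rated-time check described above can be sketched as a small guard function. The names `should_generate_plan`, `EXPAND`, `CONTRACT`, and `NONE`, and the representation of past actions as (timestamp, action) pairs, are assumptions made for illustration.

```python
from typing import List, Tuple

EXPAND, CONTRACT, NONE = "expand", "contract", "none"


def should_generate_plan(
    decision: str,
    now: float,
    history: List[Tuple[float, str]],
    rated_time: float,
) -> bool:
    """Return True if a scaling plan should be generated for this decision.

    history holds (timestamp, action) pairs of scaling actions already
    executed. A decision is cancelled when the same kind of action was
    executed within rated_time before now.
    """
    if decision == NONE:
        return False
    return not any(
        action == decision and 0 <= now - ts <= rated_time
        for ts, action in history
    )


# Timestamps in minutes; a rated time T of 10 minutes, as in the worked example.
executed = [(100.0, EXPAND)]
blocked = should_generate_plan(EXPAND, 105.0, executed, rated_time=10.0)
allowed = should_generate_plan(EXPAND, 111.0, executed, rated_time=10.0)
```

Here an expansion requested 5 minutes after the previous one is cancelled, while one requested 11 minutes later proceeds to plan generation.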
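Specifically, when no elastic expansion has been performed and the hybrid elastic expansion plan is generated, the plan is based on user cost and the expected resource amount, wherein
an expansion plan that meets business needs at the lowest user cost is generated from the user's provisioning cost, the expected resource amount after expansion, and the resources currently sellable in the resource pool, and an expansion model is established as shown in formula 6,
where P_i is the price of sellable specification i, N_i is the number of instances of sellable specification i to provision, the per-specification terms are the CPU core count and memory size of sellable specification i, and CPU_epx and MEM_epx are the expected total CPU cores and total memory size respectively. This is a combinatorial optimization problem and can be solved with a simulated annealing algorithm.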
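The optimization can be sketched under an assumed reading of the expansion model: minimize the total price Σ P_i·N_i subject to the plan's total CPU cores and total memory reaching the expected values, with each N_i bounded by the sellable stock. The text names simulated annealing as the solver; the sketch below deliberately swaps in exhaustive search, which is practical only for a small catalog but keeps the result deterministic. The `Spec` fields, prices, and stock numbers are hypothetical.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Spec:
    name: str
    price: float     # P_i, price of one instance
    cpu_cores: int   # CPU cores of one instance
    mem_gb: int      # memory size of one instance
    available: int   # sellable stock in the resource pool


def cheapest_plan(
    specs: List[Spec], cpu_expected: int, mem_expected: int
) -> Optional[Dict[str, int]]:
    """Exhaustively search instance counts N_i for the lowest-cost feasible plan."""
    best_cost, best_plan = None, None
    for counts in product(*(range(s.available + 1) for s in specs)):
        cpu = sum(n * s.cpu_cores for n, s in zip(counts, specs))
        mem = sum(n * s.mem_gb for n, s in zip(counts, specs))
        if cpu < cpu_expected or mem < mem_expected:
            continue  # plan does not reach the expected resources
        cost = sum(n * s.price for n, s in zip(counts, specs))
        if best_cost is None or cost < best_cost:
            best_cost = cost
            best_plan = {s.name: n for n, s in zip(counts, specs) if n > 0}
    return best_plan  # None when no combination is feasible


catalog = [
    Spec("spec1", price=1.0, cpu_cores=2, mem_gb=4, available=4),
    Spec("spec2", price=1.8, cpu_cores=4, mem_gb=8, available=4),
    Spec("spec3", price=3.2, cpu_cores=8, mem_gb=16, available=2),
]
plan = cheapest_plan(catalog, cpu_expected=22, mem_expected=44)
```

For larger catalogs the search space grows multiplicatively with the stock of each specification, which is where a heuristic such as simulated annealing becomes attractive.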
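Specifically, after the expansion plan that meets business needs at the lowest user cost has been generated from the user's provisioning cost, the expected resource amount after expansion, and the resources currently sellable in the resource pool, the method comprises:
the expansion plan contains the instance specifications and quantities to be moved into the scaling group, with specifications sorted in descending order of CPU cores and memory size;
traversing the specifications in the expansion plan in descending order and comparing each with the specifications of the instances already in the scaling group:
when the scaling group contains no instance of a specification in the expansion plan, moving instances of that specification in by vertical scaling;
when the scaling group contains instances of a specification in the expansion plan but too few of them, moving in the missing instances of that specification by horizontal scaling;
after all specifications in the expansion plan have been traversed, if an instance originally in the scaling group has a specification not in the expansion plan, performing vertical scaling so that the total CPU cores and total memory size of the scaling group meet the expected values.
In detail, the expansion plan generated from formula 6 contains the instance specifications and quantities to be moved into the scaling group, sorted in descending order of CPU cores and memory size. The specifications in the plan are traversed in descending order and compared one by one with the specifications of the instances already in the scaling group. If the scaling group contains no instance of a specification in the plan, instances of that specification are moved in by horizontal scaling; if the scaling group contains that specification but too few instances of it, the missing instances are moved in by horizontal scaling; if, after the traversal, an instance originally in the scaling group has a specification not in the plan, vertical scaling is performed so that the total CPU cores and total memory size of the scaling group meet the expected values. In addition to expanding computing capacity, this embodiment also considers scaling storage and network capacity to some extent: in an elastic expansion, expanding the number and specifications of instances in the scaling group also expands storage disks and network capacity.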
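The traversal can be sketched as a comparison of two specification counts. This sketch follows the detailed explanation and the worked example, in which a specification missing from the group is added horizontally; the function name `expansion_actions` and the dictionary representation are assumptions.

```python
from typing import Dict, List, Tuple


def expansion_actions(
    group: Dict[str, int], plan: Dict[str, int], size_order: List[str]
) -> Tuple[List[Tuple[str, int]], List[str]]:
    """Compare an expansion plan with the current scaling group.

    size_order lists specification names from largest to smallest.
    Returns (horizontal, vertical): instances to add per specification,
    and specifications present in the group but absent from the plan,
    whose instances are candidates for vertical scaling.
    """
    horizontal = []
    for spec in size_order:  # descending traversal of the plan
        shortfall = plan.get(spec, 0) - group.get(spec, 0)
        if shortfall > 0:  # specification absent, or present but too few
            horizontal.append((spec, shortfall))
    vertical = [
        spec for spec in size_order
        if group.get(spec, 0) > 0 and plan.get(spec, 0) == 0
    ]
    return horizontal, vertical


# Numbers from the worked example: the group holds 2 x spec2 and 1 x spec1.
group = {"spec2": 2, "spec1": 1}
plan = {"spec3": 1, "spec2": 3, "spec1": 1}
horizontal, vertical = expansion_actions(group, plan, ["spec3", "spec2", "spec1"])
```

With the worked example's numbers, the result is one specification 3 instance and one specification 2 instance added horizontally, and nothing to rescale vertically.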
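Specifically, when no elastic contraction has been performed and the hybrid elastic contraction plan is generated, the plan is based on the least-load-first principle and the expected resource amount, wherein
all instances in the scaling group are sorted in ascending order of CPU utilization and memory utilization, and the first k instances undergo hybrid elastic contraction, some by vertical scaling and some by horizontal scaling, so that the total CPU cores and total memory size of the scaling group meet the expected values.
Specifically, when the first k instances undergo hybrid elastic contraction, among the k least-loaded instances that satisfy the expected resource amount, the small-specification low-load instances are downgraded by vertical scaling and the large-specification low-load instances are removed by horizontal scaling.
In detail, the hybrid elastic contraction plan is generated based on the least-load-first principle and the expected resource amount. All instances in the scaling group are sorted in ascending order of CPU utilization and memory utilization, and the first k instances undergo hybrid elastic contraction: some are scaled vertically, that is, their specification is downgraded, and some are scaled horizontally, that is, they are removed from the scaling group, so that the total CPU cores and total memory size of the scaling group meet the expected values. The generated contraction plan also contracts storage and network capacity to some extent while contracting computing capacity appropriately. Once the plan is generated, the hybrid elastic scaling plan is executed after the current time t. In this way the scaling activity of the scaling group is carried out more precisely, resources are allocated accurately and reasonably, and overall resource utilization is improved.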
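The least-load-first selection can be sketched as a sort followed by a split on instance size. The text does not say how "small" and "large" specifications are told apart or how the two utilization metrics are combined for sorting; the `large_spec_cores` threshold and the (CPU, memory) sort key below are assumptions, as are the `Instance` fields and sample values.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Instance:
    name: str
    cpu_cores: int
    cpu_util: float  # percent
    mem_util: float  # percent


def contraction_actions(
    instances: List[Instance], k: int, large_spec_cores: int
) -> Tuple[List[str], List[str]]:
    """Pick the k least-loaded instances and split them by size.

    Instances with at least large_spec_cores cores are removed from the
    group (horizontal scaling); smaller ones are downgraded (vertical).
    """
    ranked = sorted(instances, key=lambda i: (i.cpu_util, i.mem_util))
    chosen = ranked[:k]
    remove = [i.name for i in chosen if i.cpu_cores >= large_spec_cores]
    downgrade = [i.name for i in chosen if i.cpu_cores < large_spec_cores]
    return remove, downgrade


# Hypothetical scaling group.
fleet = [
    Instance("a", cpu_cores=8, cpu_util=12.0, mem_util=20.0),
    Instance("b", cpu_cores=4, cpu_util=10.0, mem_util=15.0),
    Instance("c", cpu_cores=4, cpu_util=55.0, mem_util=60.0),
    Instance("d", cpu_cores=2, cpu_util=11.0, mem_util=30.0),
]
remove, downgrade = contraction_actions(fleet, k=3, large_spec_cores=8)
```

Instance "c" is busy and is left untouched; of the three idle instances, the large one is removed and the two small ones are downgraded.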
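It can be understood that, through an elastic scaling decision method based on SVM and multi-dimensional resource prediction, the above embodiment uses machine learning algorithms to analyze the application scenario comprehensively along multiple resource and business dimensions and determine whether elastic scaling is needed, which improves the decision-making ability of elastic scaling in complex business scenarios. The method of generating a hybrid elastic scaling plan based on user cost and the expected resource amount takes into account both user cost and a reasonable allocation of the currently sellable resources, and scales the scaling group both horizontally and vertically. This allows flexible configuration that meets business needs while improving overall resource utilization.
Referring to Fig. 4, which shows the system architecture required to carry out this embodiment: it mainly comprises the underlying database and configuration files, the core components (the monitoring service, the business traffic prediction service, and the hybrid scaling service), and the business scaling group that is the object of elastic scaling.
In a specific example based on the above embodiment, the system architecture performs hybrid elastic scaling in the following steps: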
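Step 1: The business scaling group contains 2 instances of specification 2 and 1 instance of specification 1. The monitoring service periodically collects the group's business traffic metrics, including the scaling group name, timestamp, CPU utilization, memory utilization, GPU utilization, disk I/O utilization, maximum network throughput, user request rate, request response rate, request success rate, and concurrency, as well as the total CPU cores and total memory size of the scaling group at the current timestamp. The collected data is persisted to the database.
Step 2: The business traffic prediction service imports a period (for example, 30 days) of continuously collected data from the database. It first preprocesses the data, including data cleaning and data standardization. It then builds 4 LSTM networks, each with 2 hidden layers and 50 neurons, and trains them with the Adam optimizer, continuously tuning the model parameters during training. After training, the LSTM model parameters are saved to the configuration file.
Step 3: The historical business traffic data is input into the LSTM model to obtain the corresponding predictions, which, together with the scaling labels for the corresponding times, serve as the input for training the SVM. Hyperparameters such as the penalty factor and the kernel function are tuned continuously during training; here the SVM penalty factor is 0.125 and the kernel function is the radial basis function (RBF). After training, the SVM model parameters are saved to the configuration file.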
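For readers unfamiliar with the RBF kernel named in step 3, its usual definition, K(x, z) = exp(−γ‖x − z‖²), can be sketched as below. The kernel width `gamma` is not given in the text; the value 0.5 is an assumption chosen only for illustration, and the function name `rbf_kernel` is hypothetical.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], z: Sequence[float], gamma: float) -> float:
    """Radial basis function kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


# Identical points have similarity 1; distant points approach 0.
same = rbf_kernel([0.5, -1.0], [0.5, -1.0], gamma=0.5)
far = rbf_kernel([0.0, 0.0], [3.0, 4.0], gamma=0.5)
```

The kernel lets the SVM separate traffic predictions that are not linearly separable, while the penalty factor (0.125 in step 3) controls how strongly misclassified training points are penalized.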
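Step 4: After the trained models are deployed to the environment, the monitoring service collects the business traffic at time t1. This moment falls just before a business peak, and the main traffic characteristics are "high CPU and memory utilization, with rising concurrency and disk utilization". The business traffic at this moment is input into the LSTM model, which outputs the prediction Predict_{t+1} for the next moment; Predict_{t+1} is then input into the SVM model, and the final elastic scaling decision is "elastic expansion".
Step 5: After the hybrid elastic scaling service receives the "elastic expansion" instruction, it checks whether elastic scaling was performed within the 10 minutes before time t1. If so, this expansion is skipped to avoid frequent scaling; otherwise, generation of an elastic expansion plan is triggered. The generated plan contains 1 instance of specification 3, 3 instances of specification 2, and 1 instance of specification 1. The final execution is: 1 instance of specification 3 is added to the scaling group by horizontal scaling, and 1 instance of specification 2 is added to the scaling group by horizontal scaling.
Step 6: The monitoring service collects the business traffic at time t2, whose characteristics are "the CPU and memory utilization of the scaling group is stable and the concurrency is unchanged, but the request response rate decreases". After passing through the LSTM model and the SVM model, the prediction is that no scaling is required.
Step 7: The monitoring service collects the business traffic at time t3, whose characteristics are "low CPU and memory utilization of the scaling group, and decreased concurrency". After passing through the LSTM model and the SVM model, the generated instruction is "elastic contraction", and the generated plan is: 1 instance of specification 3 and 2 instances of specification 1. The final execution is: 2 instances of specification 2 are removed from the scaling group by horizontal scaling, and the remaining instance of specification 2 is downgraded to specification 1 by vertical scaling.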
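The above is the implementation process of the hybrid elastic scaling method based on multi-dimensional resource prediction. When the model is first built and trained, the relevant parameters must be configured and the model tuned manually. During training, a larger volume of historical data generally lets the model better uncover the traffic patterns and characteristics of the specific business. Before the model makes predictions, only the metric thresholds, rated time, and rated count used in plan generation need to be configured.
It can be understood that the above embodiments make the hybrid elastic scaling process of a cloud platform's business systems intelligent and automated, improve the cloud platform's sensitivity in monitoring changes in business system traffic, reduce the business experience required of cloud platform engineers, and help improve the business system's utilization of cloud resources and reduce business operating costs.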
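Those skilled in the art should understand that the embodiments of the present application may be provided as methods, systems, or computer program products. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each process and/or box in the flowcharts and/or block diagrams, and combinations of processes and/or boxes in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device so that a series of operational steps are executed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes in the flowchart and/or one or more boxes in the block diagram.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention and not to limit it. Although the present invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the specific implementations of the present invention may still be modified or equivalently replaced, and any modification or equivalent replacement that does not depart from the spirit and scope of the present invention shall fall within the protection scope of the claims of the present invention.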
Claims (10)
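- 1. A hybrid elastic scaling method based on multi-dimensional resource prediction, characterized by comprising: collecting business traffic data for multi-dimensional metrics of a cloud platform, constructing a historical dataset from the collected business traffic data, and preprocessing the historical dataset; constructing an LSTM model, inputting the preprocessed historical dataset into the LSTM model for model training and model tuning, and using the LSTM model to predict the cloud platform's business traffic over a preset future period; and, after the LSTM model has predicted the future business traffic, making an elastic expansion or elastic contraction decision on the predicted business traffic based on an SVM classification algorithm.
- 2. The hybrid elastic scaling method based on multi-dimensional resource prediction according to claim 1, characterized in that constructing the historical dataset from the collected business traffic data comprises: collecting historical data in real time through the cloud platform monitoring service and persisting it to a database after collection, where a single business traffic record consists of multiple features, defined as [scaling group name, timestamp, CPU utilization, memory utilization, GPU utilization, disk I/O utilization, maximum network throughput, user request rate, request response rate, request success rate, concurrency, total CPU cores, total memory size] and denoted X = [x_1 … x_n]^T, where x_1 is the scaling group name, ..., x_n is the total memory size, and n is the feature dimension; and taking the m records collected before the current time as the raw dataset of m rows by n columns, where the label for each time t is the metric value at the next time t+1.
- 3. The hybrid elastic scaling method based on multi-dimensional resource prediction according to claim 2, characterized in that preprocessing the historical dataset comprises: performing data cleaning and data standardization on the historical dataset, wherein missing values of the features in the business traffic data X are filled according to formula 1:
$x_{t,j} = \frac{x_{t-1,j} + x_{t+1,j}}{2}$ (1)
where x_{t,j} is the fill value for the missing j-th feature at time t, taken as the mean of the j-th feature at the previous time and the next time; and the features in the business traffic data X are standardized according to formula 2:
$x'_{i,j} = \frac{x_{i,j} - \mu_j}{\sigma_j}$ (2)
where x_{i,j} is the raw value, x'_{i,j} is the value of x_{i,j} after standardization, μ_j is the mean of feature j in dataset X, and σ_j is the standard deviation of feature j in dataset X.
- 4. The hybrid elastic scaling method based on multi-dimensional resource prediction according to claim 1, characterized in that constructing the LSTM model comprises: after preprocessing the collected historical dataset, splitting it, indexed by timestamp, into a training set D_train and a test set D_test; and building the LSTM network, which consists of an input layer, hidden layers, and an output layer, where the input layer takes the business traffic data over a period of time and its number of neurons matches the dimension of the input data.
- 5. The hybrid elastic scaling method based on multi-dimensional resource prediction according to claim 1, characterized in that, after the LSTM model is constructed, the method comprises: training the LSTM model and predicting with it, wherein, when the LSTM model is trained, the Adam algorithm is chosen as the optimizer for gradient control and the mean squared error (MSE) is chosen as the loss function for model evaluation, as shown in formula 3:
$MSE = \frac{1}{N}\sum_{i=1}^{N}\left(y_{actual,i} - y_{predict,i}\right)^2$ (3)
where y_actual is the actual value, y_predict is the predicted value, and N is the number of samples; after training is complete, the model's weights w and b are saved to a configuration file for use in prediction; and multiple LSTM networks are built and trained by the same process, and the mean of their predictions is taken as the final prediction of the LSTM model.
- 6. The hybrid elastic scaling method based on multi-dimensional resource prediction according to claim 1, characterized in that establishing the SVM classification algorithm comprises: the training dataset of the SVM classification algorithm consists of the business traffic prediction Predict_i output by the LSTM model for each time and the scaling policy Label_SVM actually adopted at that time; the SVM loss function is given by formula 4 and the optimization objective by formula 5, where y_i is the actual business traffic value at time i, w x_i − b is the predicted business traffic value at time i, ε is the tolerated error between the model's predicted value and the true value, and w is the model weight.
- 7. The hybrid elastic scaling method based on multi-dimensional resource prediction according to any one of claims 1 to 6, characterized in that making the elastic expansion or elastic contraction decision on the predicted business traffic based on the SVM classification algorithm comprises: when the decision is elastic expansion, performing elastic expansion after the current time t, wherein if elastic expansion was performed within the rated time T before the current time t, this expansion is cancelled, and otherwise a hybrid elastic expansion plan is generated and executed after the current time t; and when the decision is elastic contraction, performing elastic contraction after the current time t, wherein if elastic contraction was performed within the rated time T before the current time t, this contraction is cancelled, and otherwise a hybrid elastic contraction plan is generated and executed after the current time t.
- 8. The hybrid elastic scaling method based on multi-dimensional resource prediction according to claim 7, characterized in that generating the hybrid elastic expansion plan when no elastic expansion has been performed comprises: generating the hybrid elastic expansion plan based on user cost and the expected resource amount, wherein an expansion plan that meets business needs at the lowest user cost is generated from the user's provisioning cost, the expected resource amount after expansion, and the resources currently sellable in the resource pool, and an expansion model is established as shown in formula 6, where P_i is the price of sellable specification i, N_i is the number of instances of sellable specification i to provision, the per-specification terms are the CPU core count and memory size of sellable specification i, and CPU_epx and MEM_epx are the expected total CPU cores and total memory size respectively.
- 9. The hybrid elastic scaling method based on multi-dimensional resource prediction according to claim 8, characterized in that, after the expansion plan that meets business needs at the lowest user cost has been generated from the user's provisioning cost, the expected resource amount after expansion, and the resources currently sellable in the resource pool, the method comprises: the expansion plan contains the instance specifications and quantities to be moved into the scaling group, with specifications sorted in descending order of CPU cores and memory size; traversing the specifications in the expansion plan in descending order and comparing each with the specifications of the instances already in the scaling group: when the scaling group contains no instance of a specification in the expansion plan, moving instances of that specification in by vertical scaling; when the scaling group contains instances of a specification in the expansion plan but too few of them, moving in the missing instances of that specification by horizontal scaling; and, after all specifications in the expansion plan have been traversed, if an instance originally in the scaling group has a specification not in the expansion plan, performing vertical scaling so that the total CPU cores and total memory size of the scaling group meet the expected values.
- 10. The hybrid elastic scaling method based on multi-dimensional resource prediction according to claim 7, characterized in that generating the hybrid elastic contraction plan when no elastic contraction has been performed comprises: generating the hybrid elastic contraction plan based on the least-load-first principle and the expected resource amount, wherein all instances in the scaling group are sorted in ascending order of CPU utilization and memory utilization, and the first k instances undergo hybrid elastic contraction, some by vertical scaling and some by horizontal scaling, so that the total CPU cores and total memory size of the scaling group meet the expected values.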
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311710172.3 | 2023-12-13 | ||
CN202311710172.3A CN117827434A (zh) | 2023-12-13 | 2023-12-13 | Hybrid elastic scaling method based on multi-dimensional resource prediction |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024212614A1 true WO2024212614A1 (zh) | 2024-10-17 |
Family
ID=90523534
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2023/143167 WO2024212614A1 (zh) | 2023-12-13 | 2023-12-29 | Hybrid elastic scaling method based on multi-dimensional resource prediction |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN117827434A (zh) |
WO (1) | WO2024212614A1 (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118069381B (zh) * | 2024-04-25 | 2024-07-26 | 江西锦路科技开发有限公司 | Container cloud hybrid elastic scaling method based on resource demand prediction |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111932015A (zh) * | 2020-08-12 | 2020-11-13 | 武汉中电节能有限公司 | Method and system for predicting cooling and heating loads in district cooling and heating |
CN113886454A (zh) * | 2021-08-13 | 2022-01-04 | 浙江理工大学 | Cloud resource prediction method based on LSTM-RBF |
WO2023272726A1 (zh) * | 2021-07-02 | 2023-01-05 | 深圳先进技术研究院 | Cloud server cluster load scheduling method, system, terminal, and storage medium |
US11550635B1 (en) * | 2019-03-28 | 2023-01-10 | Amazon Technologies, Inc. | Using delayed autocorrelation to improve the predictive scaling of computing resources |
CN117056021A (zh) * | 2023-08-15 | 2023-11-14 | 浙江大学 | Dynamic-interval elastic scaling method and system based on long time series prediction |
-
2023
- 2023-12-13 CN CN202311710172.3A patent/CN117827434A/zh active Pending
- 2023-12-29 WO PCT/CN2023/143167 patent/WO2024212614A1/zh unknown
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11550635B1 (en) * | 2019-03-28 | 2023-01-10 | Amazon Technologies, Inc. | Using delayed autocorrelation to improve the predictive scaling of computing resources |
CN111932015A (zh) * | 2020-08-12 | 2020-11-13 | 武汉中电节能有限公司 | Method and system for predicting cooling and heating loads in district cooling and heating |
WO2023272726A1 (zh) * | 2021-07-02 | 2023-01-05 | 深圳先进技术研究院 | Cloud server cluster load scheduling method, system, terminal, and storage medium |
CN113886454A (zh) * | 2021-08-13 | 2022-01-04 | 浙江理工大学 | Cloud resource prediction method based on LSTM-RBF |
CN117056021A (zh) * | 2023-08-15 | 2023-11-14 | 浙江大学 | Dynamic-interval elastic scaling method and system based on long time series prediction |
Also Published As
Publication number | Publication date |
---|---|
CN117827434A (zh) | 2024-04-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
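CN110389820B (zh) | Private cloud task scheduling method with resource prediction based on the v-TGRU model | |
CN110096349A (zh) | Job scheduling method based on cluster node load state prediction | |
CN105046327B (zh) | Smart grid information system and method based on machine learning technology | |
CN111026549B (zh) | Automated test resource scheduling method for power information and communication equipment | |
WO2023272726A1 (zh) | Cloud server cluster load scheduling method, system, terminal, and storage medium | |
CN106600058A (zh) | Combined prediction method for manufacturing cloud service QoS | |
WO2024212614A1 (zh) | Hybrid elastic scaling method based on multi-dimensional resource prediction | |
CN113902116A (zh) | Batch processing optimization method and system for deep learning model inference | |
CN117764631A (zh) | Data governance optimization method and system based on source-side static data modeling | |
Dogani et al. | K-agrued: A container autoscaling technique for cloud-based web applications in kubernetes using attention-based gru encoder-decoder | |
Shu et al. | Resource demand prediction of cloud workloads using an attention-based GRU model | |
CN115456223B (zh) | Full-lifecycle-based echelon recycling management method and system for lithium batteries | |
Chen et al. | A nonlinear scheduling rule incorporating fuzzy-neural remaining cycle time estimator for scheduling a semiconductor manufacturing factory—a simulation study | |
CN104217296A (zh) | Comprehensive performance evaluation method for listed companies | |
Yang et al. | Design of kubernetes scheduling strategy based on LSTM and grey model | |
CN114611903A (zh) | Dynamic data transmission configuration method and system based on information entropy weighting | |
Yang et al. | Trust-based scheduling strategy for cloud workflow applications | |
Zhang et al. | A data stream prediction strategy for elastic stream computing systems | |
CN116266128A (zh) | Method and system for ecological platform resource scheduling | |
KR20160044623A (ko) | Load balancing method for Linux virtual server | |
Du et al. | OctopusKing: A TCT-aware task scheduling on spark platform | |
CN108241533A (zh) | Resource pool future load generation method based on prediction and stratified sampling | |
CN114401496A (zh) | Fast video information processing method based on 5G edge computing | |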
Wu et al. | Application of Improved Feature Pre-processing Method in Prevention and Control of Electricity Charge Risk | |
Chen et al. | En-beats: A novel ensemble learning-based method for multiple resource predictions in cloud |