Jntuk 4-1 Cse R20 CC Unit-Iv
A policy typically refers to the principles guiding decisions, whereas mechanisms represent the
means to implement policies. The separation of policies from mechanisms is a guiding principle in
computer science.
Cloud resource management policies can be loosely grouped into five classes:
1. Admission control.
2. Capacity allocation.
3. Load balancing.
4. Energy optimization.
5. Quality-of-service (QoS) guarantees.
The explicit goal of an admission control policy is to prevent the system from accepting
workloads in violation of high-level system policies; for example, a system may not accept an
additional workload that would prevent it from completing work already in progress or
contracted.
Limiting the workload requires some knowledge of the global state of the system; in a dynamic
system such knowledge, when available, is at best obsolete. Capacity allocation means allocating
resources to individual instances, where an instance is an activation of a service. Locating
resources subject to multiple global optimization constraints requires searching a very large
space when the state of individual systems changes rapidly.
Load balancing and energy optimization can be done locally, but global load-balancing and
energy optimization policies encounter the same difficulties as the ones we have already
discussed.
Load balancing and energy optimization are correlated and affect the cost of providing the
services. Indeed, it was predicted that by 2012 up to 40% of the budget for IT enterprise
infrastructure would be spent on energy.
Dr .K. Sreenu. 4/4 CSE CLOUD COMPUTING (R204105A) – UNIT4
The common meaning of the term load balancing is that of evenly distributing the load to a set of
servers. For example, consider the case of four identical servers, A, B, C, and D, whose relative
loads are 80%, 60%, 40%, and 20%, respectively, of their capacity. As a result of perfect load
balancing, all servers would end with the same load: 50% of each server's capacity.
In cloud computing a critical goal is minimizing the cost of providing the service and, in
particular, minimizing the energy consumption. This leads to a different meaning of the term
load balancing: instead of having the load evenly distributed among all servers, we want to
concentrate it and use the smallest number of servers while switching the others to standby
mode, a state in which a server uses less energy.
In our example, the load from D will migrate to A and the load from C will migrate to B; thus, A
and B will be loaded at full capacity, whereas C and D will be switched to standby mode.
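The consolidation described above can be sketched as a small bin-packing step; the helper below is illustrative, not an algorithm from the text:

```python
# A minimal sketch (not from the text) of load consolidation: pack the
# relative loads of the four servers onto as few servers as possible.
def consolidate(loads, capacity=100):
    """Greedily pack loads (percent of capacity) onto the fewest servers."""
    active = []  # running total of load on each active server
    for name, load in sorted(loads.items(), key=lambda kv: -kv[1]):
        for i, used in enumerate(active):
            if used + load <= capacity:
                active[i] += load  # migrate onto an already-active server
                break
        else:
            active.append(load)    # no fit: keep this server active
    return active

loads = {"A": 80, "B": 60, "C": 40, "D": 20}
print(consolidate(loads))  # [100, 100]: two fully loaded servers
```

With the loads of the example, the four servers collapse onto two fully loaded ones, and the other two can be switched to standby.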
Quality of service is that aspect of resource management that is probably the most difficult to
address and, at the same time, possibly the most critical to the future of cloud computing.
Allocation techniques in computer clouds must be based on a disciplined approach rather than
ad hoc methods. The four basic mechanisms for the implementation of resource management
policies are:
1. Control theory: Control theory uses feedback to guarantee system stability and predict
transient behavior, but it can be used only to predict local rather than global behavior.
2. Machine learning: A major advantage of machine learning techniques is that they do not
need a performance model of the system. This technique could be applied to coordination
of several autonomic system managers.
3. Utility-based: Utility-based approaches require a performance model and a mechanism to
correlate user-level performance with cost.
4. Market-oriented/economic mechanisms: Such mechanisms do not require a model of the
system, e.g., combinatorial auctions for bundles of resources.
Classical feedback control methods are used in all these cases to regulate the key operating
parameters of the system based on measurements of the system output; the feedback control in
these methods assumes a linear time-invariant system model and a closed-loop controller.
This controller is based on an open-loop system transfer function that satisfies stability and
sensitivity constraints.
The technique allows multiple QoS objectives and operating constraints to be expressed as a cost
function and can be applied to stand-alone or distributed Web servers, database servers, high-
performance application servers, and even mobile/embedded systems.
A Model Capturing Both QoS and Energy Consumption for a Single-Server System:
Now we turn our attention to the case of a single processor serving a stream of input requests. To
compute the optimal inputs over a finite horizon, the controller in Figure 6.1 uses feedback
regarding the current state, as well as an estimate of the future disturbance due to the environment.
The control task is solved as a state-regulation problem updating the initial and final states of the
control horizon.
Allocation architecture based on control theory concepts for the entire cloud: The automatic
resource management is based on two levels of controllers, one for the service provider and one
for the application (see Figure 6.2).
The main components of a control system are the inputs, the control system components, and the
outputs. The inputs in such models are the offered workload and the policies for admission control,
capacity allocation, load balancing, energy optimization, and QoS guarantees in the cloud.
The system components are sensors used to estimate relevant measures of performance and
controllers that implement various policies; the output is the resource allocations to the individual
applications.
The controllers use the feedback provided by sensors to stabilize the system; stability is related
to the change of the output. If the change is too large, the system may become unstable.
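A minimal feedback loop along these lines can be sketched as follows; the controller form, gain, and target utilization are illustrative assumptions, not taken from the text:

```python
# Proportional feedback: a sensor reports utilization and the controller
# makes a small corrective change to the resource allocation. Keeping the
# gain (and hence the output change) small is what keeps the loop stable.
def control_step(allocation, measured_util, target_util=0.6, gain=0.5):
    """Return a new allocation based on the sensor reading."""
    error = measured_util - target_util      # feedback from the sensor
    return allocation * (1 + gain * error)   # small corrective change

alloc = 10.0
for util in [0.9, 0.8, 0.7, 0.65]:  # utilization falls as capacity is added
    alloc = control_step(alloc, util)
```

A large gain would make each output change large, which is exactly the instability the text warns about.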
In the first case, upper and lower bounds on performance trigger adaptation through resource
reallocation. Such policies are simple and intuitive but require setting per-application thresholds.
First, due to the very large number of servers and to the fact that the load changes rapidly
over time, the estimation of the current system load is likely to be inaccurate.
Second, the ratio of average to maximal resource requirements of individual users
specified in a service-level agreement is typically very high.
Thresholds: A threshold is the value of a parameter related to the state of a system that triggers a
change in the system behavior. Thresholds are used in control theory to keep critical parameters
of a system in a predefined range.
The threshold could be static, defined once and for all, or it could be dynamic. A dynamic
threshold could be based on an average of measurements carried out over a time interval, a so-
called integral control.
The dynamic threshold could also be a function of the values of multiple parameters at a given
time or a mix of the two. To maintain the system parameters in a given range, a high and a low
threshold are often defined.
The two thresholds determine different actions; for example, a high threshold could force the
system to limit its activities and a low threshold could encourage additional activities.
Proportional Thresholding: The questions examined include:
(1) Is it beneficial to have two types of controllers, application controllers that determine
whether additional resources are needed, and cloud controllers that arbitrate requests for
resources and allocate the physical resources?
(2) Is it feasible to consider fine control?
(3) Are dynamic thresholds based on time averages better than static ones?
(4) Is it better to have a high and a low threshold, or is it sufficient to define only a high
threshold?
The essence of proportional thresholding is captured by the following algorithm:
1. Compute the integral value of the high and the low thresholds as averages of the
maximum and, respectively, the minimum of the processor utilization over the process
history.
2. Request additional VMs when the average value of the CPU utilization over the current
time slice exceeds the high threshold.
3. Release a VM when the average value of the CPU utilization over the current time slice
falls below the low threshold.
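The three steps above can be sketched as follows; the data layout (one list of utilization samples per past time slice) is an assumption for illustration:

```python
# A minimal sketch of proportional thresholding: the high and low
# thresholds are integral values, i.e., averages over the utilization
# history, and VMs are requested/released when the utilization averaged
# over the current time slice crosses them.
def proportional_threshold_step(history, current_slice, vm_count):
    """history: list of utilization-sample lists, one per past slice.
    Returns the new VM count after one scheduling decision."""
    high = sum(max(h) for h in history) / len(history)  # average of maxima
    low  = sum(min(h) for h in history) / len(history)  # average of minima
    avg  = sum(current_slice) / len(current_slice)
    if avg > high:
        vm_count += 1          # step 2: request an additional VM
    elif avg < low and vm_count > 1:
        vm_count -= 1          # step 3: release a VM
    return vm_count
```

Because the thresholds track the observed history, they adapt as the workload changes, which is the point of making them dynamic.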
The conclusions reached based on experiments with three VMs are as follows:
(a) Dynamic thresholds perform better than static ones and
(b) Two thresholds are better than one.
Though conforming to our intuition, such results have to be justified by experiments in a realistic
environment. Moreover, convincing results cannot be based on empirical values for some of the
parameters required by integral control equations.
Virtually all modern processors support dynamic voltage scaling (DVS) as a mechanism for
energy saving. Indeed, the energy dissipation scales quadratically with the supply voltage.
Power management controls the CPU frequency and, thus, the rate of instruction execution. For
some compute-intensive workloads the performance decreases linearly with the CPU clock
frequency, whereas for others the effect of a lower clock frequency is less noticeable or nonexistent.
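The quadratic dependence can be illustrated with the standard dynamic-power relation P ≈ C·V²·f; the numbers below are made up:

```python
# Why DVS saves energy: dynamic CPU power scales roughly as C * V^2 * f,
# so lowering the supply voltage (and with it the clock frequency) cuts
# power superlinearly. All values here are illustrative only.
def dynamic_power(capacitance, voltage, frequency):
    return capacitance * voltage**2 * frequency

full = dynamic_power(1e-9, 1.2, 2.0e9)   # full speed
slow = dynamic_power(1e-9, 0.9, 1.5e9)   # frequency scaled down by 25%
print(slow / full)                        # ≈ 0.42: power drops by ~58%
```

A 25% frequency reduction, paired with the lower voltage it permits, more than halves the dynamic power in this toy calculation.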
The clock frequency of individual blades/servers is controlled by a power manager, typically
implemented in the firmware; it adjusts the clock frequency several times a second.
Three types of experiments were conducted: (i) with the power management turned off; (ii) when
the dependence of the power consumption and the response time on the powercap was determined
through a set of exhaustive experiments; and (iii) when the dependence of the powercap pκ on the
number of clients nc was derived via reinforcement-learning models.
The second type of experiment led to the conclusion that both the response time and the power
consumed are nonlinear functions of the powercap, pκ , and the number of clients, nc; more
specifically, the conclusions of these experiments are:
• At a low load the response time is well below the target of 1,000 msec.
• At medium and high loads the response time decreases rapidly when pκ increases from 80 to 110
watts.
• For a given value of the powercap, the consumed power increases rapidly as the load increases.
Resource bundling complicates traditional resource allocation models and has generated interest
in economic models and, in particular, auction algorithms.
Pricing and Allocation Algorithms. A pricing and allocation algorithm partitions the set of users
into two disjoint sets, winners and losers, denoted as W and L, respectively. The algorithm should:
1. Be computationally tractable. Traditional combinatorial auction algorithms such as Vickrey-
Clarke-Groves (VCG) fail this criterion because they are not computationally tractable.
2. Scale well. Given the scale of the system and the number of requests for service, scalability is a
necessary condition.
3. Be objective. Partitioning in winners and losers should only be based on the price πu of a user’s
bid. If the price exceeds the threshold, the user is a winner; otherwise the user is a loser.
4. Be fair. Make sure that the prices are uniform: all winners within a given resource pool pay the
same price.
5. Indicate clearly at the end of the auction the unit prices for each resource pool.
6. Indicate clearly to all participants the relationship between the supply and the demand in the
system.
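Properties 3-5 (objectivity, fairness, uniform prices) can be sketched as a simple threshold partition; the user names and prices are illustrative:

```python
# Objective partition into winners and losers: the split depends only on
# the price of each user's bid relative to a threshold price, and every
# winner pays the same uniform price.
def partition(bids, price):
    """bids: {user: bid_price}. Returns (winners, losers)."""
    winners = {u for u, b in bids.items() if b >= price}
    losers  = set(bids) - winners
    return winners, losers

w, l = partition({"u1": 12, "u2": 7, "u3": 15}, price=10)
# u1 and u3 win and both pay the uniform price 10; u2 loses
```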
(a) The first constraint states that a user either gets one of the bundles it has opted for or nothing;
no partial allocation is acceptable.
(b) The second constraint expresses the fact that the system awards only available resources; only
offered resources can be allocated.
(c) The third constraint is that the bid of the winners exceeds the final price.
(d) The fourth constraint states that the winners get the least expensive bundles in their indifference
set.
(e) The fifth constraint states that losers bid below the final price.
(f) The last constraint states that all prices are positive numbers.
The excess demand vector z(t) is computed. If all its components are negative, the auction stops;
negative components mean that the demand does not exceed the offer. If the demand is larger than
the offer, z(t) > 0, the auctioneer increases the price for items with a positive excess demand and
solicits bids at the new price.
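The price-update loop can be sketched as follows; the helper names and the linear demand function are assumptions for illustration:

```python
# Ascending clock auction step: while any item has positive excess
# demand z(t), raise its price and solicit bids at the new price.
def clock_auction(prices, demand_at, supply, step=1.0, max_rounds=100):
    """prices: {item: price}; demand_at(prices) -> {item: demand};
    supply: {item: offered quantity}."""
    for _ in range(max_rounds):
        demand = demand_at(prices)
        excess = {i: demand[i] - supply[i] for i in prices}  # z(t)
        if all(z <= 0 for z in excess.values()):
            break                    # demand no longer exceeds the offer
        for i, z in excess.items():
            if z > 0:
                prices[i] += step    # raise price, solicit new bids
    return prices

# Illustrative: demand falls linearly as the price rises.
final = clock_auction({"cpu": 1.0}, lambda p: {"cpu": 10 - p["cpu"]}, {"cpu": 5})
```

The loop terminates at the price where demand meets supply, which is the final uniform price paid by the winners.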
7. Scheduling algorithms for computing clouds: (*********)
Scheduling is a critical component of cloud resource management. Scheduling is responsible for
resource sharing/multiplexing at several levels.
A server can be shared among several virtual machines, each virtual machine could support
several applications, and each application may consist of multiple threads.
CPU scheduling supports the virtualization of a processor, the individual threads acting as virtual
processors; a communication link can be multiplexed among a number of virtual channels, one
for each flow.
A scheduling algorithm should be efficient, fair, and starvation-free. The objectives of a scheduler
for a batch system are to maximize the throughput and to minimize the turnaround time, the time
between a job's submission and its completion.
Schedulers for systems supporting a mix of tasks – some with hard real-time constraints, others
with soft, or no timing constraints – are often subject to contradictory requirements. Some
schedulers are preemptive, allowing a high-priority task to interrupt the execution of a lower-
priority one; others are non-preemptive.
Figure 6.7 identifies several broad classes of resource allocation requirements in the
space defined by these two dimensions: best-effort, soft requirements, and hard requirements.
Hard real-time systems are the most challenging because they require strict timing and precise
amounts of resources.
Round-robin, FCFS, shortest-job-first (SJF), and priority algorithms are among the most
common scheduling algorithms for best-effort applications. Each thread is given control of the
CPU for a definite period of time, called a time slice, in a circular fashion in the case of round-
robin scheduling.
The algorithm is fair and starvation-free. The threads are allowed to use the CPU in the order in
which they arrive in the case of the FCFS algorithms and in the order of their running time in the
case of SJF algorithms. Earliest deadline first (EDF) and rate monotonic algorithms (RMA) are
used for real-time applications.
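The FCFS and SJF orderings can be contrasted in a short sketch (the job tuples are an assumed layout, not from the text):

```python
# FCFS serves threads in the order they arrive; SJF serves them in the
# order of their (estimated) running time.
def fcfs(jobs):
    """jobs: list of (name, arrival, runtime); serve in arrival order."""
    return [name for name, arrival, _ in sorted(jobs, key=lambda j: j[1])]

def sjf(jobs):
    """Serve in order of running time, shortest first."""
    return [name for name, _, runtime in sorted(jobs, key=lambda j: j[2])]

jobs = [("a", 0, 9), ("b", 1, 2), ("c", 2, 5)]
print(fcfs(jobs))  # ['a', 'b', 'c']
print(sjf(jobs))   # ['b', 'c', 'a']
```

SJF minimizes average turnaround time here, but a stream of short jobs could starve the long job "a", which is why starvation-freedom is listed as a requirement.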
Interconnection networks allow cloud servers to communicate with one another and with users.
These networks consist of communication links of limited bandwidth and
switches/routers/gateways of limited capacity. When the load exceeds its capacity, a switch starts
dropping packets because it has limited input buffers for the switching fabric and for the outgoing
links, as well as limited CPU cycles.
A fair queuing algorithm proposed in the literature requires that separate queues, one per flow, be
maintained by a switch and that the queues be serviced in a round-robin manner. This algorithm guarantees
the fairness of buffer space management, but does not guarantee fairness of bandwidth allocation.
Indeed, a flow transporting large packets will benefit from a larger bandwidth (see Figure 6.8 ).
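The bandwidth unfairness can be seen with a toy calculation (the packet sizes are illustrative): serving one packet per flow per round gives the large-packet flow proportionally more bytes.

```python
# Per-flow round-robin serves one packet from each queue per round, so a
# flow sending larger packets receives more bytes, i.e., more bandwidth.
def round_robin_bytes(packet_sizes, rounds=10):
    """packet_sizes: {flow: bytes per packet}. Bytes sent per flow."""
    return {flow: size * rounds for flow, size in packet_sizes.items()}

sent = round_robin_bytes({"small": 100, "large": 1500})
# the large-packet flow gets 15x the bandwidth despite "fair" queuing
```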
9. Start time fair queuing: (******************)
A hierarchical CPU scheduler for multimedia operating systems was proposed in the literature. The basic
idea of the start-time fair queuing (SFQ) algorithm is to organize the consumers of the CPU
bandwidth in a tree structure; the root node is the processor and the leaves of this tree are the
threads of each application.
A scheduler acts at each level of the hierarchy. The fraction of the processor bandwidth, B,
allocated to the intermediate node i is proportional to the node's weight, w_i; with weights
w_1, w_2, ..., w_n for the n nodes at that level,

    B_i = B × w_i / (w_1 + w_2 + ... + w_n).
When a virtual machine is not active, its bandwidth is reallocated to the other VMs active at the time. When
one of the applications of a virtual machine is not active, its allocation is transferred to the other
applications running on the same VM. Similarly, if one of the threads of an application is not runnable, its
allocation is transferred to the other threads of the application.
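This reallocation can be sketched as a weight-proportional split among the active nodes at one level of the tree; the VM names and weights are illustrative:

```python
# Each node's share of the bandwidth B is proportional to its weight
# among its *active* siblings, so an inactive VM's share is automatically
# redistributed to the VMs that remain active.
def shares(bandwidth, weights, active):
    """Split bandwidth among active nodes in proportion to their weight."""
    total = sum(w for n, w in weights.items() if n in active)
    return {n: bandwidth * w / total for n, w in weights.items() if n in active}

vms = {"vm1": 1, "vm2": 1, "vm3": 2}
print(shares(100, vms, {"vm1", "vm2", "vm3"}))  # vm3 gets 50.0
print(shares(100, vms, {"vm1", "vm3"}))         # vm2's share is redistributed
```

The same function applies recursively at each level: VMs under the processor, applications under a VM, and threads under an application.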
IMPORTANT QUESTIONS
1. Explain Policies and Mechanisms for Resource Management?
8. What is the role of power managers in cloud resource scheduling and management? Explain
briefly.