Toward A Cloud Operating System
Fabio Pianese, Peter Bosch, Alessandro Duminuco, Nico Janssens, Thanos Stathopoulos, Moritz Steiner
Alcatel-Lucent Bell Labs
Service Infrastructure Research Dept., †Computer Systems and Security Dept.
{firstname.lastname}@alcatel-lucent.com
Abstract—Cloud computing is characterized today by a hotchpotch of elements and solutions, namely operating systems running on a single virtualized computing environment, middleware layers that attempt to combine physical and virtualized resources from multiple operating systems, and specialized application engines that leverage a key asset of the cloud service provider (e.g. Google's BigTable). Yet, there does not exist a virtual distributed operating system that ties together these cloud resources into a unified processing environment that is easy to program, flexible, scalable, self-managed, and dependable.
In this position paper, we advocate the importance of a virtual distributed operating system, a Cloud OS, as a catalyst in unlocking the real potential of the Cloud—a computing platform with seemingly infinite CPU, memory, storage and network resources. Following established Operating Systems and Distributed Systems principles laid out by UNIX and subsequent research efforts, the Cloud OS aims to provide simple programming abstractions to available cloud resources, strong isolation techniques between Cloud processes, and strong integration with network resources. At the same time, our Cloud OS design is tailored to the challenging environment of the Cloud by emphasizing elasticity, autonomous decentralized management, and fault tolerance.

I. INTRODUCTION

The computing industry is radically changing the scale of its operations. While a few years ago typical deployed systems consisted of individual racks filled with a few tens of computers, today's massive computing infrastructures are composed of multiple server farms, each built inside carefully engineered data centers that may host several tens of thousands of CPU cores in extremely dense and space-efficient layouts [1]. There are several reasons for this development:
• Significant economies of scale in manufacturing and purchasing huge amounts of off-the-shelf hardware parts.
• Remarkable savings in power and cooling costs from the massive pooling of computers in dedicated facilities.
• Hardware advances that have made the use of system virtualization techniques viable and attractive.
• Commercial interest in a growing set of applications and services to be offloaded "into the Cloud".
The commoditization of computing is thus transforming processing, storage, and bandwidth into utilities such as electrical power, water, or telephone access. This process is already well under way, as today businesses of all sizes tend to outsource their computing infrastructures, often turning to external providers to fulfill their entire operational IT needs. The migration of services and applications into the network is also modifying how computing is perceived in the mainstream, turning what was once perceived as the product of concrete equipment and processes into an abstract entity, devoid of any physical connotation: this is what the expression "Cloud computing" currently describes.
Previous research has successfully investigated the viability of several approaches to managing large-scale pools of hardware, users, processes, and applications. The main concerns of these efforts were twofold: on one hand, exploring technical issues such as the scalability limits of management techniques; on the other hand, understanding the real-world and "systemic" concerns such as ease of deployment and expressiveness of the user and programming interface.
Our main motivation lies in the fact that state-of-the-art management systems available today do not provide access to the Cloud in a uniform and coherent way. They either attempt to expose all the low-level details of the underlying pieces of hardware [2] or reduce the Cloud to a mere set of API calls—to instantiate and remotely control resources [3][4][5][6], to provide facilities such as data storage, CDN/streaming, and event queues [7][8][9], or to make available distributed computing library packages [10][11][12]. Yet, a major gap still has to be bridged in order to bond the Cloud resources into one unified processing environment that is easy to program, flexible, scalable, self-managing, and dependable.
In this position paper, we argue for a holistic approach to Cloud computing that transcends the limits of individual machines. We aim to provide a uniform abstraction—the Cloud Operating System—that adheres to well-established operating systems conventions, namely: (a) providing a simple and yet expressive set of Cloud metrics that can be understood by the applications and exploited according to individual policies and requirements, and (b) exposing a coherent and unified programming interface that can leverage the available network, CPU, and storage as the pooled resources of a large-scale distributed Cloud computer.
In Section II we elaborate our vision of a Cloud operating system, discuss our working assumptions, and state the requirements we aim to meet. In Section III we present a set of elements and features that we see as necessary in a Cloud OS: distributed resource measurement and management techniques, resource abstraction models, and interfaces, both to the underlying hardware and to the users/programmers. We then briefly review the related work in Section IV, and conclude in Section V with our plans for the future.
II. THE CLOUD: A UNITARY COMPUTING SYSTEM

The way we are accustomed to interact with computers has shaped over the years our expectations about what computers can and cannot achieve. The growth in processor power, the development of new human-machine interfaces, and the rise of the Internet have progressively turned an instrument initially meant to perform batch-mode calculations into the personal gateway for media processing and social networking we have today, which is an integral aspect of our lifestyle. The emergence of Clouds is about to further this evolution, with effects that we are not yet able to foresee: as processing and storage move away from end-user equipment, the way people interact with smaller and increasingly pervasive pieces of connected hardware will probably change in exciting and unexpected ways.

To facilitate this evolution, we argue it is important to recognize that the established metaphor of the computer as a self-contained entity is now outdated and needs to be abandoned. Computer networks have reached such a high penetration that in most cases all of the programs and data that are ever accessed on a user machine have in fact been produced somewhere else and downloaded from somewhere else. While much more powerful than in the past, the CPU power and storage capacity of the hardware a user has at her disposal pale in comparison with the power and storage of the hardware she is able to access over the Internet.

We feel that the time is ripe for a new way to understand and approach the huge amount of distributed, interconnected resources that are already available on large-scale networks: the first Cloud computing infrastructures that are becoming commercially available provide us with a concrete set of working assumptions on which to base the design of future computer systems. Research on the distributed management of computer hardware and applications, together with the emergence of Internet-wide distributed systems, has provided a wealth of experience and technical building blocks toward the goal of building and maintaining large-scale computer systems. However, users and developers still lack a definite perception of the potential of the Clouds, whose size and aggregate power are so large and hard to grasp: we therefore need to provide a new set of metaphors that unify and expose Cloud resources in a simple yet powerful way.

A. Assumptions on Cloud infrastructure

A Cloud is a logical entity composed of managed computing resources deployed in private facilities and interconnected over a public network, such as the Internet. Cloud machines (also called nodes) consist of inexpensive, off-the-shelf consumer-grade hardware. Clouds are composed of a large number of clusters (i.e. sets of nodes contained in the same facility) whose size may range from a few machines to entire datacenters. Clusters may use sealed enclosures or be placed in secluded locations that might not be accessible on a regular basis, a factor that hinders access and maintenance activities. Clusters are sparsely hosted in a number of locations across the world, as their deployment is driven by such practical concerns as:
• Availability of adequate facilities in strategic locations
• High-bandwidth, multi-homed Internet connectivity
• Presence of a reliable supply of cheap electrical power
• Suitable geological and/or meteorological properties (cold weather, nearby lakes or glaciers for cheap cooling)
• Presence of special legal regimes (data protection, etc.)
The clusters' computing and networking hardware is said to be inside the perimeter of the Cloud. The network that connects the clusters and provides the expected networking services (IP routing, DNS naming, etc.) lies outside the Cloud perimeter.

The reliance on commodity hardware and the reduced servicing capability compel us to treat Cloud hardware as unreliable and prone to malfunctions and failures. The network also needs to be considered unreliable, as a single failure of a piece of network equipment can impact a potentially large number of computers at the same time. A typical mode of network failure introduces partitions in the global connectivity, which disrupt the end-to-end behavior of the affected transport-layer connections. Network issues may arise both inside the Cloud perimeter, where counter-measures may be available to address them quickly in a way that is transparent to most applications, and outside, where it is not possible to react as promptly.

We do not intend to formulate any specific assumption about the applications that will run on the Cloud computer. In other words, we expect to satisfy the whole range of current application requirements, e.g. CPU-intensive number crunching, storage-intensive distributed backup, or network-intensive bulk data distribution (and combinations thereof). The Cloud operating system aims to be as general-purpose as possible, providing a simple set of interfaces for the management of Cloud resources: all the policy decisions pertaining to the use of the available resources are left to the individual applications.
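To make these infrastructure assumptions concrete, the following minimal sketch (in Python, with purely hypothetical names; it is not part of any existing Cloud OS implementation) models a Cloud as clusters of unreliable nodes inside the perimeter, reachable over an external network that may itself partition.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

@dataclass
class Node:
    """A single off-the-shelf machine inside the Cloud perimeter."""
    node_id: str
    alive: bool = True                      # commodity hardware: failures are the norm

@dataclass
class Cluster:
    """A set of nodes hosted in the same facility (possibly sealed or secluded)."""
    cluster_id: str
    location: str
    nodes: List[Node] = field(default_factory=list)

    def available_nodes(self) -> List[Node]:
        return [n for n in self.nodes if n.alive]

@dataclass
class Cloud:
    """The logical union of all clusters; the interconnecting network (IP routing,
    DNS, ...) lies outside the perimeter and can partition at any time."""
    clusters: List[Cluster] = field(default_factory=list)
    partitions: Set[Tuple[str, str]] = field(default_factory=set)   # unreachable cluster pairs

    def reachable(self, a: str, b: str) -> bool:
        return (a, b) not in self.partitions and (b, a) not in self.partitions
```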
B. Cloud OS: a familiar OS metaphor for the Cloud

We formulate a new metaphor, the Cloud operating system, that may be adequate to support the transition from individual computers as the atomic "computational units" to large-scale, seamless, distributed computer systems. The Cloud OS aims to provide a familiar interface for developing and deploying massively scalable distributed applications on behalf of a large number of users, exploiting the seemingly infinite CPU, storage, and bandwidth provided by the Cloud infrastructure.

The features of the Cloud OS aim to be an extension of those of modern operating systems, such as UNIX and its successors: in addition to simple programming abstractions and strong isolation techniques between users and applications, we emphasize the need to provide a much stronger level of integration with network resources. In this respect, there is much to be learnt from the successor of UNIX, Plan 9 from Bell Labs [13], which extended the consistency of the "everything is a file" metaphor to a number of inconsistent aspects of the UNIX environment. More useful lessons on techniques of process distribution and remote execution can
be drawn from research [13] and large-scale network testbed administration [20]. An additional inspiration, especially concerning the implementation of the Cloud OS, comes from the last decade of advances in distributed algorithms and peer-to-peer systems.

... outside the scope of the Cloud OS interface definition.

3 Nodes that are not part of the Cloud are still capable of accessing Cloud resources using the network-based system call interface; however, without full OS-level support for Cloud abstractions, they will not provide seamless integration between the local and the Cloud environment.
... requires complete control over the network infrastructure, but it may be used in certain cases to augment the accuracy of end-to-end measurements (e.g., with short-term predictions of CPU load or networking performance [21]) in Clouds that span several datacenters.

Measurements can target either local quantities, i.e. inside a single Cloud node, or pairwise quantities, i.e. involving pairs of connected machines (e.g. link bandwidth, latency, etc.). Complete measurements of pairwise quantities cannot be performed in large-scale systems, as the number of measurement operations required grows quadratically with the size of the Cloud. Several distributed algorithms that predict latencies without global measurement campaigns have been proposed: Vivaldi [22] collects local latency samples and represents nodes as points in a coordinate system, while Meridian [23] uses an overlay network to recursively select the machines closest to a given network host. Bandwidth estimation in Cloud environments remains an open problem: despite the existence of a number of established techniques [24], most of them are too intrusive and unsuitable for simultaneous use or for repeated measurements on high-capacity links.
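As a rough illustration of the coordinate-based approach, the sketch below implements a heavily simplified two-dimensional Vivaldi-style update rule in Python (the constants and the omission of Vivaldi's adaptive error weights and height vector are our own simplifications, not the algorithm as published in [22]): each node nudges its coordinate toward or away from a probed neighbor so that the Euclidean distance between coordinates approaches the measured round-trip time.

```python
import math
import random

class VivaldiNode:
    """Simplified 2-D network coordinate, refined only from local RTT samples."""
    def __init__(self):
        self.x, self.y = 0.0, 0.0

    def distance_to(self, other: "VivaldiNode") -> float:
        """Predicted latency (ms) between this node and `other`."""
        return math.hypot(self.x - other.x, self.y - other.y)

    def update(self, other: "VivaldiNode", measured_rtt_ms: float, delta: float = 0.1) -> None:
        """Nudge our coordinate so the predicted distance moves toward the measured RTT."""
        dist = self.distance_to(other)
        error = measured_rtt_ms - dist          # > 0: we are too close, push away
        if dist == 0.0:                         # colocated coordinates: pick a random direction
            ux, uy = random.random(), random.random()
        else:
            ux, uy = (self.x - other.x) / dist, (self.y - other.y) / dist
        self.x += delta * error * ux
        self.y += delta * error * uy

# Example: two nodes repeatedly exchange probes measuring a 50 ms RTT.
a, b = VivaldiNode(), VivaldiNode()
for _ in range(200):
    a.update(b, 50.0)
    b.update(a, 50.0)
print(round(a.distance_to(b), 1))               # converges toward 50.0
```

After enough local samples, the distance between two coordinates predicts the latency of node pairs that have never probed each other directly, avoiding a quadratic measurement campaign.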
2) Resource abstraction: Modern OS metaphors, such as the "everything is a file" model used by UNIX and Plan 9, provide transparent network interfaces and completely hide their properties and specificities from the applications. However, characterizing the underlying network is a crucial requirement for a Cloud OS, since network properties such as pairwise latencies, available bandwidth, etc., determine the ability of distributed applications to efficiently exploit the available resources. One major strength of a file-based interface is that it is very flexible, and its shortcomings can be supplemented with an appropriate use of naming conventions. We are considering several such mechanisms to present abstracted resource information from measurements to the applications, e.g. via appropriate extensions of the /proc interface or via POSIX-compatible semantic cues [25].
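Purely as an illustration of how such naming conventions might look (the paths and fields below are hypothetical and are not an interface defined by the Cloud OS), measured and predicted properties could be exposed as a /proc-like namespace that applications read like ordinary files and filter according to their own policies:

```python
# Hypothetical /proc-style Cloud namespace; values would be filled in from the
# distributed measurement layer (e.g. coordinate-based latency predictions).
cloud_proc = {
    "/proc/cloud/nodes/n17/cpu/load":         "0.42",
    "/proc/cloud/nodes/n17/mem/free_mb":      "1843",
    "/proc/cloud/net/n17/n42/latency_ms":     "23.5",
    "/proc/cloud/net/n58/n42/latency_ms":     "87.0",
}

def read(path: str) -> str:
    """Applications read abstracted measurements exactly like ordinary files."""
    return cloud_proc[path]

def nodes_with_low_latency(to: str, threshold_ms: float):
    """Policy stays with the application: e.g. pick nodes close to a given peer."""
    for path, value in cloud_proc.items():
        if path.startswith("/proc/cloud/net/") and path.endswith(f"/{to}/latency_ms"):
            if float(value) <= threshold_ms:
                yield path.split("/")[4]        # the source node identifier

print(read("/proc/cloud/nodes/n17/mem/free_mb"))    # -> 1843
print(list(nodes_with_low_latency("n42", 50.0)))    # -> ['n17']
```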
In order to present information about resources to the user applications, the Cloud OS first needs to collect and aggregate them in a timely way. Clearly, solutions based on centralized databases are not viable, since they lack the fault tolerance and the scalability we require. The use of distributed systems, such as distributed hash tables (DHTs), has proved to be very effective for publishing and retrieving information in large-scale systems [26], even in the presence of considerable levels of churn [27]. However, DHTs offer a hash-table (key, value) semantics, which is not expressive enough to support more complex queries such as those used when searching for resources. Multi-dimensional DHTs [28][29] and gossip-based approaches [30] extend the base (key, value) semantics in order to allow multi-criteria and range queries.
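The sketch below illustrates the contrast in the simplest possible terms: a toy in-process stand-in for a DHT with hypothetical keys (real systems such as [26][28][29][30] are distributed and far more elaborate). Exact-match lookups map naturally onto the (key, value) interface, while a range query such as "all nodes with at least 2 GB of free memory" cannot be expressed through it directly.

```python
import hashlib

class ToyDHT:
    """In-process stand-in for a DHT: only exact-match put/get on hashed keys."""
    def __init__(self):
        self.store = {}

    def _key(self, k: str) -> str:
        return hashlib.sha1(k.encode()).hexdigest()

    def put(self, key: str, value):
        self.store[self._key(key)] = value

    def get(self, key: str):
        return self.store.get(self._key(key))

dht = ToyDHT()
# Each node periodically publishes its measured resources under a well-known key.
dht.put("node:n17:free_mem_mb", 1843)
dht.put("node:n42:free_mem_mb", 3200)

# Exact-match queries are natural...
print(dht.get("node:n42:free_mem_mb"))          # -> 3200

# ...but a range query ("all nodes with >= 2048 MB free") cannot be expressed
# through the hash-table interface: hashing destroys key ordering, so one must
# either scan every key or maintain an order-preserving / multi-dimensional
# index on top, which is what multi-dimensional DHTs and gossip-based
# approaches provide in a distributed setting.
```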
3) Distributed process and application management: The Cloud OS instantiates and manages all objects that exist across the Cloud nodes. A consolidated practice is the use of virtual machines (VMs), which provide an abstraction that flexibly decouples the "logical" computing resources from the underlying physical Cloud nodes. Virtualization provides several properties required in a Cloud environment [31], such as support for multiple OS platforms on the same node and the implicit isolation (up to a certain extent) between processes running on different VMs on the same hardware. Computational elasticity, load balancing, and other optimization requirements introduce the need for dynamic re-allocation of resources, such as the ability to relocate a running process between two nodes in the Cloud. This can be done either at the Cloud process level, i.e. migrating single processes between nodes, or at the virtual machine level, i.e. checkpointing and restoring the whole VM state on a different node. The combination of process and VM migration, as introduced by MOSIX [32], is very interesting as a Cloud functionality, since it allows bundles of related Cloud objects to be regrouped and migrated autonomously with a single logical operation.4

4 Another compelling approach is Libra [33], which aims to bridge the distance between processes and VMs: the "guest" operating system is reduced to a thin layer on top of the hypervisor that accesses the functions exposed by the "host" operating system through a file-system interface.
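To make the two migration granularities concrete, here is a minimal sketch in Python (all class and function names are hypothetical and do not correspond to MOSIX's or Libra's actual interfaces): moving a whole VM implicitly carries every Cloud process it hosts, which is what allows a bundle of related objects to be relocated with a single logical operation.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CloudProcess:
    pid: int
    state: bytes = b""            # serialized execution state (checkpoint image)

@dataclass
class VirtualMachine:
    vm_id: str
    processes: List[CloudProcess] = field(default_factory=list)

@dataclass
class CloudNode:
    node_id: str
    vms: List[VirtualMachine] = field(default_factory=list)

def migrate_process(proc: CloudProcess, src: VirtualMachine, dst: VirtualMachine) -> None:
    """Process-level migration: move a single checkpointed process between VMs/nodes."""
    src.processes.remove(proc)
    dst.processes.append(proc)

def migrate_vm(vm: VirtualMachine, src: CloudNode, dst: CloudNode) -> None:
    """VM-level migration: checkpoint and restore the whole VM state elsewhere,
    implicitly carrying every process it hosts in one logical operation."""
    src.vms.remove(vm)
    dst.vms.append(vm)

vm = VirtualMachine("vm-7", [CloudProcess(101), CloudProcess(102)])
n1, n2 = CloudNode("n17", [vm]), CloudNode("n42")
migrate_vm(vm, n1, n2)                            # both processes move in one logical step
print([p.pid for p in n2.vms[0].processes])       # -> [101, 102]
```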