CloudTrace Demo: Tracing Cloud Network Delay
Giuseppe Di Lena, Frédéric Giroire, Thierry Turletti, Chidung Lac
To cite this version:
Giuseppe Di Lena, Frédéric Giroire, Thierry Turletti, Chidung Lac. CloudTrace Demo: Tracing Cloud
Network Delay. IEEE International Conference on Network Softwarization (NetSoft), Jun 2021, Fully
Virtual, France. IEEE, 10.1109/NetSoft51509.2021.9492583. hal-03364025
HAL Id: hal-03364025
https://inria.hal.science/hal-03364025
Submitted on 4 Oct 2021
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
CloudTrace Demo: Tracing Cloud Network Delay
Giuseppe Di Lena
Frédéric Giroire
Thierry Turletti
Chidung Lac
Université Côte d’Azur, Orange Labs Université Côte d’Azur, CNRS Inria, Université Côte d’Azur
Orange Labs
Lannion, France
Sophia Antipolis, France
Sophia Antipolis, France
Lannion, France
giuseppe.dilena@orange.com
frederic.giroire@cnrs.fr
thierry.turletti@inria.fr
chidung.lac@orange.com
Abstract—Many companies and organizations are moving
their applications from on-premises data centers to the cloud.
The cloud infrastructures can potentially provide an infinite
amount of computation (e.g., Elastic Compute) and storage
(e.g., Simple Service Storage). In addition, all cloud providers
propose different offers: IaaS, PaaS, and SaaS. This demo focuses
on the IaaS services, presenting a simple tool to measure the
network delay in a virtual infrastructure built entirely in the
cloud. These measurements are useful for organizations that are
moving current applications to, or creating new applications
in, the cloud, but have requirements on the maximum, or
average, network delay that these applications can tolerate. We
present CloudTrace, a simple CLI tool that creates regional and
multiregional experiments to measure delay, using Amazon AWS.
Index Terms—Cloud, Network
I. I NTRODUCTION
Cloud computing provides convenient on-demand access to
a potentially unlimited pool of computing resources. In recent
years, cloud computing resources have become cheaper and
more powerful, thanks to the growing interest of companies
around the world. Cloud computing brings several advantages:
a company has little or no capital expenditure (CAPEX) when
launching a new product, instead of having to invest in data
centers and servers before knowing how much these resources
are going to be used. It is flexible and on-demand. This means
that the users can consume computing power and only pay for
the amount of resources they actually consumed.
On a high level, cloud providers offer three service models,
Infrastructure as a Service (IaaS), Platform as a Service (PaaS)
and Software as a Service (SaaS). We focus on IaaS, where
users can manage the virtual instances and networks created
in the cloud infrastructure, while the cloud provider is responsible for managing the physical infrastructure for providing a
virtualization layer to isolate each user’s environment.
We provide a brief overview of the main component of
Amazon Web Services (AWS) that is responsible for this
cloud model which is Amazon Elastic Cloud Computing
(EC2). We describe all the virtual devices involved in the
creation of a virtual infrastructure, and we apply the best
practice strategies to deploy our experiments in a regional and
multiregional environment. All major cloud providers offer
resources for each instance flavor (e.g., t3.medium, m5.large,
etc.), describing how many vCores and how much RAM are
assigned. Regarding networking, AWS provides the maximum
throughput that a flavor is able to generate (usually from 500
Mbps to 100 Gbps). AWS does not provide the maximum
978-1-6654-0522-5/21/$31.00 ©2021 IEEE
delay inside the infrastructure. It only mentions that all the
regions and the availability zones are connected with a high
performance optical fiber network.
The goal of this demo is to present a simple Command
Line Interface (CLI) tool that can measure the network delay
between two, or more, different networks in AWS using
different deployment strategies.
II. R ELATED WORK
Since AWS launches EC2 service [2], multiple approaches
have been proposed to measure computing or network performances in virtualized environments and cloud infrastructures.
Most of them focus on the impact of a virtualized environment
with different hypervisors [10], virtualization using Virtual
Machines and containers [9], or the network performances
variability in virtualized instances [5]. One of the most convincing measurement tools for the cloud environments is
Cloudbench (CBTOOL) from IBM [7]. This tool automates
experiments for a large range of workloads (e.g., Hadoop [6])
in cloud environments. In [8], the authors performed largescale traceroute measurements over the global AWS network
infrastructure. The traceroutes were performed between 15
regions, and the goal was to study the complexity of the
AWS infrastructure. The work in [4] covers more generally
the major cloud providers and the main Tier-1 ISPs. It shows
how Internet traffic is more and more generated and shared
with private interconnection between the cloud providers, thus
bypassing the Tier-1 ISPs.
III. AWS BACKGROUND
The AWS global infrastructure is composed of 2 different
entities: Region and Availability Zone (AZ).
The Availability Zone is a set of one or more data centers
connected by redundant high-speed networks in order to
guarantee high availability to the users. The region represents
a set of multiple AZs clustered in a specific area and located
around the world, within 100 km of each other. AZs in a
same region are interconnected with high-bandwidth, lowlatency, high-throughput networks, over fully redundant and
dedicated metro fiber. To guarantee privacy, all traffic between
AZs is encrypted. If an application is partitioned across AZs,
companies are better isolated and protected from issues such as
power outages or natural disasters [1]. AWS provides hundreds
of different services. We describe briefly the main services
and tools provided for IaaS. The first layer of an IaaS in
AWS is the Virtual Private Cloud (VPC) that creates a
Fig. 2. Architecture in multiregional setup with VPC peering
Fig. 1. Architecture in regional setup
virtual network defined by the user. All virtual instances and
virtual devices (i.e., virtual gateways or route tables) should be
created within a VPC. The VPC is composed of one or more
subnets. The user managing the subnets can create as many
subnets as needed in a single VPC till the address space in the
VPC becomes empty. To control network flows in the network
subnet, the user can create route tables associated with them.
If the instances are in the same VPC, it is possible to connect
them directly using the private IP if they are in the same
subnet. If they are in a different subnet, the route table should
be configured before the connection (Fig. 1). By default, the
VPC does not provide any way to connect the instances to
the external network. The Virtual Internet Gateway is a
highly scalable virtual device managed entirely by AWS that
allows traffic to go from the VPC to the outside world and
vice-versa. Once created, the route table should be updated
to redirect traffic to the subnets. Two other main services are
Network Access Control List (NACL) and Security Group
(SG). NACL is a stateless firewall associated with an entire
subnet, while SG is a stateful firewall associated with a virtual
interface. There are three types of address in AWS: private,
public, and Elastic IP. Elastic IP is a public address that can be
conveniently attached to, and detached from, a virtual interface
at any time. Once created and attached, the Elastic IP is not
released even if the user deletes the instance or the interface.
The address should be explicitly released by the user when
it is not needed anymore. AWS proposes the VPC peering
service [3] which makes it possible to connect two or multiple
VPCs. Instances in the peering can connect between them
via private IP, as if they are in the same network. Amazon
ensures that the traffic going via the peering never leaves
the AWS network, increasing security and performances. The
last service is Elastic Compute (EC2), that provides secure
and resizable compute capacity in the cloud, and proposes a
large number of instance types [2]. For our experiments, we
use different architectures (Figs. 1 and 2). Fig. 1 describes
a regional infrastructure composed of two subnets created in
two different AZs within a single VPC, i.e., in a single region.
As explained above, the Internet Gateway is responsible to
allow connectivity with the external Internet. An instance is
created on each subnet, and the flow table is stored in a single
route table, serving the two subnets. The instances can connect
to each other via the public IPs (an elastic IP is attached to
each interface of the instance), or via the private IP since they
are on the same VPC. In Fig. 2, the infrastructure is shared
between two different regions. The architecture is similar, and
the instances can still use the public IP to connect to each other
like in the regional environment, but they cannot use by default
the private IPs. In this case, a VPC peering connection between
the two VPCs needs to be created to allow the instances to use
the private IP to connect them, as if they were on the same
network. Our tool automatically generates the infrastructure in
both cases, with two or more AZs in case of regional and two
or more regions in case of multiregional experiments.
IV. I MPLEMENTATION
CloudTrace is open-source, and it is compatible with Linux
and MAC OSX, while it is possible to install it on Windows
via Docker. The tool is built using LiteSQL, Ansible, ParisTraceroute, and the Boto3 API for the automatic deployment
in Amazon AWS. The basic idea is to have a CLI that takes
as input the regions and the type of experiment that the
user wants to run, and creates an environment performing
multiple traceroutes between the virtual instances. First of all,
CloudTrace creates a unique ID for the experiment (a string
composed of 8 alphanumeric random characters).
Regional Deployment. The user has to specify only one
region. The tool first checks if there is at least one VPC
slot available in the region, then creates the VPC with subnet
10.0.0.0/16. Finally, it creates then a 10.0.X.0/24 subnet for
each Availability Zone (AZ) available in the region. After
setting up the subnets, it creates an Internet gateway and
attaches it to the VPC. It then creates a route table and appends
all the routes necessary to allow external connectivity to the
future instances created within the subnet. To allow Internet
connectivity, adding the route is not enough. The tool also
adds the rules in the NACL. After configuring all the subnets
and the NACL and SG rules, the tool creates an instance in
each subnet.
Multiregional Deployment. The setup of the infrastructure
is similar to the regional one. First of all, the tool connects with
each region specified by the user and checks if there is at least
one VPC slot for each region. If it is the case, it creates a VPC,
a subnet and an Internet gateway in each region in the default
AZ, and configures them like in the regional experiment. The
VPC and the virtual subnet have the same subnet 10.0.X.0/24,
because in case of VPC peering, the subnet cannot overlap.
If the user selects the option to use VPC peering, a peering
connection between all the VPCs are created, and all the route
tables are updated, in order to forward the traffic via the
private IP between remote VPCs (Fig. 2). All the information
needed is saved in the client database (i.e., public IPs, VPC ID,
regions, instances ID, etc.). The tool parses all this information
and creates for each experiment an Ansible inventory (a file
that describes login info and addresses of the instances). For
each experiment, Ansible runs a standard playbook, a YAML
file that lists all the requirements the instances have to install
and all the commands that each instance has to execute before
running the experiment (e.g., update the kernel).
V. E XPERIMENTS
We present two experiments, tracing the network performances during one month in Amazon AWS, we will present
in the demo how easily is to setup multiple experiments like
these. The tests are made on two AWS regions: Frankfurt (eucentral-1) and London (eu-west-2), and ran from 04/11/2020
till 04/12/2020, where we collected in total more than one
millions of traceroutes.
Regional Experiment
Source
Destination
eu-central-1a
eu-central-1b
eu-central-1b
eu-central-1a
eu-central-1a
eu-central-1c
eu-central-1c
eu-central-1a
eu-central-1b
eu-central-1c
eu-central-1c
eu-central-1b
Confidence Interval Delay [ms]
Public IPs
Private IPs
0.835 ± 0.697
1.027 ± 1.04
0.864 ± 0.839
1.147 ± 1.202
1.066 ± 0.7
1.301 ± 1.496
1.094 ± 0.599
1.188 ± 1.369
0.915 ± 0.928
1.122 ± 1.505
0.943 ± 0.961
1.482 ± 2.149
TABLE I
F RANKFURT DELAY WITH CONFIDENCE INTERVAL (99%) WITH PUBLIC
AND PRIVATE IP S
Public vs. Private. AWS makes it possible to create a VPC
peering between two or more different regions. This is done
to connect two instances deployed in separated regions with
the private IP assigned by AWS. As we can see, the delay
using the private IP is higher, and also less stable, than for
the public IP. This behavior is also repeated in all the regional
experiments we performed. For example, Table I shows the
delay with the confidence intervals in the Frankfurt region.
We think that these differences are due to the additional layer
added inside the AWS network and the control that each packet
has to pass in order to circulate inside the AWS network.
Multiregional Experiments. We are performing the experiment using the public IP and AZ in different regions. Multiregional experiments mean long distances since the traffic has to
go from Frankfurt to London, i.e., higher delays are expected.
Fig. 3. Eu-central-1a to eu-west-2a trace (public IP)
Fig. 3 shows the delay between Frankfurt and London. The
black line indicates the daily rolling mean (it includes the
mean of the last 24 hours), while the blue line shows the
hourly rolling mean (the last 60 minutes).
VI. C ONCLUSIONS
AWS provides hundreds of services to their customers. We
considered the main EC2 components (e.g., VPCs, subnets,
Virtual Internet Gateway) that companies and organizations
use to move their infrastructures to the cloud. We mainly focused on the networking performances of the global infrastructure. Companies that are moving delay-sensitive applications
on AWS are interested in the latency of the network in a high
availability (multiregional) configuration or on a regional configuration with multiple Availability Zones. We implemented
CloudTrace, a simple tool to measure the networking performances and automatically plot statistics, together with a map
of the analyzed network. CloudTrace is publicly accessible at
https://github.com/Giuseppe1992/CloudTrace.
R EFERENCES
[1] Amazon. AWS Global Infrastructure documentation. https://aws.
amazon.com/about-aws/global-infrastructure/regions az/, 2020.
[2] Amazon.
EC2 instance types.
https://aws.amazon.com/ec2/
instance-types/, 2020.
[3] Amazon. Vpc peering aws documentation. https://docs.aws.amazon.
com/vpc/latest/peering/what-is-vpc-peering.html, 2020.
[4] Todd Arnold, Jia He, et al. Cloud Provider Connectivity in the Flat
Internet. In ACM IMC, pages 230–246, 2020.
[5] Karyna Gogunska, Chadi Barakat, and Guillaume Urvoy-Keller. Tuning
optimal traffic measurement parameters in virtual networks with machine
learning. In IEEE CloudNet, pages 1–3, 2019.
[6] Hadoop. Hadoop docs. https://hadoop.apache.org/docs/stable/, 2020.
[7] IBM. IBM Cloudbench Documentation. https://developer.ibm.com/
depmodels/cloud/projects/cloudbench-cbtool/, 2020.
[8] Quentin Jacquemart, Alessandro Baldi Vitali, and Guillaume UrvoyKeller. Measuring the Amazon Web Services (AWS) WAN Infrastructure. In CoRes 2019, Saint Laurent de la Cabrerisse, France, 2019.
[9] Z. Li, M. Kihl, Q. Lu, and J. A. Andersson. Performance overhead
comparison between hypervisor and container based virtualization. In
IEEE AINA, pages 955–962, 2017.
[10] PV Vardhan Reddy and Lakshmi Rajamani. Evaluation of different
hypervisors performance in the private cloud with SIGAR framework.
IJACSA, 5(2), 2014.