Rally is a Benchmark-as-a-Service project for OpenStack.
Rally is intended to provide the community with a benchmarking tool that is capable of performing specific, complicated and reproducible test cases on real deployment scenarios.
If you are here, you are probably familiar with OpenStack and you also know that it's a really huge ecosystem of cooperative services. When something fails, performs slowly or doesn't scale, it's really hard to answer the questions of what happened, why, and where. Another reason you could be here is that you would like to build an OpenStack CI/CD system that lets you continuously improve the SLA, performance and stability of OpenStack.
The OpenStack QA team mostly works on CI/CD that ensures that new patches don't break a specific single-node installation of OpenStack. On the other hand, it's clear that such CI/CD is only an indication and does not cover all cases (e.g. if a cloud works well on a single-node installation, it doesn't mean that it will continue to do so on a 1k-server installation under high load). Rally aims to fix this and help us answer the question "How does OpenStack work at scale?". To make this possible, we are going to automate and unify all the steps required for benchmarking OpenStack at scale: multi-node OpenStack deployment, verification, benchmarking & profiling.
Rally workflow can be visualized by the following diagram:
In terms of software architecture, Rally consists of 4 main components:
- Server Providers - provide servers (virtual servers) with ssh access, all in one L3 network.
- Deploy Engines - deploy an OpenStack cloud on the servers supplied by Server Providers.
- Verification - runs tempest (or another specific set of tests) against a deployed cloud, collects the results & presents them in human-readable form.
- Benchmark engine - lets you write parameterized benchmark scenarios & run them against the cloud.
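To make the division of responsibilities more concrete, here is a minimal Python sketch of how the four components could hand results to one another. It is not Rally's actual API: every class, function and field name below (`DummyServerProvider`, `DummyDeployEngine`, `verify`, `run_benchmark`, `fake_boot_scenario`, the `endpoint` key) is hypothetical, and the "scenario" only returns a simulated duration instead of talking to a real cloud.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Server:
    """A (virtual) server reachable over ssh; the fields are illustrative."""
    host: str
    ssh_user: str = "root"


class DummyServerProvider:
    """Stand-in for a Server Provider: hands out servers in one L3 network."""

    def __init__(self, hosts: List[str]) -> None:
        self.hosts = hosts

    def create_servers(self) -> List[Server]:
        return [Server(host=h) for h in self.hosts]


class DummyDeployEngine:
    """Stand-in for a Deploy Engine: pretends to deploy a cloud on the servers."""

    def deploy(self, servers: List[Server]) -> Dict[str, str]:
        if not servers:
            raise ValueError("at least one server is required")
        # Assumption: the first server hosts the cloud's API endpoint.
        return {"endpoint": f"http://{servers[0].host}"}


def verify(cloud: Dict[str, str],
           checks: Dict[str, Callable[[Dict[str, str]], bool]]) -> Dict[str, bool]:
    """Stand-in for Verification: run named checks and collect pass/fail."""
    return {name: check(cloud) for name, check in checks.items()}


def run_benchmark(scenario: Callable[..., float], times: int, **params) -> Dict[str, float]:
    """Stand-in for the Benchmark engine: run a parameterized scenario
    `times` times and summarize the durations it reports."""
    durations = [scenario(**params) for _ in range(times)]
    return {
        "min": min(durations),
        "max": max(durations),
        "avg": sum(durations) / len(durations),
    }


def fake_boot_scenario(cloud: Dict[str, str], flavor_ram_mb: int) -> float:
    """Simulated scenario: returns a made-up duration in seconds."""
    return 0.5 + flavor_ram_mb / 1024.0


if __name__ == "__main__":
    servers = DummyServerProvider(["10.0.0.1", "10.0.0.2"]).create_servers()
    cloud = DummyDeployEngine().deploy(servers)
    print(verify(cloud, {"has_endpoint": lambda c: "endpoint" in c}))
    print(run_benchmark(fake_boot_scenario, times=3, cloud=cloud, flavor_ram_mb=512))
```

The point of the sketch is only the data flow: providers produce servers, a deploy engine turns servers into a cloud, and verification and benchmarking both consume that cloud.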
There are 3 major high-level Rally use cases:
Typical cases where Rally aims to help are:
- Automate measuring & profiling focused on how new code changes affect OpenStack performance;
- Use the Rally profiler to detect scaling & performance issues;
- Investigate how different deployments affect OpenStack performance:
    - Find the set of suitable OpenStack deployment architectures;
    - Create deployment specifications for different loads (number of controllers, swift nodes, etc.);
- Automate the search for the hardware best suited to a particular OpenStack cloud;
- Automate the generation of production cloud specifications:
    - Determine terminal loads for basic cloud operations: VM start & stop, Block Device create/destroy & various OpenStack API methods;
    - Check the performance of basic cloud operations under different loads.
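As a rough illustration of the "terminal load" idea from the list above, the sketch below assumes it means the highest load at which an operation still completes within an acceptable duration. The function names, the load steps and the linear latency model are all made up for the example; a real run would measure durations against an actual cloud.

```python
from typing import Callable, Iterable, Optional


def find_terminal_load(
    measure: Callable[[int], float],
    loads: Iterable[int],
    max_duration: float,
) -> Optional[int]:
    """Try loads in the given order and return the last one whose measured
    duration stays within max_duration. Stops at the first load that
    exceeds the limit; returns None if even the first load is too slow."""
    terminal = None
    for load in loads:
        if measure(load) > max_duration:
            break
        terminal = load
    return terminal


def simulated_vm_boot_duration(concurrent_requests: int) -> float:
    """Made-up model: 2 s base cost plus 0.5 s per concurrent request."""
    return 2.0 + 0.5 * concurrent_requests


if __name__ == "__main__":
    steps = [1, 2, 4, 8, 16, 32]
    print(find_terminal_load(simulated_vm_boot_duration, steps, max_duration=10.0))
```

Passing the measurement in as a callable keeps the search logic independent of which cloud operation (VM start, block device create, an API call) is being measured.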
 
 
Wiki page:
https://wiki.openstack.org/wiki/Rally
Rally/HowTo:
https://wiki.openstack.org/wiki/Rally/HowTo
Launchpad page:
https://launchpad.net/rally
Code is hosted on GitHub:
https://github.com/stackforge/rally
Trello board:
https://trello.com/b/DoD8aeZy/rally

