
1 Introduction

A time-series is a series of data points indexed (or listed or graphed) in time order. Most commonly, a time series [1] is a sequence taken at successive equally spaced points in time.

For example, “Customer A” logs a time-series which can be represented as cpu.utilization [“host” = ‘host1’, “region” = ‘prod’] 1562532547 80, where:

  • ‘cpu.utilization’ is the metric name

  • “host” = ‘host1’, “region” = ‘prod’ are dimension key-value pairs

  • 1562532547 is the epoch source timestamp

  • 80 is the value of the metric at the given timestamp.

Cardinality broadly refers to the number of unique data values (or unique time-series) contained in a data set. High cardinality implies that the set contains a large percentage of totally unique values; low cardinality implies that it contains many repeated elements.

Time-series data tends to be paired with metadata (also referred to as “tags”) that describes the data. Often this primary time-series data or the metadata is indexed [6] for faster query performance through data filters. The cardinality [2] of a time-series dataset is typically defined by the cross-product of the cardinality [2] of each individual indexed column.

In systems that contain multiple indexed columns, each with a large number of unique values, the cardinality of this cross-product can become very large, and such data sets are said to have “high cardinality.”

The rest of this paper is organized as follows. Section 2 outlines the architecture and high-level design. Section 3 proposes a novel solution that avoids cardinality explosion using cardinality-based rate limiting, and also describes the sequencing steps. Section 4 provides details of tenants that are blacklisted due to cardinality explosion. Section 5 describes future work on integrating the proposed solution into Kubernetes for ease of deployment and self-healing. Finally, Section 6 concludes the paper by describing the importance of the proposed architecture to the availability of eBay's massively parallel and scalable metric platform.

2 Architecture and Design

Fig. 1. System architecture

Figure 1 illustrates the high-level design and architecture of the system. A cardinality server (annotated as CS) runs within each instance of the time series database. The server intercepts all metric signals that a replica shard receives and computes cardinality using HyperLogLog (HLL) at the namespace (customer or tenant) level and for each metric identified by its name. This information is periodically polled by a central assessment service, which applies pre-configured threshold-based rules on cardinality limits. Based on these policies, either a namespace or one of its associated metrics is marked as blacklisted. Once a metric or tenant is blacklisted, additional metric values are ignored at ingestion. A snapshot of blacklisted information is shown in Fig. 1. Ingress daemons are stateless applications which listen on a distributed streaming platform powered by Apache Kafka for all incoming metric signals. These applications use the blacklist provided by the assessment service to block incoming namespaces or their associated metrics on the hot path.

Fig. 2. Sample structure

Time series data tends to be paired with metadata [9] (sometimes called “tags”) that describes incoming data points. The primary time series data or the metadata [6] is indexed for faster query performance, so that one can quickly find the values that match all of the specified tags. The cardinality of a time series data set is typically defined by the cross-product of the cardinality of each individual indexed column. For example, for a set of candies of 10 colors in 5 categories (plain, peanut, almond, dark, and vanilla), the cardinality of the set is \(10\times 5 = 50\) unique color-category combinations. The underlying storage system must have the correct indices to efficiently retrieve, say, all blue almond candies (which are objectively the best). In a time series data set with multiple indexed columns, each with a large number of unique values, the cardinality of that cross-product can get very large; such data sets are termed “high-cardinality”.
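As a minimal worked example of this cross-product (the column names and counts below are purely illustrative), the worst-case cardinality is simply the product of the number of unique values in each indexed column:

```python
# Worst-case cardinality = product of unique values per indexed column.
unique_values_per_column = {
    "color": 10,       # the 10 candy colors above
    "category": 5,     # plain, peanut, almond, dark, vanilla
}

cardinality = 1
for column, unique_values in unique_values_per_column.items():
    cardinality *= unique_values
print(cardinality)  # 50

# The same arithmetic applies to time-series tags: a metric tagged with
# 1,000 distinct hosts and 3 regions can yield up to 3,000 unique series.
```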

Cardinality typically affects time-series systems in the following ways:

  • In a high-speed ingestion/indexing time series [1] system, the total memory footprint [12] is directly proportional to the number of active time-series it handles.

  • Data series with high cardinality can adversely affect read service level agreements (SLAs), as there are far more unique data points that need to be retrieved or rendered on a graph. This makes for a bad user experience; rendering that many data points on a graph panel is often not a useful visual representation of the data.

Fig. 3. HLL blocks

There are multiple mechanisms available today, provided by various open-source libraries, to compute the overall cardinality of time series data. Most of these libraries use HyperLogLog [4] (HLL, or its variant HLL++) to compute the cardinality of a data set. The basis of the HyperLogLog [4] algorithm is the observation that the cardinality of a multiset of uniformly distributed random numbers can be estimated by computing the maximum number of leading zeros in the binary representation of each number in the set. If the maximum number of leading zeros observed is n, an estimate for the number of distinct elements in the set is \(2^n\). In the HyperLogLog [4] algorithm, a hash function is applied to each element in the original multiset to obtain a multiset of uniformly distributed random numbers with the same cardinality as the original multiset. The cardinality of this randomly distributed set can then be estimated using the observation above. A simple estimate obtained in this way has the disadvantage of a large variance. In the HyperLogLog [4] algorithm, the variance is minimized by splitting the multiset into numerous subsets, calculating the maximum number of leading zeros in the numbers in each of these subsets, and using a harmonic mean to combine these estimates for each subset into an estimate of the cardinality of the whole set.
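The following is a minimal, self-contained sketch of this idea (a toy estimator only, not the tuned HLL++ implementations used by production libraries, and without the small- and large-range corrections of the full algorithm): each element is hashed, the first p bits of the hash choose a register, each register tracks the maximum number of leading zeros seen, and the registers are combined with a harmonic mean.

```python
import hashlib

class ToyHyperLogLog:
    """Toy HLL estimator illustrating the algorithm described above.
    Production systems should use a library implementation (HLL/HLL++)."""

    def __init__(self, p: int = 14):
        self.p = p                 # 2^p registers
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item: str) -> None:
        # Hash the element to a uniformly distributed 64-bit integer.
        h = int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")
        idx = h >> (64 - self.p)               # first p bits choose a register
        rest = h & ((1 << (64 - self.p)) - 1)  # remaining 64-p bits
        # Rank = position of the leftmost 1-bit (leading zeros + 1).
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self) -> float:
        # Harmonic mean of 2^-register values, scaled by a bias-correction constant.
        alpha = 0.7213 / (1 + 1.079 / self.m)
        z = sum(2.0 ** -r for r in self.registers)
        return alpha * self.m * self.m / z


hll = ToyHyperLogLog()
for i in range(100_000):
    hll.add(f"cpu.utilization,host=host{i}")
print(int(hll.count()))  # roughly 100,000
```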

3 Solution

The proposed solution attempts to reduce cardinality explosion, or slow poisoning (where the cardinality of a time series increases slowly over a period of time), caused by non-compliant tenants by performing cardinality-based rate limiting [12]. Rate limiting is performed at the ingestion source (which receives time-series data into the system), based on centralized cardinality computation for a tenant over its logged [8] time series signals, using custom threshold-based cardinality policies across all of the underlying time-series infrastructure.

Fig. 4. Policy definition

The proposed solution has the following features.

  • Quota management is applied to cardinality at the tenant level and additionally at the metric level within a tenant. Such quota management allows multi-tenant support for time series data on shared infrastructure.

  • Metrics with high cardinality are identified with ease and blacklisted for a given customer. This avoids blanket blacklisting of the entire tenant. For instance, consider a production code roll-out by a customer that introduced a new metric. Because the metric was logged incorrectly, it led to a cardinality explosion against that metric name. In such cases, if the cardinality for that metric breaches a certain threshold, only that metric is blacklisted instead of the entire customer profile.

  • Custom rules for cardinality quotas are possible, for cases where a specific customer requires high cardinality support for multiple metrics or for a specific metric. Both of these proposals are supported and implemented.

  • Controlling the cardinality allows the system to better predict read performance. For example, the maximum time of read queries is clearly bounded, since all queries return data only for time-series within their cardinality limits (the rest having been rate limited during ingestion).

  • Users are notified in real time about cardinality threshold breaches, against either an individual metric or the tenant, during metric ingestion.

  • Today, many metric stores have scalability challenges when it comes to supporting data sets with huge cardinality [10]. This problem is generally alleviated by horizontally scaling the shards of a cluster (the number of shards being the total cardinality divided by the number of unique time-series supported per shard). This is clearly not enough, as more infrastructure is required as a tenant's cardinality continues to grow. There needs to be a way to compute a cardinality quota for tenants based on their logging across all clusters or underlying infrastructure where their time-series are stored. This is currently not available in any system out in the industry. Also, cardinality computation should allow us to blacklist certain tenants when they exceed their time-series quota, to prevent underlying storage stability issues. Blacklisting requires feeding the cardinality threshold breach information back to the ingestion source, which is lacking in time series systems today. The system should blacklist only the offending metrics in a tenant's space.

Fig. 5. Blacklisted namespaces with cardinality

3.1 Sequencing Steps

The following steps are performed for every incoming metric signal into the ingress daemons, as described in Fig. 1:

  • A hash of the incoming metric signal is computed by the ingress daemon and written locally to a shard (replica) on the metric store.

  • Incoming time series data signals are then intercepted and passed on to the lightweight cardinality module on every time series database process in the back end.

  • The cardinality module (CS) computes cardinality at two levels (tenant and metric) by creating separate HLLs per customer (hereafter referred to as a namespace) and per namespace + metric name (hereafter referred to as the name level). The computation is done at (namespace, HLL) and (namespace, name, HLL).

  • This information is exposed on each cardinality server (CS) by a REST service over HTTP(S) at ‘$hostname:8091/admin/cardinality’, as shown in Fig. 2.

The cardinality information includes the start time from which cardinality has been computed. The REST service can optionally take custom query parameters. ‘namespaces[]->count’ depicts the cardinality on the time series database (TSDB) instance for a customer, and ‘namespaces[]->names[]->count’ depicts the cardinality of the top-K metric names within the namespace.
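For illustration, a response from this endpoint could look roughly like the following, expressed here as a Python literal; the exact layout is the one shown in Fig. 2, so the shape below is only an assumption based on the field paths mentioned above.

```python
# Hypothetical response shape for $hostname:8091/admin/cardinality,
# inferred from the field paths described above (Fig. 2 shows the real layout).
sample_cardinality_response = {
    "start": 1562532547,          # time from which cardinality is computed
    "namespaces": [
        {
            "namespace": "customer-a",
            "count": 1_200_000,   # tenant cardinality on this TSDB instance
            "names": [            # top-K metric names within the namespace
                {"name": "cpu.utilization", "count": 450_000},
                {"name": "mem.usage", "count": 300_000},
            ],
        }
    ],
}
```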

The HLLs per (namespace) and (namespace, name) are persisted in local TSDB instances at a specified location ‘/mnt/promTsdb/storage/cardinality’ every hour, in HLL blocks as shown in Fig. 3.

  • The HLL blocks are mergeable over time and are used to determine the cardinality history for a (namespace) and (namespace, name) combination.

  • A central “assessment service” (AS), described in Fig. 1, discovers all the Prometheus TSDB shards [5]. The discovery mechanism can be either static or dynamic. The assessment service is topology and infrastructure aware; this implies that the service knows the location of the TSDB instances (or shards) that house the replicas for a given shard.

  • The assessment service invokes the REST service at “$hostname:8091/admin/cardinality” on every TSDB instance. The maximum cardinality is taken across the replicas of a given shard, and the assessment service then sums these per-shard values grouped by tenant and metric name (a sketch of this aggregation follows the list below). Cardinality is computed both at the tenant level and for the top-K metric names within a tenant.

  • The central service exposes the blacklist information shown in Fig. 5 at a REST endpoint “blacklist”. This gives the source ingestion layer the capability to drop either all incoming metrics for a tenant or specific metrics within a tenant's quota. This information can additionally be persisted to ensure stateful cardinality information.
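The following is a minimal sketch of the aggregation performed by the assessment service, under the assumption that per-replica, per-namespace counts have already been fetched from the cardinality endpoints above (all names and shapes here are illustrative):

```python
from collections import defaultdict

# Illustrative input: {shard_id: [per-replica {namespace: cardinality}, ...]}
replica_reports = {
    "shard-0": [{"customer-a": 1_200_000}, {"customer-a": 1_150_000}],
    "shard-1": [{"customer-a": 800_000}, {"customer-a": 820_000}],
}

def aggregate(reports):
    """Take the maximum count across replicas of a shard, then sum over shards."""
    totals = defaultdict(int)
    for replicas in reports.values():
        shard_max = defaultdict(int)
        for report in replicas:
            for namespace, count in report.items():
                shard_max[namespace] = max(shard_max[namespace], count)
        for namespace, count in shard_max.items():
            totals[namespace] += count
    return dict(totals)

def blacklisted(totals, thresholds, default_threshold=1_000_000):
    """Namespaces whose aggregated cardinality breaches their quota."""
    return [ns for ns, c in totals.items() if c > thresholds.get(ns, default_threshold)]

totals = aggregate(replica_reports)                     # {"customer-a": 2_020_000}
print(blacklisted(totals, {"customer-a": 1_500_000}))   # ["customer-a"]
```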

A tenant or a metric is blacklisted based on cardinality quota rules. As shown in Fig. 4, ‘profiles->profile[]->namespace->threshold’ depicts the customer or tenant cardinality quota, and ‘profiles->profile[]->namespace->metrics[]->threshold’ is the custom overridden quota for a metric within a tenant's space.
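For illustration, a quota policy following this structure might look like the snippet below (the field names mirror the paths above; the namespace, metric name, and thresholds are invented for the example, and Fig. 4 shows the actual schema):

```python
# Hypothetical cardinality quota policy; Fig. 4 shows the actual schema.
cardinality_policy = {
    "profiles": {
        "profile": [
            {
                "namespace": "customer-a",
                "threshold": 1_000_000,        # tenant-level cardinality quota
                "metrics": [
                    # custom overridden quota for one metric within the tenant
                    {"name": "cpu.utilization", "threshold": 250_000},
                ],
            }
        ]
    }
}
```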

Fig. 6. Blacklisted tenants

The ingress daemons shown in Fig. 1 feed off the “blacklist” endpoint exposed by the assessment service and blacklist any future time series for a (namespace) and (namespace, name) based on the cardinality list and rules.
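A minimal sketch of this ingress-side check, assuming the blacklist fetched from the assessment service has been materialized as a set of blocked namespaces and (namespace, metric name) pairs (all names here are illustrative):

```python
def should_drop(signal, blocked_namespaces, blocked_metrics):
    """Decide whether an incoming metric signal should be dropped at ingestion.

    signal: dict with at least 'namespace' and 'name' keys.
    blocked_namespaces: set of blacklisted tenant namespaces.
    blocked_metrics: set of blacklisted (namespace, metric name) pairs.
    """
    if signal["namespace"] in blocked_namespaces:
        return True
    return (signal["namespace"], signal["name"]) in blocked_metrics

# Example: only one metric of the tenant is blacklisted, not the whole tenant.
blocked_ns = set()
blocked_metrics = {("customer-a", "cpu.utilization")}
signal = {"namespace": "customer-a", "name": "cpu.utilization",
          "tags": {"host": "host1", "region": "prod"},
          "timestamp": 1562532547, "value": 80}
print(should_drop(signal, blocked_ns, blocked_metrics))  # True -> drop the signal
```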

The running cardinality context or quota is reset at the end of the day (or a custom time interval), after which all signals are allowed again until they breach the cardinality policy rules (at which point they are blacklisted again).

4 Evaluation

The graph in Fig. 6 depicts reporting on tenants that are either blacklisted themselves or have one of their associated metrics blacklisted. Threshold alerts are configured on tenants, identified by namespace, when they breach 80% of the allowed cardinality limits.

The graph in Fig. 7 depicts the total cardinality seen per shard per day. This is an important measure for determining the rate of cardinality growth on a daily basis. Based on internal benchmarks, each shard within the time series database can support 20 million unique series per day. System administrators are notified by alerts when a shard nears this limit. This allows for resizing and capacity planning of the storage cluster to support growth in time series data.
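As a simple illustration of this kind of capacity planning (the 20 million figure is the benchmark quoted above; the 80% alert fraction is borrowed from the tenant alerting rule and is only an assumed choice here):

```python
import math

SHARD_DAILY_CAPACITY = 20_000_000   # unique series per shard per day (benchmark)
ALERT_FRACTION = 0.8                # assumed: alert when a shard nears the limit

def shards_needed(expected_unique_series_per_day: int) -> int:
    """Number of shards required to stay within the per-shard benchmark."""
    return math.ceil(expected_unique_series_per_day / SHARD_DAILY_CAPACITY)

def shard_near_limit(observed_unique_series: int) -> bool:
    """True when a shard crosses the alerting fraction of its daily capacity."""
    return observed_unique_series >= ALERT_FRACTION * SHARD_DAILY_CAPACITY

print(shards_needed(150_000_000))    # 8 shards for 150M unique series per day
print(shard_near_limit(17_500_000))  # True: above 80% of 20M
```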

Fig. 7. Cardinality per instance per day

5 Future Work

The entire platform currently runs on virtual machines and will be migrated to Kubernetes [13], which allows the deployment of the centralized assessment service to be containerized and orchestrated. Additionally, rules can be configured as custom resource definitions, which will be propagated to the entire infrastructure on any change. Any subscriber, such as the assessment service, will be notified of these changes and will have up-to-date blacklisting information. Also, the current system does not account for historic cardinality information. In the future, the system can use ‘Bollinger Bands’ [14] to compute a moving average of the cardinality information. The blacklist can be further enhanced by computing thresholds over standard deviations on the upper and lower bands.
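A minimal sketch of how such bands could be computed over a history of daily cardinality counts (the window size, band width, and numbers are illustrative choices, not values used by the system):

```python
import statistics

def bollinger_upper_band(history, window=20, k=2.0):
    """Moving average plus k standard deviations over the most recent window."""
    recent = history[-window:]
    mean = statistics.fmean(recent)
    std = statistics.pstdev(recent)
    return mean + k * std

# Daily cardinality counts for a namespace (illustrative numbers).
past = [1.00e6, 1.10e6, 1.05e6, 1.20e6, 1.15e6]
today = 2.5e6
# A breach of the upper band could act as a dynamic blacklist threshold
# instead of a fixed quota.
print(today > bollinger_upper_band(past, window=5))  # True: flag for blacklisting
```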

6 Conclusion

The proposed system outlined in this paper, using custom cardinality rules, has helped to scale the underlying monitoring platform and support ingestion of approximately five million metrics per second without any outages. Extensive benchmarking has been performed to measure the overhead of including the cardinality server in the time series database and the time to detect a rogue namespace or its associated metrics. On a 50-node TSDB deployment, the system can handle approximately 8–10 million metrics per second with a footprint of 10% of CPU on the database and 400 MB of memory. The solution was extremely efficient at finding slow-poisoning use cases.

The availability of the metric [11] platform depends on the proposed system. In many cases, it was observed that corrupt tenants can cause substantial damage to the underlying infrastructure and metric platform. Extensive monitoring and alerting built around the proposed system provides visibility into cardinality trends for customer data volumes. Cardinality trends have been used as an input in predicting the future infrastructure scaling needs of the metric platform.