
GitLab

GitLab is a web-based DevOps lifecycle tool developed by GitLab Inc. It provides a Git repository manager with wiki, issue-tracking, and continuous integration and deployment pipeline features, under an open-source license.

Available solutions




This template is for Zabbix version: 7.0
Also available for: 6.4 6.2 6.0 5.4

Source: https://git.zabbix.com/projects/ZBX/repos/zabbix/browse/templates/app/gitlab_http?at=release/7.0

GitLab by HTTP

Overview

This template is designed for monitoring GitLab with Zabbix and works without any external scripts. Most of the metrics are collected in one go, thanks to Zabbix bulk data collection.

The GitLab by HTTP template collects metrics with an HTTP agent from the GitLab /-/metrics endpoint. See https://docs.gitlab.com/ee/administration/monitoring/prometheus/gitlab_metrics.html.
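
Bulk collection can be pictured as one master request whose Prometheus text output is converted into structured records, which the dependent items then filter. The Python sketch below is only an illustration under stated assumptions: the parser is a simplified stand-in for the Zabbix "Prometheus to JSON" preprocessing step, the function name prometheus_to_records, the label names, and the sample values are hypothetical, and the record fields (name, value, labels) are a reduced version of what Zabbix produces.

```python
import re

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair inside the braces: key="value".
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def prometheus_to_records(text):
    """Turn Prometheus exposition text into a list of dicts (simplified)."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        records.append({"name": name, "value": value, "labels": labels})
    return records


# Hypothetical excerpt of a /-/metrics response (values are made up).
SAMPLE = '''\
# HELP http_requests_total Request count
# TYPE http_requests_total counter
http_requests_total{method="get",status="200"} 120
http_requests_total{method="get",status="500"} 3
ruby_file_descriptors{pid="puma_0"} 42
'''

records = prometheus_to_records(SAMPLE)
for record in records:
    print(record)
```

Each dependent item then only needs to pick its own records out of this list, so the GitLab endpoint is queried once per collection cycle rather than once per metric.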

Requirements

Zabbix version: 7.0 and higher.

Tested versions

This template has been tested on:

  • GitLab 13.5.3 EE

Configuration

Zabbix should be configured according to the instructions in the Templates out of the box section.

Setup

This template works with self-hosted GitLab instances. Internal service metrics are collected from the GitLab /-/metrics endpoint. Two methods are available for accessing the metrics:

  1. Explicitly allow the IP address of the monitoring instance in the GitLab whitelist configuration.
  2. Get a token from the GitLab Admin -> Monitoring -> Health check page (http://your.gitlab.address/admin/health_check) and use it in the macro {$GITLAB.HEALTH.TOKEN} as the variable part of the path, for example: ?token=your_token.

Remember to change the macro {$GITLAB.URL}. Also, see the Macros section for a list of macros used to set trigger values.

NOTE. Some metrics may not be collected depending on your GitLab instance version and configuration. See GitLab's documentation for further information about its metric collection.
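
As a rough illustration of how the two macros fit together, the Python sketch below joins a base URL, the /-/metrics path, and the token suffix. It assumes that the template simply concatenates the three parts; the helper name build_endpoint_url is hypothetical, and the values are the default and example values quoted in this document.

```python
def build_endpoint_url(gitlab_url, path, token_path=""):
    """Join the base URL, an endpoint path, and the optional token suffix."""
    return gitlab_url.rstrip("/") + path + token_path


GITLAB_URL = "http://localhost"      # default value of {$GITLAB.URL}
HEALTH_TOKEN = "?token=your_token"   # example value of {$GITLAB.HEALTH.TOKEN}

# Method 2: token-based access.
metrics_url = build_endpoint_url(GITLAB_URL, "/-/metrics", HEALTH_TOKEN)
print(metrics_url)  # http://localhost/-/metrics?token=your_token

# Method 1: the monitoring IP is whitelisted, so the token macro stays empty.
print(build_endpoint_url(GITLAB_URL, "/-/metrics"))  # http://localhost/-/metrics
```

Keeping the leading "?" inside the token macro is what allows it to be left empty when the whitelist method is used.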

Macros used

Name | Description | Default
{$GITLAB.URL} | URL of a GitLab instance. | http://localhost
{$GITLAB.HEALTH.TOKEN} | The token path for the GitLab health check. Example: ?token=your_token |
{$GITLAB.UNICORN.UTILIZATION.MAX.WARN} | The maximum percentage of Unicorn workers utilization for a trigger expression. | 90
{$GITLAB.PUMA.UTILIZATION.MAX.WARN} | The maximum percentage of Puma thread utilization for a trigger expression. | 90
{$GITLAB.HTTP.FAIL.MAX.WARN} | The maximum number of HTTP request failures for a trigger expression. | 2
{$GITLAB.REDIS.FAIL.MAX.WARN} | The maximum number of Redis client exceptions for a trigger expression. | 2
{$GITLAB.UNICORN.QUEUE.MAX.WARN} | The maximum number of Unicorn queued requests for a trigger expression. | 1
{$GITLAB.PUMA.QUEUE.MAX.WARN} | The maximum number of Puma queued requests for a trigger expression. | 1
{$GITLAB.OPEN.FDS.MAX.WARN} | The maximum percentage of used file descriptors for a trigger expression. | 90

Items

Name Description Type Key and additional info
Get instance metrics HTTP agent gitlab.get_metrics

Preprocessing

  • Check for not supported value: any error

    ⛔️Custom on fail: Discard value

  • Prometheus to JSON
Instance readiness check

The readiness probe checks whether the GitLab instance is ready to accept traffic via Rails Controllers.

HTTP agent gitlab.readiness

Preprocessing

  • Check for not supported value: any error

    ⛔️Custom on fail: Set value to: {"master_check":[{"status":"failed"}]}

  • JSON Path: $.master_check[0].status

  • Boolean to decimal

    ⛔️Custom on fail: Set value to: 0

  • Discard unchanged with heartbeat: 30m

Application server status

Checks whether the application server is running. This probe is used to determine whether Rails Controllers are deadlocked due to multi-threading.

HTTP agent gitlab.liveness

Preprocessing

  • Check for not supported value: any error

    ⛔️Custom on fail: Set value to: {"status": "failed"}

  • JSON Path: $.status

  • Boolean to decimal

    ⛔️Custom on fail: Set value to: 0

  • Discard unchanged with heartbeat: 30m

Version

Version of the GitLab instance.

Dependent item gitlab.deployments.version

Preprocessing

  • JSON Path: $[?(@.name=="deployments")].labels.version.first()

  • Discard unchanged with heartbeat: 3h

Ruby: First process start time

Minimum UNIX timestamp of the Ruby process start times.

Dependent item gitlab.ruby.process_start_time_seconds.first

Preprocessing

  • JSON Path: $[?(@.name=="ruby_process_start_time_seconds")].value.min()

  • Discard unchanged with heartbeat: 3h

Ruby: Last process start time

Maximum UNIX timestamp of the Ruby process start times.

Dependent item gitlab.ruby.process_start_time_seconds.last

Preprocessing

  • JSON Path: $[?(@.name=="ruby_process_start_time_seconds")].value.max()

  • Discard unchanged with heartbeat: 3h

User logins, total

Counter of how many users have logged in since GitLab was started or restarted.

Dependent item gitlab.user_session_logins_total

Preprocessing

  • JSON Path: $[?(@.name=="user_session_logins_total")].value.first()

    ⛔️Custom on fail: Discard value

User CAPTCHA logins failed, total

Counter of failed CAPTCHA attempts during login.

Dependent item gitlab.failed_login_captcha_total

Preprocessing

  • JSON Path: $[?(@.name=="failed_login_captcha_total")].value.first()

    ⛔️Custom on fail: Discard value

User CAPTCHA logins, total

Counter of successful CAPTCHA attempts during login.

Dependent item gitlab.successful_login_captcha_total

Preprocessing

  • JSON Path: $[?(@.name=="successful_login_captcha_total")].value.first()

    ⛔️Custom on fail: Discard value

Upload file does not exist

Number of times an upload record could not find its file.

Dependent item gitlab.upload_file_does_not_exist

Preprocessing

  • JSON Path: $[?(@.name=="upload_file_does_not_exist")].value.first()

    ⛔️Custom on fail: Discard value

Pipelines: Processing events, total

Total number of pipeline processing events.

Dependent item gitlab.pipeline.processing_events_total

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    ⛔️Custom on fail: Discard value

Pipelines: Created, total

Counter of pipelines created.

Dependent item gitlab.pipeline.created_total

Preprocessing

  • JSON Path: $[?(@.name=="pipelines_created_total")].value.sum()

    ⛔️Custom on fail: Discard value

Pipelines: Auto DevOps pipelines, total

Counter of completed Auto DevOps pipelines.

Dependent item gitlab.pipeline.auto_devops_completed.total

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    ⛔️Custom on fail: Discard value

Pipelines: Auto DevOps pipelines, failed

Counter of completed Auto DevOps pipelines with status "failed".

Dependent item gitlab.pipeline.auto_devops_completed_total.failed

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    ⛔️Custom on fail: Discard value

Pipelines: CI/CD creation duration

The sum of the time in seconds it takes to create a CI/CD pipeline.

Dependent item gitlab.pipeline.pipeline_creation

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    ⛔️Custom on fail: Discard value

Pipelines: CI/CD creation count

The count of measurements of the time it takes to create a CI/CD pipeline.

Dependent item gitlab.pipeline.pipeline_creation.count

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    ⛔️Custom on fail: Discard value

Database: Connection pool, busy

Connections to the main database in use where the owner is still alive.

Dependent item gitlab.database.connection_pool_busy

Preprocessing

  • JSON Path: The text is too long. Please see the template.

Database: Connection pool, current

Current connections to the main database in the pool.

Dependent item gitlab.database.connection_pool_connections

Preprocessing

  • JSON Path: The text is too long. Please see the template.

Database: Connection pool, dead

Connections to the main database in use where the owner is not alive.

Dependent item gitlab.database.connection_pool_dead

Preprocessing

  • JSON Path: The text is too long. Please see the template.

Database: Connection pool, idle

Connections to the main database not in use.

Dependent item gitlab.database.connection_pool_idle

Preprocessing

  • JSON Path: The text is too long. Please see the template.

Database: Connection pool, size

Total connection pool capacity for the main database.

Dependent item gitlab.database.connection_pool_size

Preprocessing

  • JSON Path: The text is too long. Please see the template.

Database: Connection pool, waiting

Threads currently waiting on this queue.

Dependent item gitlab.database.connection_pool_waiting

Preprocessing

  • JSON Path: The text is too long. Please see the template.

Redis: Client requests rate, queues

Number of Redis client requests per second. (Instance: queues)

Dependent item gitlab.redis.client_requests.queues.rate

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    ⛔️Custom on fail: Discard value

  • Change per second
Redis: Client requests rate, cache

Number of Redis client requests per second. (Instance: cache)

Dependent item gitlab.redis.client_requests.cache.rate

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    ⛔️Custom on fail: Discard value

  • Change per second
Redis: Client requests rate, shared_state

Number of Redis client requests per second. (Instance: shared_state)

Dependent item gitlab.redis.client_requests.shared_state.rate

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    ⛔️Custom on fail: Discard value

  • Change per second
Redis: Client exceptions rate, queues

Number of Redis client exceptions per second. (Instance: queues)

Dependent item gitlab.redis.client_exceptions.queues.rate

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    ⛔️Custom on fail: Discard value

  • Change per second
Redis: Client exceptions rate, cache

Number of Redis client exceptions per second. (Instance: cache)

Dependent item gitlab.redis.client_exceptions.cache.rate

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    ⛔️Custom on fail: Discard value

  • Change per second
Redis: Client exceptions rate, shared_state

Number of Redis client exceptions per second. (Instance: shared_state)

Dependent item gitlab.redis.client_exceptions.shared_state.rate

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    ⛔️Custom on fail: Discard value

  • Change per second
Cache: Misses rate, total

The cache read miss count.

Dependent item gitlab.cache.misses_total.rate

Preprocessing

  • JSON Path: $[?(@.name=="gitlab_cache_misses_total")].value.sum()

  • Change per second
Cache: Operations rate, total

The count of cache operations.

Dependent item gitlab.cache.operations_total.rate

Preprocessing

  • JSON Path: $[?(@.name=="gitlab_cache_operations_total")].value.sum()

  • Change per second
Ruby: CPU usage per second

Average CPU time utilization in seconds.

Dependent item gitlab.ruby.process_cpu_seconds.rate

Preprocessing

  • JSON Path: $[?(@.name=="ruby_process_cpu_seconds_total")].value.avg()

    ⛔️Custom on fail: Discard value

  • Change per second
Ruby: Running threads

Number of running Ruby threads.

Dependent item gitlab.ruby.threads_running

Preprocessing

  • JSON Path: The text is too long. Please see the template.

Ruby: File descriptors opened, avg

Average number of opened file descriptors.

Dependent item gitlab.ruby.file_descriptors.avg

Preprocessing

  • JSON Path: $[?(@.name=="ruby_file_descriptors")].value.avg()

Ruby: File descriptors opened, max

Maximum number of opened file descriptors.

Dependent item gitlab.ruby.file_descriptors.max

Preprocessing

  • JSON Path: $[?(@.name=="ruby_file_descriptors")].value.max()

Ruby: File descriptors opened, min

Minimum number of opened file descriptors.

Dependent item gitlab.ruby.file_descriptors.min

Preprocessing

  • JSON Path: $[?(@.name=="ruby_file_descriptors")].value.min()

Ruby: File descriptors, max

Maximum number of open file descriptors per process.

Dependent item gitlab.ruby.process_max_fds

Preprocessing

  • JSON Path: $[?(@.name=="ruby_process_max_fds")].value.avg()

Ruby: RSS memory, avg

Average RSS Memory usage in bytes.

Dependent item gitlab.ruby.process_resident_memory_bytes.avg

Preprocessing

  • JSON Path: The text is too long. Please see the template.

Ruby: RSS memory, min

Minimum RSS Memory usage in bytes.

Dependent item gitlab.ruby.process_resident_memory_bytes.min

Preprocessing

  • JSON Path: The text is too long. Please see the template.

Ruby: RSS memory, max

Maximum RSS Memory usage in bytes.

Dependent item gitlab.ruby.process_resident_memory_bytes.max

Preprocessing

  • JSON Path: The text is too long. Please see the template.

HTTP requests rate, total

Number of requests received into the system.

Dependent item gitlab.http.requests.rate

Preprocessing

  • JSON Path: $[?(@.name=="http_requests_total")].value.sum()

  • Change per second
HTTP requests rate, 5xx

Number of requests that failed with an HTTP 5xx code.

Dependent item gitlab.http.requests.5xx.rate

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    ⛔️Custom on fail: Discard value

  • Change per second
HTTP requests rate, 4xx

Number of requests that failed with an HTTP 4xx code.

Dependent item gitlab.http.requests.4xx.rate

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    ⛔️Custom on fail: Discard value

  • Change per second
Transactions per second

Transactions per second (gitlab_transaction_* metrics).

Dependent item gitlab.transactions.rate

Preprocessing

  • JSON Path: The text is too long. Please see the template.

    ⛔️Custom on fail: Discard value

  • Change per second
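
Many of the dependent items above share one pattern: a JSON Path step such as $[?(@.name=="http_requests_total")].value.sum() selects and aggregates records from the master item, and an optional "Change per second" step turns a growing counter into a rate. The Python sketch below imitates that pattern under stated assumptions: the helper names and the sample snapshots are hypothetical, the records use the reduced name/value shape referenced by the JSON Path expressions, and the rate formula (value - prev_value) / (time - prev_time) is a simplified reading of the Zabbix step.

```python
def values_by_name(records, name):
    """Imitate $[?(@.name=="<name>")].value: collect values of one metric."""
    return [float(r["value"]) for r in records if r["name"] == name]


def change_per_second(prev_value, prev_time, value, time):
    """Rate between two samples; None when no usable rate exists."""
    if time <= prev_time or value < prev_value:
        return None  # counter reset or bad clock: wait for the next sample
    return (value - prev_value) / (time - prev_time)


# Two hypothetical snapshots of the master item, taken 60 seconds apart.
snapshot_t0 = [
    {"name": "http_requests_total", "value": "100"},
    {"name": "http_requests_total", "value": "20"},
]
snapshot_t1 = [
    {"name": "http_requests_total", "value": "160"},
    {"name": "http_requests_total", "value": "20"},
    {"name": "ruby_file_descriptors", "value": "40"},
    {"name": "ruby_file_descriptors", "value": "60"},
]

# Aggregates in the style of .avg(), .max(), and .min().
fds = values_by_name(snapshot_t1, "ruby_file_descriptors")
fd_stats = {"avg": sum(fds) / len(fds), "max": max(fds), "min": min(fds)}

# .sum() followed by "Change per second".
total_t0 = sum(values_by_name(snapshot_t0, "http_requests_total"))
total_t1 = sum(values_by_name(snapshot_t1, "http_requests_total"))
rate = change_per_second(total_t0, 0, total_t1, 60)

print(fd_stats)  # {'avg': 50.0, 'max': 60.0, 'min': 40.0}
print(rate)      # 1.0
```

Summing before computing the rate means that the stored value reflects the whole instance rather than a single labeled series.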

Triggers

Name Description Expression Severity Dependencies and additional info
Gitlab instance is not able to accept traffic last(/GitLab by HTTP/gitlab.readiness)=0 High Depends on:
  • Liveness check was failed
Liveness check was failed

The application server is not running or Rails Controllers are deadlocked.

last(/GitLab by HTTP/gitlab.liveness)=0 High
Version has changed

The GitLab version has changed. Acknowledge to close the problem manually.

last(/GitLab by HTTP/gitlab.deployments.version,#1)<>last(/GitLab by HTTP/gitlab.deployments.version,#2) and length(last(/GitLab by HTTP/gitlab.deployments.version))>0 Info Manual close: Yes
Too many Redis queues client exceptions

Too many Redis client exceptions during the requests to Redis instance queues.

min(/GitLab by HTTP/gitlab.redis.client_exceptions.queues.rate,5m)>{$GITLAB.REDIS.FAIL.MAX.WARN} Warning
Too many Redis cache client exceptions

Too many Redis client exceptions during the requests to Redis instance cache.

min(/GitLab by HTTP/gitlab.redis.client_exceptions.cache.rate,5m)>{$GITLAB.REDIS.FAIL.MAX.WARN} Warning
Too many Redis shared_state client exceptions

Too many Redis client exceptions during the requests to Redis instance shared_state.

min(/GitLab by HTTP/gitlab.redis.client_exceptions.shared_state.rate,5m)>{$GITLAB.REDIS.FAIL.MAX.WARN} Warning
Failed to fetch info data

Zabbix has not received metrics data for the last 30 minutes.

nodata(/GitLab by HTTP/gitlab.ruby.threads_running,30m)=1 Warning Manual close: Yes
Depends on:
  • Liveness check was failed
Current number of open files is too high min(/GitLab by HTTP/gitlab.ruby.file_descriptors.max,5m)/last(/GitLab by HTTP/gitlab.ruby.process_max_fds)*100>{$GITLAB.OPEN.FDS.MAX.WARN} Warning
Too many HTTP requests failures

Too many requests failed on GitLab instance with 5xx HTTP code.

min(/GitLab by HTTP/gitlab.http.requests.5xx.rate,5m)>{$GITLAB.HTTP.FAIL.MAX.WARN} Warning
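
The readiness and liveness triggers compare preprocessed 0/1 values, and the readiness trigger depends on the liveness one. The Python sketch below walks through that chain under stated assumptions: the sample response body is hypothetical, the true/false word lists are an abbreviated stand-in for the Zabbix "Boolean to decimal" step, and the dependency is modeled in a simplified way (the dependent problem is hidden while the liveness problem is active).

```python
import json

# Abbreviated word lists; the real preprocessing step accepts more values.
TRUE_STRINGS = {"true", "yes", "on", "up", "ok"}
FALSE_STRINGS = {"false", "no", "off", "down", "err"}


def boolean_to_decimal(text):
    """Map a status word to 1 or 0; raise if the word is not recognized."""
    lowered = text.strip().lower()
    if lowered in TRUE_STRINGS:
        return 1
    if lowered in FALSE_STRINGS:
        return 0
    raise ValueError("not a boolean value: " + text)


def readiness_value(body):
    """Apply the readiness item's steps to a response body (None = error)."""
    if body is None:
        # "Check for not supported value" with custom on fail.
        body = '{"master_check":[{"status":"failed"}]}'
    status = json.loads(body)["master_check"][0]["status"]  # JSON Path step
    try:
        return boolean_to_decimal(status)
    except ValueError:
        return 0  # custom on fail: set value to 0


def visible_problems(readiness, liveness):
    """List trigger names that would be shown, honoring the dependency."""
    problems = []
    liveness_failed = liveness == 0
    if liveness_failed:
        problems.append("Liveness check was failed")
    if readiness == 0 and not liveness_failed:
        problems.append("Gitlab instance is not able to accept traffic")
    return problems


ok_body = '{"status":"ok","master_check":[{"status":"ok"}]}'  # hypothetical
print(readiness_value(ok_body), readiness_value(None))
print(visible_problems(readiness=0, liveness=0))
print(visible_problems(readiness=0, liveness=1))
```

The word "failed" is not a recognized boolean here, which is why the fallback value of 0 is what ultimately marks the instance as not ready.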

LLD rule Unicorn metrics discovery

Name Description Type Key and additional info
Unicorn metrics discovery

Discovery of Unicorn-specific metrics when Unicorn is used.

HTTP agent gitlab.unicorn.discovery

Preprocessing

  • Prometheus to JSON: unicorn_workers

    ⛔️Custom on fail: Discard value

  • JavaScript: The text is too long. Please see the template.

Item prototypes for Unicorn metrics discovery

Name Description Type Key and additional info
Unicorn: Workers

The number of Unicorn workers.

Dependent item gitlab.unicorn.unicorn_workers[{#SINGLETON}]

Preprocessing

  • JSON Path: $[?(@.name=='unicorn_workers')].value.sum()

Unicorn: Active connections

The number of active Unicorn connections.

Dependent item gitlab.unicorn.active_connections[{#SINGLETON}]

Preprocessing

  • JSON Path: $[?(@.name=='unicorn_active_connections')].value.sum()

Unicorn: Queued connections

The number of queued Unicorn connections.

Dependent item gitlab.unicorn.queued_connections[{#SINGLETON}]

Preprocessing

  • JSON Path: $[?(@.name=='unicorn_queued_connections')].value.sum()

Trigger prototypes for Unicorn metrics discovery

Name Description Expression Severity Dependencies and additional info
Unicorn worker utilization is too high min(/GitLab by HTTP/gitlab.unicorn.active_connections[{#SINGLETON}],5m)/last(/GitLab by HTTP/gitlab.unicorn.unicorn_workers[{#SINGLETON}])*100>{$GITLAB.UNICORN.UTILIZATION.MAX.WARN} Warning
Unicorn is queueing requests min(/GitLab by HTTP/gitlab.unicorn.queued_connections[{#SINGLETON}],5m)>{$GITLAB.UNICORN.QUEUE.MAX.WARN} Warning

LLD rule Puma metrics discovery

Name Description Type Key and additional info
Puma metrics discovery

Discovery of Puma-specific metrics when Puma is used.

HTTP agent gitlab.puma.discovery

Preprocessing

  • Prometheus to JSON: puma_workers

  • JavaScript: The text is too long. Please see the template.

Item prototypes for Puma metrics discovery

Name Description Type Key and additional info
Active connections

Number of puma threads processing a request.

Dependent item gitlab.puma.active_connections[{#SINGLETON}]

Preprocessing

  • JSON Path: $[?(@.name=='puma_active_connections')].value.sum()

Workers

Total number of puma workers.

Dependent item gitlab.puma.workers[{#SINGLETON}]

Preprocessing

  • JSON Path: $[?(@.name=='puma_workers')].value.sum()

Running workers

The number of booted puma workers.

Dependent item gitlab.puma.running_workers[{#SINGLETON}]

Preprocessing

  • JSON Path: $[?(@.name=='puma_running_workers')].value.sum()

Stale workers

The number of old puma workers.

Dependent item gitlab.puma.stale_workers[{#SINGLETON}]

Preprocessing

  • JSON Path: $[?(@.name=='puma_stale_workers')].value.sum()

Running threads

The number of running puma threads.

Dependent item gitlab.puma.running[{#SINGLETON}]

Preprocessing

  • JSON Path: $[?(@.name=='puma_running')].value.sum()

Queued connections

The number of connections in that puma worker's "todo" set waiting for a worker thread.

Dependent item gitlab.puma.queued_connections[{#SINGLETON}]

Preprocessing

  • JSON Path: $[?(@.name=='puma_queued_connections')].value.sum()

Pool capacity

The number of requests the puma worker is capable of taking right now.

Dependent item gitlab.puma.pool_capacity[{#SINGLETON}]

Preprocessing

  • JSON Path: $[?(@.name=='puma_pool_capacity')].value.sum()

Max threads

The maximum number of puma worker threads.

Dependent item gitlab.puma.max_threads[{#SINGLETON}]

Preprocessing

  • JSON Path: $[?(@.name=='puma_max_threads')].value.sum()

Idle threads

The number of spawned puma threads which are not processing a request.

Dependent item gitlab.puma.idle_threads[{#SINGLETON}]

Preprocessing

  • JSON Path: $[?(@.name=='puma_idle_threads')].value.sum()

Killer terminations, total

The number of workers terminated by PumaWorkerKiller.

Dependent item gitlab.puma.killer_terminations_total[{#SINGLETON}]

Preprocessing

  • JSON Path: $[?(@.name=='puma_killer_terminations_total')].value.sum()

    ⛔️Custom on fail: Discard value

Trigger prototypes for Puma metrics discovery

Name Description Expression Severity Dependencies and additional info
Puma instance thread utilization is too high min(/GitLab by HTTP/gitlab.puma.active_connections[{#SINGLETON}],5m)/last(/GitLab by HTTP/gitlab.puma.max_threads[{#SINGLETON}])*100>{$GITLAB.PUMA.UTILIZATION.MAX.WARN} Warning
Puma is queueing requests min(/GitLab by HTTP/gitlab.puma.queued_connections[{#SINGLETON}],15m)>{$GITLAB.PUMA.QUEUE.MAX.WARN} Warning
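
The two Puma trigger prototypes use min() over a time window, so a problem is raised only when every sample in the window is above the threshold. The Python sketch below shows that logic under stated assumptions: the function names and sample values are hypothetical, the sample lists stand in for the values collected during the 5m and 15m windows, and the default thresholds are the macro defaults listed in this document.

```python
def utilization_is_too_high(active_samples, max_threads, warn_percent=90):
    """min(active, window) / last(max_threads) * 100 > threshold."""
    if not active_samples or max_threads <= 0:
        return False  # nothing to evaluate
    return min(active_samples) / max_threads * 100 > warn_percent


def is_queueing(queued_samples, warn_count=1):
    """min(queued, window) > threshold."""
    if not queued_samples:
        return False
    return min(queued_samples) > warn_count


# Sustained load: the lowest sample is 15 of 16 threads (93.75 percent).
print(utilization_is_too_high([15, 16, 15, 16, 15], max_threads=16))  # True

# A single dip to 4 active threads keeps the trigger from firing.
print(utilization_is_too_high([15, 16, 4, 16, 15], max_threads=16))   # False

# The queue must stay above the threshold for the whole window.
print(is_queueing([2, 3, 2]))  # True
print(is_queueing([2, 0, 3]))  # False
```

Using the minimum rather than the latest value trades slower detection for fewer alerts on short spikes.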

Feedback

Please report any issues with the template at https://support.zabbix.com

You can also provide feedback, discuss the template, or ask for help at the Zabbix forums.
