High-level alerting metrics
Sourcegraph’s metrics include a single high-level metric, alert_count, which indicates the number of level=critical and level=warning alerts each Sourcegraph service has fired over time. This is the same metric presented on the Overview Grafana dashboard.
To set up notifications for these alerts, see: alerting.
Description: The number of alerts each service has fired and their severity level. The severity levels are defined as follows:
- critical: something is definitely wrong with Sourcegraph. We suggest using a high-visibility notification channel for these alerts.
  - Examples: Database inaccessible, running out of disk space, running out of memory.
  - Suggested action: Page a site administrator to investigate.
- warning: something could be wrong with Sourcegraph. We suggest checking in on these periodically, or using a notification channel that will not bother anyone if it is spammed. Over time, as warning alerts become stable and reliable across many Sourcegraph deployments, they will be promoted to critical alerts in a Sourcegraph update.
  - Examples: High latency, high search timeouts.
  - Suggested action: Email a site administrator to investigate and monitor when convenient, and please let us know so that we can improve these alerts.
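The suggested actions for the two severity levels can be sketched as a small routing table. This is only an illustration under stated assumptions: the channel names "pager" and "email" and the function choose_channel are hypothetical and are not Sourcegraph settings.

```python
# Hypothetical mapping from alert severity to a notification channel.
# The channel names are placeholders chosen for this sketch.
CHANNEL_BY_LEVEL = {
    "critical": "pager",  # high-visibility: page a site administrator
    "warning": "email",   # low-urgency: review when convenient
}


def choose_channel(level: str) -> str:
    """Return the notification channel for a severity level."""
    try:
        return CHANNEL_BY_LEVEL[level]
    except KeyError:
        # Fail loudly on anything other than the two documented levels.
        raise ValueError(f"unknown severity level: {level!r}") from None
```

Keeping the mapping in one place means a warning alert that is later promoted to critical changes channel without any code changes.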
Values:
- Although the values of alert_count are floating-point numbers, only their whole-number parts have meaning. For example, 0.7 indicates no alerts are firing, 1.2 indicates exactly one alert is firing, and 3.0 indicates exactly three alerts are firing.
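The whole-number rule amounts to rounding each sample down. A minimal sketch, where the helper name firing_alerts is hypothetical:

```python
import math


def firing_alerts(alert_count_value: float) -> int:
    """Interpret an alert_count sample: only the whole-number part counts."""
    return math.floor(alert_count_value)


print(firing_alerts(0.7))  # 0: no alerts are firing
print(firing_alerts(1.2))  # 1: exactly one alert is firing
print(firing_alerts(3.0))  # 3: exactly three alerts are firing
```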
Labels:
- level: the severity of the alert, either critical or warning, as defined above.
- service_name: the name of the service that fired the alert, one of a fixed set of constants.
- name: the name of the alert that the service fired (chosen by the service).
- description: a human-readable description of the alert.
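To show how these labels might be read in practice, the sketch below summarizes a Prometheus instant-query response for alert_count. It assumes the metric is fetched through the standard Prometheus HTTP API (/api/v1/query), whose JSON shape is used here. The function summarize_alerts and every value in the sample payload (service name, alert names, timestamps) are made up for illustration.

```python
import math
from typing import Any, Dict, List


def summarize_alerts(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn a Prometheus instant-query response for alert_count into rows."""
    rows = []
    for series in response["data"]["result"]:
        labels = series["metric"]
        _timestamp, raw_value = series["value"]  # the value arrives as a string
        firing = math.floor(float(raw_value))
        if firing < 1:
            continue  # values below 1 mean nothing is firing
        rows.append({
            "level": labels.get("level"),
            "service_name": labels.get("service_name"),
            "name": labels.get("name"),
            "description": labels.get("description"),
            "firing": firing,
        })
    return rows


# Made-up payload in the shape of a Prometheus /api/v1/query response.
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "__name__": "alert_count",
                    "level": "critical",
                    "service_name": "example-service",
                    "name": "example_alert",
                    "description": "example description",
                },
                "value": [1700000000.0, "1.2"],
            },
            {
                "metric": {
                    "__name__": "alert_count",
                    "level": "warning",
                    "service_name": "example-service",
                    "name": "another_example",
                    "description": "another example description",
                },
                "value": [1700000000.0, "0.7"],
            },
        ],
    },
}

for row in summarize_alerts(sample):
    print(row["level"], row["service_name"], row["name"], row["firing"])
# prints: critical example-service example_alert 1
```

Only the first series is reported, because the second has a value below 1 and therefore represents no firing alert.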
To get examples of how you might consume this metric in your own alerting system, see: Custom consumption of Sourcegraph alerts.
A complete reference of Sourcegraph’s vast set of Prometheus metrics is not yet available. If you are interested in this, please reach out by filing an issue or contacting us at [email protected]