Monitoring and Alerting System
This document provides a brief overview of the LeanXcale Monitoring and Alerting System, as well as a practical guide to creating new alerting and recording rules in Prometheus.
1. Architecture
LeanXcale monitoring is built on two open-source solutions: Prometheus and Grafana.
The following image shows the architecture of components and how they are logically grouped to be started and stopped.
The LeanXcale Monitoring and Alerting System consists of the following elements:
- Prometheus: Centralizes and stores metrics in a time series database.
- Alert Manager: Triggers alarms based on the information stored in Prometheus (it is also part of the Prometheus software).
- Exporters: Run on the monitored hosts and export sets of metrics to feed Prometheus:
  - Mastermind exporter: Publishes metrics from the following Lx components: Snapshot Server, Commit Sequencer and Config Manager.
  - Node Exporter: Comes by default with the Prometheus installation and exports many metrics about machine performance.
  - Metrics Exporter: Exports metrics about Lx components at two levels:
    - The Lx component running Java code inside the JVM.
    - The Lx component processes at OS level (I/O, memory, …)
In addition, it is possible to feed Prometheus with additional metrics. It only requires key-value files placed in the /tmp/scrap path. These are commonly used to export metrics resulting from the execution of specific processes. For example, this is the file generated in the TPCC benchmarking tests:
appuser@5c050e45b4d0:/tmp/scrap$ cat escada.prom
tpccavglatency 101
tpmCs 28910
aborttpmCs 124
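A process can generate such a file with a small script. The following minimal Bash sketch writes one; the /tmp/scrap path is the one mentioned above, while the file name and metric names (myjob.prom, myjob_duration_seconds, myjob_records_total) are hypothetical:
#!/bin/bash
# Minimal sketch: publish key-value metrics for Prometheus in /tmp/scrap.
SCRAP_DIR=/tmp/scrap
TMP_FILE=$(mktemp "$SCRAP_DIR/myjob.prom.XXXXXX")
# Values computed by your own process would go here.
cat > "$TMP_FILE" <<EOF
myjob_duration_seconds 12.7
myjob_records_total 28910
EOF
# Rename atomically so a half-written file is never scraped.
mv "$TMP_FILE" "$SCRAP_DIR/myjob.prom"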
Monitoring is started when the cluster is started, and the central components (Prometheus and Grafana) are deployed on the metadata server.
2. Starting Monitor
From now on, let’s consider BASEDIR as the path where your LeanXcale database is installed.
As shown in the diagram in the previous section, there are two logical components involved in monitoring and alerting, which group the other components shown there:
- MonS (Monitor Server Components): Includes Prometheus, Grafana and Alert Manager.
- MonC (Monitor Client Components): Includes Node Exporter and Metrics Exporter. The Mastermind Metrics Exporter is started when the Mastermind component itself is started.
To start both, you just need to run the following commands:
source $BASEDIR/env.sh
startLX Mon
You can also start the server and client monitoring components individually by executing the following commands on the machines where they are located:
lxManageNode start MonC
lxManageNode start MonS
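Once the server components are up, you can quickly verify that they answer on their ports. This is a minimal sketch using curl; it assumes the Prometheus (9091) and Grafana (3000) ports used later in this guide and that metadata_server resolves to your metadata server:
# Prometheus exposes a simple health endpoint.
curl -s http://metadata_server:9091/-/healthy
# Grafana should answer with an HTTP status code on its console port.
curl -s -o /dev/null -w "%{http_code}\n" http://metadata_server:3000/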
3. LeanXcale Monitoring dashboard
Grafana is included in the LeanXcale distribution. You can access the Grafana console at the following URL:
http://metadata_server:3000/
In the Dashboards section, you will find a default LeanXcale Dashboard that shows operating metrics, the health status of the components and alert information. All this graphical information gives you an overall view of the status of your database.
4. Prometheus configuration
When the monitoring components are started for the first time, Prometheus is automatically configured. This configuration is persisted in:
$BASEDIR/monitor/prometheus-package/prometheus.yml
This configuration is also accessible from the Configuration section in Prometheus console:
http://metadata_server:9091
It is also worth having a look at the Targets section. There, you can check whether your metrics exporters are up and running.
As you can guess, lx-configuration corresponds to the Mastermind Metrics Exporter, lx-metrics to the Metrics Exporter, and lx-server to the Node Exporter.
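Besides the console, the same target information can be retrieved from the Prometheus HTTP API, which is handy for scripting health checks of the exporters. A minimal sketch, assuming the Prometheus port shown above:
# Lists all scrape targets and their health (up/down) as JSON.
# The python call is optional and only pretty-prints the output.
curl -s http://metadata_server:9091/api/v1/targets | python -m json.tool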
4.1. Prometheus rules
As you can see in the previous config file, rules are included in YAML files that Prometheus finds in the rule_files tag values. All the rules defined in all the files can be checked in the Prometheus console (http://metadata_server:9091/rules).
Prometheus supports two types of rules, which may be configured and then evaluated at regular intervals (remember that their results are stored in Prometheus as time series data).
4.1.1. Recording rules
Recording rules allow you to precompute frequently needed or computationally expensive expressions and save their result as a new set of time series. Querying the precomputed result will then often be much faster than executing the original expression every time it is needed. This can help to improve the performance of Grafana dashboards.
It is important to note that there is no firing action for these rules; they are just calculated and stored in Prometheus in order to be queried or collected for Grafana dashboards.
At this moment, there are a few recording rules in the LeanXcale installation. You can find them in:
$BASEDIR/monitor/prometheus-package/basic_rules.yml
Note that rules are distributed in groups whose name should be unique in the file.
The typical syntax for a recording rule is composed of two mandatory tags (record and expr). An optional tag is interval.
Let’s take an example from this file:
groups:
- name: cluster metrics
interval: 20s # How often rules in the group are evaluated (overrides global config)
rules:
- record: instance:cpu_busy:avg_rate1m # Rule name: There is no mandatory way in which you must name the rules. But *level:metric:operations* is recommended.
expr: 100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[1m])) * 100) # expression to be evaluated to add a new value point to the time series
The rule in the example calculates an average rate of CPU usage based on the metric called node_cpu_seconds_total. Let’s see an example of how this metric is shown at
http://metadata_server:9101/metrics
# HELP node_cpu_seconds_total Seconds the cpus spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 1276.51
node_cpu_seconds_total{cpu="0",mode="iowait"} 1.72
node_cpu_seconds_total{cpu="0",mode="irq"} 0
node_cpu_seconds_total{cpu="0",mode="nice"} 3.02
node_cpu_seconds_total{cpu="0",mode="softirq"} 44.63
node_cpu_seconds_total{cpu="0",mode="steal"} 0
node_cpu_seconds_total{cpu="0",mode="system"} 17.02
node_cpu_seconds_total{cpu="0",mode="user"} 42.66
node_cpu_seconds_total{cpu="1",mode="idle"} 1273.47
This metric has two labels, cpu and mode. These labels can be referenced from the expression to select the specific series being evaluated.
To know more about Querying and Expression Syntax in Prometheus, please check (https://prometheus.io/docs/prometheus/latest/querying/basics/).
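If you want to experiment with expressions outside the console, the Prometheus HTTP query API evaluates them as well. A minimal sketch with curl, assuming the Prometheus port used above:
# Query the series produced by the recording rule defined above.
curl -s 'http://metadata_server:9091/api/v1/query' \
  --data-urlencode 'query=instance:cpu_busy:avg_rate1m'
# Evaluate an ad-hoc expression that uses the labels to select specific series.
curl -s 'http://metadata_server:9091/api/v1/query' \
  --data-urlencode 'query=avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[1m]))'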
5. Creating a New Alert
If you want to create a new alert:
- If the alert is to be generated based on Prometheus host metrics, you just need to know the metric and the expression to be evaluated. Go to the Alerting rules section.
- You may want to create an alert based on a precomputed operation over a metric. To read about that, please check here.
- If you want to send an alert directly to the Alert Manager, have a look at the LeanXcale standard format for manual alerts here, and consider the two options described there (Bash scripts and Java/Log4j).
- Note that alerts can also be configured to override the global configuration by setting a time delay to wait from the moment the expression condition is fulfilled until the effective firing of the alert. Check here.
5.1. Alerting rules
Alerting rules allow you to define alert conditions based on Prometheus expression language expressions and to send notifications about firing alerts to an external service.
At this moment, alerting rules are defined on the following file:
$BASEDIR/monitor/prometheus-package/alert.rules.yml
The syntax is quite similar to that of recording rules, although there are some new tags.
groups:
- name: example
rules:
- alert: HighErrorRate # Alert name
expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5 # Expression that sets the condition for the alert to be fired
for: 10m # causes Prometheus to wait for a certain duration between first encountering a new expression output vector element and counting an alert as firing for this element. In this case, Prometheus will check that the alert continues to be active during each evaluation for 10 minutes before firing the alert.
labels: # Set of additional labels to be attached to the alert.
severity: page
annotations: # specifies a set of informational labels that can be used to store longer additional information such as alert descriptions or runbook links
summary: "High request latency on {{ $labels.instance }}"
description: "{{ $labels.instance }} has a median request latency above 0.5s (current value: {{ $value }}s)"
In labels and annotations, there are two variables that can be used:
- $labels: Holds the label key/value pairs of an alert instance.
- $value: Holds the evaluated value of an alert instance (that is, the value that fires the alert).
If you set a time period in the for tag, it causes Prometheus to wait for a certain duration between first encountering a new expression output vector element and counting an alert as firing for this element. In the example above, Prometheus will check that the alert continues to be active during each evaluation for 10 minutes before firing it.
IMPORTANT!!!
It is not recommended to use the $value variable in the labels tag. The reason is that the labels tag would then include a label called value, which will usually vary from one evaluation to the next. The effect of this is that each evaluation will see a brand-new alert and treat the previous one as no longer firing. Since the labels of an alert define its identity, the for condition will never be satisfied.
Instead, the $value variable should be used in the annotations tag. Let’s see two examples:
# INCORRECT: THIS ALERT MAY NEVER BE FIRED
groups:
- name: example
rules:
- alert: ExampleAlertIncorrect
expr: metric > 10
for: 5m
labels:
severity: page
value: "{{ $value }}" ######## DON'T INCLUDE $value HERE!
annotations:
summary: "Instance {{ $labels.instance }}'s metric is too high"
# CORRECT ALERT
- alert: ExampleAlert
expr: metric > 10
for: 5m
labels:
severity: page
annotations:
summary: "Instance {{ $labels.instance }}'s metric is {{ $value }}"Combining recording and alerting rules**
5.2. Alerting rules based on recording rules
Sometimes it could be useful in LeanXcale procedures to set alerts based on complex precomputed metrics. Let’s see an example:
groups:
- name: recording_rules
interval: 5s
rules:
- record: node_exporter:node_filesystem_free:fs_used_percents
expr: 100 - 100 * ( node_filesystem_free{mountpoint="/"} / node_filesystem_size{mountpoint="/"} )
- name: alerting_rules
rules:
- alert: DiskSpace10%Free
expr: node_exporter:node_filesystem_free:fs_used_percents >= 90
# Note that previous expression evaluates the metric defined in the recording rule.
labels:
severity: moderate
annotations:
summary: "Instance {{ $labels.instance }} is low on disk space"
description: "{{ $labels.instance }} has only {{ $value }}% free."
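Once such a rule file is loaded, you can ask Prometheus itself which alerts are currently active: Prometheus exposes every active alert as a built-in ALERTS time series. A minimal sketch, assuming the Prometheus port used earlier in this guide:
# Shows all alerts that are currently in the firing state.
curl -s 'http://metadata_server:9091/api/v1/query' \
  --data-urlencode 'query=ALERTS{alertstate="firing"}'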
6. Testing rules file syntax
Prometheus provides a tool to check if your rule files are syntactically correct after modifying them.
In the LeanXcale installation, this tool is in:
$BASEDIR/monitor/prometheus-package/promtool
Usage example:
./promtool check rules alert.rules.yml
Checking alert.rules.yml
SUCCESS: 5 rules found
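You can pass several rule files to the same command, and most promtool versions can also validate the main Prometheus configuration together with the rule files it references; treat the exact subcommands below as an assumption to verify against your promtool version:
# Check both rule files shipped with the installation.
./promtool check rules alert.rules.yml basic_rules.yml
# Validate prometheus.yml and the rule_files it points to.
./promtool check config prometheus.yml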
7. Sending manually an alert to Alert Manager
7.1. Manual Alerts format
A manual alert contains the following tags:
- status: The alert’s activation state (firing, resolved).
- labels:
  - alertname: Unique ID for the alert.
  - alertType: The alert type; the idea is to set a classification for alerts based on the possible anomalous situations that can happen in the different LeanXcale components.
  - severity: Severity level (critical, warning).
  - component: LeanXcale component that sends the alert to the Alert Manager (KVDS, LgLTM, QE, ZK, LgSnS, LgCmS, MtM, CflM, KVMS).
  - serviceIp: IP/Hostname:Port where the component that fires the alert is running.
- annotations:
  - summary: A descriptive message of the problem.
An example of an alert to be fired would look like this:
[{
"status": "firing",
"labels": {
"alertname": "1234567890",
"alertType": "alertType1",
"severity": "warning",
"component": "KVDS",
"serviceIp": "172.17.0.2:9993"},
"annotations": {
"summary": "Alarm message"}
}]
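The sendAlert.sh script described in the next section wraps this call for you, but the same JSON array can also be POSTed directly to the Alert Manager HTTP endpoint used elsewhere in this guide (host and port taken from the example above):
# Minimal sketch: fire the example alert by POSTing it to the Alert Manager API.
curl -s -X POST http://172.17.0.2:9093/api/v1/alerts \
  -H 'Content-Type: application/json' \
  -d '[{"status": "firing",
        "labels": {"alertname": "1234567890", "alertType": "alertType1",
                   "severity": "warning", "component": "KVDS",
                   "serviceIp": "172.17.0.2:9993"},
        "annotations": {"summary": "Alarm message"}}]'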
7.2. Alerts from Bash scripts
It is possible to send an alert to the Alert Manager using the sendAlert.sh script included in the LeanXcale installation. You can find it in:
$BASEDIR/monitor/sendAlert.sh
This script looks up the host and port where the Alert Manager is running in the Prometheus config file.
./sendAlert.sh <firing|resolved> # Alert status
<warning|critical> # Alert severity
<AlertId> # Alert unique ID
<AlertType> # Alert Type (indicates an anomalous situation)
<Component> # Lx Component that fires alarm
<ServiceIP> # hostname/IP:Port
"<Message>"
Let’s see how we would fire the example alert:
./sendAlert.sh firing warning 1234567890 alertType1 KVDS "172.17.0.2:9993" "Alarm message"
There is a script in the LeanXcale installation that can help you check whether this is really working (you should have started the LeanXcale Monitor previously; see the first section). It takes one optional argument that indicates how much of the alert information you want to see:
- simple (default): Just the alertname, start datetime and summary message.
- extended: All the alert’s tags in tabular format.
- json: All the alert’s tags (even internal ones) in JSON format.
$BASEDIR/monitor/alarm_console.sh [simple|extended|json]
The previously fired alert will be shown like this:
Alertname Starts At Summary
1234567890 2019-05-29 09:15:36 UTC Alarm message
We can also see the alert in the Alert Manager console.
To resolve the alert, just run the script again with the resolved status:
./sendAlert.sh resolved warning 1234567890 alertType1 KVDS "172.17.0.2:9993" "Alarm message"
7.3. Alerts from Java (Log4j appender)
It is also possible to send alerts manually to the Alert Manager from your Java code. You will need to set a new appender and a new logger in the log4j2 properties file located in:
.$BASEDIR-BIN/templates/tm/log4j2.properties
appender.http.type=Http
appender.http.name=http
appender.http.layout.type=PatternLayout
appender.http.layout.pattern=%m
appender.http.url=http://172.17.0.2:9093/api/v1/alerts
...
#Alert Manager
logger.alertmanager.name=alertmanager
logger.alertmanager.level=INFO
logger.alertmanager.appenderRefs=alertmanager
logger.alertmanager.appenderRef.alertmanager.ref=async
In this project, an AlertSender.java class is defined. It contains a static method that allows you to send a JSON alert message:
public static void sendAlert(Severity severity, String idAlarm, Status status, String message);
You can find a usage example in:
.$BASEDIR-dependencies/TM/common/src/test/java/com/leanxcale/alert/TestAlertSender.java
8. Querying the Alert Manager
It is also possible to get alert information from the Alert Manager in an easy way using amtool. Let’s say we have the following alerts active:
Labels  Annotations  Starts At  Ends At  Generator URL
alertType="alertType1" alertname="1234567890" component="KVDS" serviceIp="172.17.0.2:9993" severity="warning" summary="Alarm message" 2019-05-29 09:25:29 UTC 2019-05-29 09:30:29 UTC
alertType="alertType1" alertname="1234567891" component="KVDS" serviceIp="172.17.0.2:9993" severity="warning" summary="Alarm message" 2019-05-29 09:25:35 UTC 2019-05-29 09:30:35 UTC
alertType="alertType1" alertname="1234567892" component="KVDS" serviceIp="172.17.0.2:9993" severity="warning" summary="Alarm message" 2019-05-29 09:25:43 UTC 2019-05-29 09:30:43 UTC
alertType="alertType2" alertname="1234567893" component="KVDS" serviceIp="172.17.0.2:9993" severity="warning" summary="Alarm message" 2019-05-29 09:26:05 UTC 2019-05-29 09:31:05 UTC
alertType="alertType2" alertname="1234567894" component="KVDS" serviceIp="172.17.0.2:9993" severity="warning" summary="Alarm message" 2019-05-29 09:26:10 UTC 2019-05-29 09:31:10 UTC
In the first example, we ask the Alert Manager for the alert with alertname=1234567890 and in the second one, for all the alerts with alertType=alertType2:
appuser@5c050e45b4d0:$BASEDIR/monitor$ alertmanager-package/amtool --alertmanager.url=http://172.17.0.2:9093 alert query -o extended alertname="1234567890"
Labels Annotations Starts At Ends At Generator URL
alertType="alertType1" alertname="1234567890" component="KVDS" serviceIp="172.17.0.2:9993" severity="warning" summary="Alarm message" 2019-05-29 09:25:29 UTC 2019-05-29 09:30:29 UTC
appuser@5c050e45b4d0:$BASEDIR/monitor$ alertmanager-package/amtool --alertmanager.url=http://172.17.0.2:9093 alert query -o extended alertType="alertType2"
Labels Annotations Starts At Ends At Generator URL
alertType="alertType2" alertname="1234567893" component="KVDS" serviceIp="172.17.0.2:9993" severity="warning" summary="Alarm message" 2019-05-29 09:26:05 UTC 2019-05-29 09:31:05 UTC
alertType="alertType2" alertname="1234567894" component="KVDS" serviceIp="172.17.0.2:9993" severity="warning" summary="Alarm message" 2019-05-29 09:26:10 UTC 2019-05-29 09:31:10 UTC
9. Alerts
This is the detailed list of the events that are centralized in the alert system:
9.1. Query Engine events
Alert Identifier | Type | Description |
---|---|---|
START_SERVER | warning(info) | Indicates a server has been started. It shows the server address as IP:PORT |
START_SERVER_LTM | warning(info) | Indicates the LTM is started and the QE is connected to a Zookeeper instance |
START_SERVER_ERROR | critical | There has been an error while starting the server and/or the LTM |
STOP_SERVER | warning(info) | Indicates the server has been stopped |
SESSION_CONSISTENCY_ERROR | warning | Error when setting session consistency at the LTM |
DIRTY_METADATA | warning | The metadata has been updated but the QE couldn’t read the new changes |
CANNOT_PLAN | warning | The QE couldn’t come up with an execution plan for the query |
FORCE_CLOSE_CONNECTION | warning(info) | A connection was explicitly closed by a user |
FORCE_ROLLBACK | warning(info) | A connection was explicitly rolled back by a user |
FORCED_ROLLBACK_FAILED | warning | Forced rollback failed |
FORCE_CANCEL_TRANSACTION | warning(info) | Forced transaction cancellation. The transaction was not associated with any connection |
KV_DIAL_BEFORE | critical | QE is not connected to the DS |
KV_NOT_NOW | warning | DS error: not now |
KV_ABORT | warning(info) | DS error: aborted by user |
KV_ADDR | warning | DS error: bad address |
KV_ARG | warning | DS error: bad argument |
KV_BAD | warning | DS error: corrupt |
KV_AUTH | warning | DS error: auth failed |
KV_BUG | warning | DS error: not implemented |
KV_CHANGED | warning | DS error: resource removed or format changed |
KV_CLOSED | warning | DS error: stream closed |
KV_CTL | warning | DS error: bad control request |
KV_DISKIO | critical | DS error: disk i/o error |
KV_EOF | warning | DS error: premature EOF |
KV_FMT | warning | DS error: bad format |
KV_FULL | critical | DS error: resource full |
KV_HALT | critical | DS error: system is halting |
KV_INTR | warning | DS error: interrupted |
KV_IO | critical | DS error: i/o error |
KV_JAVA | warning | DS error: java error |
KV_LOW | critical | DS error: low on resources |
KV_LTM | warning | DS error: LTM error |
KV_REC | warning | DS error: Rec error |
KV_LOG | warning | DS error: Log error |
KV_MAN | critical | DS error: please, read the manual |
KV_NOAUTH | warning | DS error: auth disabled |
KV_NOBLKS | critical | DS error: no more mem blocks |
KV_NOBLOB | warning(info) | DS error: no such blob. The QE might not be handling blobs correctly |
KV_NOIDX | warning(info) | DS error: no such index. The QE might not be handling indexes correctly |
KV_NOMETA | critical | DS error: no metadata |
KV_NOREG | warning | DS error: no such region |
KV_NOTOP | warning | DS error: no such tid operation |
KV_NOSERVER | critical | DS error: no server |
KV_NOTAVAIL | critical | DS error: not available |
KV_NOTID | warning | DS error: no such tid |
KV_NOTUPLE | warning | DS error: no such tuple |
KV_NOTUPLES | warning | DS error: no tuples |
KV_NOUSR | warning(info) | DS error: no such user |
KV_OUT | critical | DS error: out of resources |
KV_PERM | warning(info) | DS error: permission denied |
KV_PROTO | warning | DS error: protocol error |
KV_RDONLY | warning | DS error: read only |
KV_RECOVER | warning | DS error: system is recovering |
KV_RECOVERED | warning | DS error: system recovered |
KV_SERVER | critical | DS error: server |
KV_TOOLARGE | warning | DS error: too large for me |
KV_TOOMANY | warning | DS error: too many for me |
KV_TOUT | warning | DS error: timed out |
KV_REPL | warning | DS error: replica error |
9.2. Health Monitor events
Alert Identifier | Type | Description |
---|---|---|
COMPONENT_FAILURE | warning | Failure from the indicated service |
TIMEOUT_TOBEREGISTERED | warning | Waiting to register the indicated service |
CANNOT_REGISTER | critical | Cannot register the indicated service. Restart it |
9.3. Configuration events
Alert Identifier | Type | Description |
---|---|---|
COMPONENT_FAILURE | warning | The indicated service is down |
HAPROXY_NO_CONNECTION | warning | No connection with the HA proxy while stopping |
TIMEOUT_BUCKET_UNASSIGNED | critical | The bucket reconfiguration couldn’t be done |
TIMEOUT_TOBEREGISTERED | warning | The indicated service still has dependencies to resolve |
CANNOT_REGISTER | critical | |
CONSOLE_SERVER_DOWN | warning | Couldn’t start the console server |
RESTART_COMPONENT | warning | Couldn’t restart the indicated service |
SETTING_EPOCH | warning | Waiting until STS > RSts |
RECOVERY_FAILURE | warning | The indicated service has not been recovered |
RECOVERY_TIMEOUT | warning | The indicated service is being recovered |
HOTBACKUP | warning | Pending recovery from hotbackup |
9.4. Transaction Log events
Alert Identifier | Type | Description |
---|---|---|
LOGDIR_ERROR | critical | Cannot create the folder or it is not a folder |
LOGDIR_ALLOCATOR_ERROR | critical | Error managing disk in the logger |
LOGGER_FILE_ERROR | warning | I/O file error |
LOGNET_CONNECTION_FAILED | critical | Cannot dial the logger |
LOGSRV_CONNECTION_ERROR | critical | Network error |
KVDS_RECOVERY_FAILED | critical | The kvds instance couldn’t be recovered from the logger |
FLUSH_FAILED | critical | Unexpected exception while flushing to the logger |