[RELEASE] Scylla Monitoring Stack 3.5.1

Amnon Heiman

<amnon@scylladb.com>

Nov 15, 2020, 8:56:10 AM

to scylladb-dev, ScyllaDB users

The Scylla team announces the release of Scylla Monitoring Stack 3.5.1

Scylla Monitoring Stack is an open-source stack for monitoring Scylla Enterprise and Scylla Open Source, based on Prometheus and Grafana. Scylla Monitoring Stack 3.5.1 supports:

Scylla Open Source versions 3.3, 4.0, 4.1, and 4.2
Scylla Enterprise versions 2019.x and 2020.x
Scylla Manager 2.1.x and 2.2.x

Bug Fixes

sstable reads units are wrong #1125
Links on the "Nodes" table on "Overview" don't go to the correct place #1122
Hack that added support for the old and new ports doesn't work with the Agent graphs on the Manager dashboard #1120
The -N flag is ignored in start-all.sh #1116
Remove the avg line from the multi-graph panels in the Overview dashboard #1115
The advanced dashboard has a dot in the uid #1112
Average read latency is miscalculated #1110
Manager showing "offline" despite metrics being present #1109
branch-3.5 grafana failure #1104
Alternator: "Average UpdateItem latency by Instance", show data in seconds and not milliseconds #1101
Alternator dashboard doesn't have GetItem Latencies #1100

Known issues

Following the Scylla Manager 2.2 port change, Prometheus scrapes both the old port and the new one to help during the migration.
This was found to cause issues when the port in scylla_manager_server.yml is changed to the new 5090 port.

After upgrading Scylla Manager to version 2.2, we suggest editing prometheus/prometheus.yml.template and removing the scylla_manager1 job from it.

Amnon Heiman

<amnon@scylladb.com>

Nov 24, 2020, 2:53:28 AM

to scylladb-dev, ScyllaDB users

The Scylla team announces the release of Scylla Monitoring Stack 3.5.2

Scylla Monitoring Stack is an open-source stack for monitoring Scylla Enterprise and Scylla Open Source, based on Prometheus and Grafana. Scylla Monitoring Stack 3.5.2 supports:

Scylla Open Source versions 3.3, 4.0, 4.1, and 4.2
Scylla Enterprise versions 2019.x and 2020.x
Scylla Manager 2.1.x and 2.2.x

Related Links

Download Scylla Monitoring 3.5.2

Bug Fixes

Alternator dashboard node-table should use the names of the new dashboards #1134
start-grafana.sh looks for the docker IP which breaks on Podman #1145
prometheus.yml.template: remove the second manager job #1150

Notice to users who update Scylla-Manager to version 2.2

Following the Scylla Manager 2.2 port change, you will need to update scylla_manager_server.yml with the new port.

Amnon Heiman

<amnon@scylladb.com>

Jan 18, 2021, 4:20:48 AM

to scylladb-dev, ScyllaDB users

The Scylla team is pleased to announce the release of Scylla Monitoring Stack 3.6.

Scylla Monitoring Stack is an open-source stack for monitoring Scylla Enterprise and Scylla Open Source, based on Prometheus and Grafana. Scylla Monitoring Stack 3.6 supports:

Scylla Open Source versions 4.1, 4.2 and 4.3
Scylla Enterprise versions 2019.x and 2020.x
Scylla Manager 2.1.x and 2.2.x

Related Links

Download Scylla Monitoring 3.6

New in Scylla Monitoring Stack 3.6

Adding the Advisor section #1162

The Advisor is a new concept in Scylla Monitoring. It identifies potential problems and notifies you about them. The Advisor section in the Overview dashboard has two parts. The first lists the issues detected, such as unprepared statements. The second indicates how balanced the system is. When the cluster works properly, all nodes and shards should behave the same, so an outlier shard can point to a problem. For example, if the number of CQL connections varies between shards, it indicates a driver configuration issue.
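To make the balance check concrete, here is a minimal sketch of flagging an outlier shard from per-shard CQL connection counts. It is only an illustration of the idea, not the Advisor's actual logic: the function name, the median-based rule, and the 50% tolerance are all assumptions made for this example.

```python
from statistics import median


def find_outlier_shards(connections_per_shard, tolerance=0.5):
    """Return the shard ids whose value differs from the median by more
    than `tolerance`, expressed as a fraction of the median.

    Hypothetical helper: the rule and the default tolerance are
    assumptions for illustration, not what the Advisor implements.
    """
    values = list(connections_per_shard.values())
    mid = median(values)
    if mid == 0:
        # With a zero median, any shard that has connections stands out.
        return sorted(s for s, v in connections_per_shard.items() if v != 0)
    return sorted(
        shard
        for shard, value in connections_per_shard.items()
        if abs(value - mid) / mid > tolerance
    )


# Shard 3 holds far more connections than its peers.
cql_connections = {0: 12, 1: 11, 2: 13, 3: 48}
print(find_outlier_shards(cql_connections))
```

With these sample numbers only shard 3 is reported, which is the kind of imbalance that would suggest checking the driver configuration.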

Use Loki as data source #1147

Grafana Loki is a log aggregation system inspired by Prometheus. The monitoring stack will use Loki for alert and metrics generation. Note that it does not act as a centralized monitoring system. In Scylla Monitoring, Loki gets the traces using rsyslog. Make sure to configure the rsyslog client on the Scylla servers.

Add Scylla Open Source 4.3 dashboards #1144

New look to the node table #1097

The node tables are part of the Datacenter section in the Overview dashboard. The table is now more organized and more informative.

This is what it looks like when a node joins the cluster.

Collapsible rows #973

Collapsible rows are now used in various places on the dashboard. You can open them for additional information.

New Lightweight Transactions (LWT) metrics for the dashboard #936

LWT involves multiple Paxos messages. New panels in the LWT section now show the number of Paxos messages, which gives insight into the actual traffic involved in LWT operations.

Easy way to capture the entire dashboard, in one click #248

At the bottom of each dashboard, there are now two buttons: one to report an issue on the page, and another to take a snapshot of the dashboard as a downloadable image file.

Support dynamic intervals #957

Many graphs on the dashboards use a rate interval: some activity measured over a period of time. There has been a long discussion in the Grafana community as to which interval to use for a given timescale.

In general, when looking at graphs of different time ranges (e.g., last hour vs. last week), the rate interval should make sense for the range shown.

Grafana 7.2 came with a dynamic interval to solve this issue. You can read more about it here.
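As a rough illustration, the sketch below follows the formula Grafana documents for its `$__rate_interval` variable: the larger of the panel interval plus the scrape interval, and four times the scrape interval. The function name and the 20-second scrape interval are assumptions for this example, not values taken from this release.

```python
def rate_interval(panel_interval_s, scrape_interval_s):
    """Rate window in seconds, following Grafana's documented rule for
    $__rate_interval: max(interval + scrape, 4 * scrape).

    `panel_interval_s` stands for the step Grafana picks for the panel
    from the selected time range and the panel width.
    """
    return max(panel_interval_s + scrape_interval_s, 4 * scrape_interval_s)


SCRAPE_INTERVAL_S = 20  # assumed scrape interval, for illustration only

# Short time range: the window never drops below four scrapes.
print(rate_interval(15, SCRAPE_INTERVAL_S))   # 80
# Long time range: the window grows with the panel step.
print(rate_interval(600, SCRAPE_INTERVAL_S))  # 620
```

The lower bound keeps enough samples in the window for a rate to be computed on short ranges, while the first term lets the window widen as the time range grows.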

Grafana: Use UTC by default #1065

Time shown in graphs is now displayed in UTC instead of the browser's local time.

Upgrade to Grafana 7.3.5 #1061

Operational Changes

Configure rsyslog on the Scylla hosts. Scylla Monitoring uses Loki to generate metrics and alerts from logs. Loki gets the traces from rsyslog. For the full functionality to work, you need an rsyslog agent running on each of the Scylla machines, and you need to add Scylla Monitoring as an rsyslog target.
Use docker-compose as an optional replacement for start-all.sh #273
A command line option to add Prometheus targets #1197

Bug Fixes

Passing --no-loki got illegal option: --error #1152

Amnon Heiman

<amnon@scylladb.com>

Feb 8, 2021, 1:39:42 PM

to scylladb-dev, ScyllaDB users

The Scylla team announces the release of Scylla Monitoring Stack 3.6.1

Scylla Monitoring Stack is an open-source stack for monitoring Scylla Enterprise and Scylla Open Source, based on Prometheus and Grafana. Scylla Monitoring Stack 3.6.1 supports:

Scylla Open Source versions 4.1, 4.2, 4.3 and 4.4
Scylla Enterprise versions 2019.x and 2020.x
Scylla Manager 2.1.x and 2.2.x

Related Links

Download Scylla Monitoring 3.6.1

Bug Fixes

Write latency and write count should not include hints/streaming scheduling group #1265
Update all Advisor / CQL dashboard queries to take into account only user-generated queries and not internal ones #1263

These bug fixes are relevant for Scylla Open Source 4.2, 4.3, and 4.4 users and for Scylla Enterprise 2020.1 users.

Amnon Heiman

<amnon@scylladb.com>

Mar 15, 2021, 4:40:06 AM

to scylladb-dev, ScyllaDB users

The Scylla team announces the release of Scylla Monitoring Stack 3.6.2

Scylla Monitoring Stack is an open-source stack for monitoring Scylla Enterprise and Scylla Open Source, based on Prometheus and Grafana. Scylla Monitoring Stack 3.6.2 supports:

Scylla Open Source versions 4.1, 4.2, 4.3 and 4.4
Scylla Enterprise versions 2019.x and 2020.x
Scylla Manager 2.1.x and 2.2.x

Related Links

Download Scylla Monitoring 3.6.2

Bug Fixes

Timeouts and latencies per shards panels are missing #1294
Non-token-aware queries for counters: a workaround for #804

Amnon Heiman

<amnon@scylladb.com>

Mar 22, 2021, 4:42:56 PM

to scylladb-dev, ScyllaDB users

The Scylla team announces the release of Scylla Monitoring Stack 3.6.3

Scylla Monitoring Stack is an open-source stack for monitoring Scylla Enterprise and Scylla Open Source, based on Prometheus and Grafana. Scylla Monitoring Stack 3.6.3 supports:

Scylla Open Source versions 4.1, 4.2, 4.3 and 4.4
Scylla Enterprise versions 2019.x and 2020.x
Scylla Manager 2.1.x, 2.2.x and 2.3.x

Related Links

Download Scylla Monitoring 3.6.3

New Dashboards

Scylla Manager 2.3.x

Bug Fixes

loki container breaks -A option in start-all.sh #1326

Amnon Heiman

<amnon@scylladb.com>

Apr 26, 2021, 4:57:40 AM

to scylladb-dev, ScyllaDB users

The Scylla team is pleased to announce the release of Scylla Monitoring Stack 3.7.

Scylla Monitoring Stack is an open-source stack for monitoring Scylla Enterprise and Scylla Open Source, based on Prometheus and Grafana. Scylla Monitoring Stack 3.7 supports:

Scylla Open Source versions 4.2, 4.3 and 4.4
Scylla Enterprise versions 2019.x, 2020.x and 2021.x
Scylla Manager 2.2.x and 2.3.x

Related Links

Download Scylla Monitoring 3.7

Version updates in Scylla Monitoring Stack 3.7

Set Prometheus version to 2.25.2 #1333
Update the Alertmanager plugin to 1.0 #1288
Switch the Alertmanager to the new table panels #1071

New in Scylla Advisor

New Advisor feature: more detailed advice. (Learn more about Scylla Advisor here.)

New in Scylla Monitoring Stack 3.7

Overview dashboard enhancements:

Add manager task progress indication to the overview dashboard #1250

The Manager progress is now part of the header rows. For example, this is what a backup looks like:

Hinted handoffs being accumulated/being sent - annotation #1258

When a node is temporarily down, the updates that would have been sent to it are stored as hints. When the node is up again, those hints are sent to it. Both storing and sending hints translate to extra load on the other nodes. There are optional annotations that mark when hints are being stored and when they are being sent.
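The hint mechanism described above can be pictured with a toy model. This is a simplified sketch for illustration only, not Scylla's implementation: the `Coordinator` class and its methods are hypothetical, and every node is treated as a replica for every key.

```python
from collections import defaultdict


class Coordinator:
    """Toy write coordinator that stores hints for nodes that are down.

    Hypothetical model for illustration; it ignores replication factor,
    consistency levels, and hint expiry.
    """

    def __init__(self, nodes):
        self.data = {node: {} for node in nodes}   # per-node key/value store
        self.up = {node: True for node in nodes}   # liveness flags
        self.hints = defaultdict(list)             # node -> pending writes

    def node_down(self, node):
        self.up[node] = False

    def write(self, key, value):
        for node in self.data:
            if self.up[node]:
                self.data[node][key] = value
            else:
                # Store the update as a hint instead of dropping it.
                self.hints[node].append((key, value))

    def node_up(self, node):
        """Mark the node as up and replay its hints; return how many were sent."""
        self.up[node] = True
        replayed = self.hints.pop(node, [])
        for key, value in replayed:
            self.data[node][key] = value
        return len(replayed)


cluster = Coordinator(["node1", "node2", "node3"])
cluster.node_down("node3")
cluster.write("k1", "v1")
cluster.write("k2", "v2")
print(cluster.node_up("node3"))  # number of hints replayed to node3
```

In this model the hints accumulate while node3 is down and are replayed in one burst when it returns, which corresponds to the two phases the annotations highlight.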

Secondary Indexes/Materialized Views background-built - annotation #1257

When adding a Secondary Index or a Materialized View to an existing table, the new index or view is built in the background, which adds extra load on the nodes. You can use the MV annotation to see when a Materialized View or Secondary Index is being built.

Provide an indication of coordinator / replica errors per node #1229

Visually present error/no error on the node table #1035

The Node Table, found in the DC section of the Overview dashboard, can indicate when there are CQL optimization warnings and when there are errors on the node.

CQL Dashboard

Add CQL errors to dashboards #1276

Scylla 4.4 comes with additional CQL errors. The new CQL Errors panel, found on the CQL dashboard, shows those errors. Note that, for clarity, only active errors are shown.

Add more info to client table (Scylla Open Source 4.4) #1259

Scylla Open Source 4.4 adds additional information to the client table found on the CQL dashboard.

Update Scylla Manager Dashboard #1180

The manager dashboard got a facelift. It now shows the last success and last failure of Backup and Repair tasks.

Panel for Scylla HWLB #907

Heat Weighted Load Balancing (HWLB) is an optimization mechanism that distributes queries according to the probability a requested value will be in the cache.
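As a loose illustration of the idea, the sketch below picks a replica with a probability proportional to its cache hit rate. It is a deliberate simplification and not Scylla's actual HWLB algorithm: the function names, the proportional weighting, and the 5% floor are assumptions made for this example.

```python
import random


def replica_weights(hit_rates, floor=0.05):
    """Turn per-replica cache hit rates into selection probabilities.

    The floor (an assumption for this sketch) keeps a cold replica
    receiving a small share of reads so its cache can warm up.
    """
    adjusted = {replica: max(rate, floor) for replica, rate in hit_rates.items()}
    total = sum(adjusted.values())
    return {replica: value / total for replica, value in adjusted.items()}


def pick_replica(hit_rates, rng=random):
    """Choose one replica at random, weighted by its selection probability."""
    weights = replica_weights(hit_rates)
    replicas = list(weights)
    return rng.choices(replicas, weights=[weights[r] for r in replicas], k=1)[0]


# node3 has just restarted and has a cold cache.
hit_rates = {"node1": 0.9, "node2": 0.9, "node3": 0.0}
print(replica_weights(hit_rates))
print(pick_replica(hit_rates))
```

With these sample hit rates, node3 still receives a small fraction of reads while node1 and node2 serve most of them.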