[QUEUED scylladb next] storage_proxy: trace speculative retries

0 views
Skip to first unread message

Commit Bot

<bot@cloudius-systems.com>
unread,
Jun 27, 2024, 10:37:52 AMJun 27
to scylladb-dev@googlegroups.com, Avi Kivity
From: Avi Kivity <a...@scylladb.com>
Committer: Nadav Har'El <n...@scylladb.com>
Branch: next

storage_proxy: trace speculative retries

A speculative retry can appear out of the blue[1] and confuse people, as
it looks like the consistency level was elevated. Fix by adding such
a tracepoint.

Sample output:

```
activity | timestamp | source | source_elapsed | client
---------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------+-----------
Execute CQL3 query | 2024-06-27 14:25:58.947000 | 127.0.0.1 | 0 | 127.0.0.1
Parsing a statement [shard 0] | 2024-06-27 14:25:58.947918 | 127.0.0.1 | 2 | 127.0.0.1
Processing a statement for authenticated user: anonymous [shard 0] | 2024-06-27 14:25:58.948025 | 127.0.0.1 | 108 | 127.0.0.1
Creating read executor for token -4069959284402364209 with all: [127.0.0.1, 127.0.0.2] targets: [127.0.0.2] repair decision: NONE [shard 0] | 2024-06-27 14:25:58.948125 | 127.0.0.1 | 209 | 127.0.0.1
Added extra target 127.0.0.1 for speculative read [shard 0] | 2024-06-27 14:25:58.948128 | 127.0.0.1 | 212 | 127.0.0.1
Creating speculating_read_executor [shard 0] | 2024-06-27 14:25:58.948129 | 127.0.0.1 | 213 | 127.0.0.1
read_data: sending a message to /127.0.0.2 [shard 0] | 2024-06-27 14:25:58.948138 | 127.0.0.1 | 222 | 127.0.0.1
Launching speculative retry for data [shard 0] | 2024-06-27 14:25:58.948234 | 127.0.0.1 | 318 | 127.0.0.1
read_data: querying locally [shard 0] | 2024-06-27 14:25:58.948235 | 127.0.0.1 | 319 | 127.0.0.1
Start querying singular range {{-4069959284402364209, pk{000400000001}}} [shard 0] | 2024-06-27 14:25:58.948246 | 127.0.0.1 | 330 | 127.0.0.1
[reader concurrency semaphore user] admitted immediately [shard 0] | 2024-06-27 14:25:58.948250 | 127.0.0.1 | 334 | 127.0.0.1
[reader concurrency semaphore user] executing read [shard 0] | 2024-06-27 14:25:58.948258 | 127.0.0.1 | 342 | 127.0.0.1
Querying cache for range {{-4069959284402364209, pk{000400000001}}} and slice [(-inf, +inf)] [shard 0] | 2024-06-27 14:25:58.948281 | 127.0.0.1 | 365 | 127.0.0.1
Page stats: 1 partition(s), 0 static row(s) (0 live, 0 dead), 1 clustering row(s) (1 live, 0 dead) and 0 range tombstone(s) [shard 0] | 2024-06-27 14:25:58.948311 | 127.0.0.1 | 395 | 127.0.0.1
Querying is done [shard 0] | 2024-06-27 14:25:58.948320 | 127.0.0.1 | 404 | 127.0.0.1
read_data: message received from /127.0.0.1 [shard 0] | 2024-06-27 14:25:58.948351 | 127.0.0.2 | 12 | 127.0.0.1
Done processing - preparing a result [shard 0] | 2024-06-27 14:25:58.948354 | 127.0.0.1 | 438 | 127.0.0.1
Start querying singular range {{-4069959284402364209, pk{000400000001}}} [shard 0] | 2024-06-27 14:25:58.948370 | 127.0.0.2 | 31 | 127.0.0.1
[reader concurrency semaphore user] admitted immediately [shard 0] | 2024-06-27 14:25:58.948374 | 127.0.0.2 | 35 | 127.0.0.1
[reader concurrency semaphore user] executing read [shard 0] | 2024-06-27 14:25:58.948388 | 127.0.0.2 | 49 | 127.0.0.1
Querying cache for range {{-4069959284402364209, pk{000400000001}}} and slice [(-inf, +inf)] [shard 0] | 2024-06-27 14:25:58.948405 | 127.0.0.2 | 66 | 127.0.0.1
Page stats: 1 partition(s), 0 static row(s) (0 live, 0 dead), 1 clustering row(s) (1 live, 0 dead) and 0 range tombstone(s) [shard 0] | 2024-06-27 14:25:58.948424 | 127.0.0.2 | 85 | 127.0.0.1
Querying is done [shard 0] | 2024-06-27 14:25:58.948430 | 127.0.0.2 | 91 | 127.0.0.1
read_data handling is done, sending a response to /127.0.0.1 [shard 0] | 2024-06-27 14:25:58.948436 | 127.0.0.2 | 97 | 127.0.0.1
read_data: got response from /127.0.0.2 [shard 0] | 2024-06-27 14:25:58.949140 | 127.0.0.1 | 1224 | 127.0.0.1
Request complete | 2024-06-27 14:25:58.947449 | 127.0.0.1 | 449 | 127.0.0.1
```

Ref #18988

[1] not completely out of the blue, ff29f430 indicates that a speculative read
*can* happen.

Closes scylladb/scylladb#19520

---
diff --git a/service/storage_proxy.cc b/service/storage_proxy.cc
--- a/service/storage_proxy.cc
+++ b/service/storage_proxy.cc
@@ -5385,9 +5385,11 @@ class speculating_read_executor : public abstract_read_executor {
auto send_request = [&] (bool has_data) {
if (has_data) {
_proxy->get_stats().speculative_digest_reads++;
+ tracing::trace(_trace_state, "Launching speculative retry for digest");
make_digest_requests(resolver, _targets.end() - 1, _targets.end(), timeout);
} else {
_proxy->get_stats().speculative_data_reads++;
+ tracing::trace(_trace_state, "Launching speculative retry for data");
make_data_requests(resolver, _targets.end() - 1, _targets.end(), timeout, true);
}
};

Commit Bot

<bot@cloudius-systems.com>
unread,
Jun 27, 2024, 2:32:23 PMJun 27
to scylladb-dev@googlegroups.com, Avi Kivity
From: Avi Kivity <a...@scylladb.com>
Committer: Nadav Har'El <n...@scylladb.com>
Branch: master
Reply all
Reply to author
Forward
0 new messages