[PATCH v1 06/18] gossiper: use peers table to detect address change


Gleb Natapov <gleb@scylladb.com>
Mar 31, 2025, 4:35:36 AM
to scylladb-dev@googlegroups.com
This requires serializing the entire handle_state_normal with a lock, since
it now both reads and updates the peers table (before this change it only
updated it). This is not a big deal, since most of the function is already
serialized by the token metadata lock. We cannot use that lock to serialize
the peers writes as well, because the code that removes an endpoint from the
peers table also removes it from the gossiper, which causes the on_remove
notification to be called, and that may take the metadata lock too, causing
a deadlock.
---
service/storage_service.cc | 15 +++++----------
1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/service/storage_service.cc b/service/storage_service.cc
index db902881398..8496f026825 100644
--- a/service/storage_service.cc
+++ b/service/storage_service.cc
@@ -2295,6 +2295,9 @@ future<> storage_service::handle_state_bootstrap(inet_address endpoint, locator:
 }
 
 future<> storage_service::handle_state_normal(inet_address endpoint, locator::host_id host_id, gms::permit_id pid) {
+    static semaphore lock(1);
+    auto units = co_await get_units(lock, 1);
+
     slogger.debug("endpoint={}/{} handle_state_normal: permit_id={}", endpoint, host_id, pid);
 
     auto tokens = get_tokens_for(endpoint);
@@ -2325,20 +2328,12 @@ future<> storage_service::handle_state_normal(inet_address endpoint, locator::ho
     // Old node in replace-with-same-IP scenario.
     std::optional<locator::host_id> replaced_id;
 
-    auto ips = _gossiper.get_nodes_with_host_id(host_id);
+    auto id_to_ip_map = co_await get_host_id_to_ip_map();
 
     std::optional<inet_address> existing;
 
     if (tmptr->get_topology().find_node(host_id)) {
-        // If node is not in the topology there is no existing address
-        // If there are two addresses for the same id the "other" one is existing
-        // If there is only one it is existing
-        if (ips.size() == 2) {
-            if (ips.erase(endpoint) == 0) {
-                on_internal_error(slogger, fmt::format("Gossiper has two ips {} for host id {} but none of them is {}", ips, endpoint, host_id));
-            }
-        }
-        existing = *ips.begin();
+        existing = id_to_ip_map.contains(host_id) ? id_to_ip_map[host_id] : endpoint;
     }
 
     if (existing && *existing != endpoint) {
--
2.47.1

Avi Kivity <avi@scylladb.com>
Mar 31, 2025, 6:43:24 AM
to Gleb Natapov, scylladb-dev@googlegroups.com
On Mon, 2025-03-31 at 10:51 +0300, 'Gleb Natapov' via ScyllaDB development wrote:
> This requires serializing the entire handle_state_normal with a lock, since
> it now both reads and updates the peers table (before this change it only
> updated it). This is not a big deal, since most of the function is already
> serialized by the token metadata lock. We cannot use that lock to serialize
> the peers writes as well, because the code that removes an endpoint from the
> peers table also removes it from the gossiper, which causes the on_remove
> notification to be called, and that may take the metadata lock too, causing
> a deadlock.
> ---
>  service/storage_service.cc | 15 +++++----------
>  1 file changed, 5 insertions(+), 10 deletions(-)
> 
> diff --git a/service/storage_service.cc b/service/storage_service.cc
> index db902881398..8496f026825 100644
> --- a/service/storage_service.cc
> +++ b/service/storage_service.cc
> @@ -2295,6 +2295,9 @@ future<> storage_service::handle_state_bootstrap(inet_address endpoint, locator:
>  }
>  
>  future<> storage_service::handle_state_normal(inet_address endpoint, locator::host_id host_id, gms::permit_id pid) {
> +    static semaphore lock(1);


I guess this is meant only to be used from shard 0, but it needs a comment and an on_internal_error if not.
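
A minimal sketch of such a guard, assuming handle_state_normal() really is only invoked on shard 0 (this_shard_id() is Seastar's; on_internal_error with slogger is already used in this file):

    future<> storage_service::handle_state_normal(inet_address endpoint, locator::host_id host_id, gms::permit_id pid) {
        // The static semaphore below is per-shard state: running this on any
        // shard other than 0 would silently take a different lock.
        if (this_shard_id() != 0) {
            on_internal_error(slogger, "handle_state_normal must only run on shard 0");
        }
        static semaphore lock(1);
        auto units = co_await get_units(lock, 1);
        ...
    }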


Better to make it thread_local. I still have dreams of running multiple nodes in a process (each node a seastar::smp instance), and mutable statics make that impossible. Even better, put it in the storage_service class, near what it protects.
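
A sketch of the member-variable version (the name _handle_state_normal_lock is hypothetical):

    // storage_service.hh -- next to the state it protects:
    class storage_service ... {
        // Serializes handle_state_normal(), which now both reads and updates
        // the peers table; the token metadata lock cannot cover the writes
        // (see the commit message for the deadlock scenario).
        semaphore _handle_state_normal_lock{1};
    };

    // storage_service.cc, at the top of handle_state_normal():
    auto units = co_await get_units(_handle_state_normal_lock, 1);

Since the lock then lives in the storage_service instance rather than in a mutable static, it also stops being a problem for the multiple-nodes-per-process scenario.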

> +    auto units = co_await get_units(lock, 1);
> +
>      slogger.debug("endpoint={}/{} handle_state_normal: permit_id={}", endpoint, host_id, pid);
>  
>      auto tokens = get_tokens_for(endpoint);
> @@ -2325,20 +2328,12 @@ future<> storage_service::handle_state_normal(inet_address endpoint, locator::ho
>      // Old node in replace-with-same-IP scenario.
>      std::optional<locator::host_id> replaced_id;
>  
> -    auto ips = _gossiper.get_nodes_with_host_id(host_id);
> +    auto id_to_ip_map = co_await get_host_id_to_ip_map();
>  
>      std::optional<inet_address> existing;
>  
>      if (tmptr->get_topology().find_node(host_id)) {
> -        // If node is not in the topology there is no existing address
> -        // If there are two addresses for the same id the "other" one is existing
> -        // If there is only one it is existing
> -        if (ips.size() == 2) {
> -            if (ips.erase(endpoint) == 0) {
> -                on_internal_error(slogger, fmt::format("Gossiper has two ips {} for host id {} but none of them is {}", ips, endpoint, host_id));
> -            }
> -        }
> -        existing = *ips.begin();
> +        existing = id_to_ip_map.contains(host_id) ? id_to_ip_map[host_id] : endpoint;


Using find() is better since it looks up the key only once. unordered_map::operator[] is really bad since it emits code to create the entry, even though that code path will never be taken here.
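
That is, something along these lines (a sketch of the suggested variant):

    auto it = id_to_ip_map.find(host_id);
    existing = it != id_to_ip_map.end() ? it->second : endpoint;

Besides avoiding the phantom insertion path, this hashes the key once instead of twice (contains() plus operator[]).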

Gleb Natapov <gleb@scylladb.com>
Mar 31, 2025, 6:48:33 AM
to Avi Kivity, scylladb-dev@googlegroups.com
OK. Note that get_token_metadata_lock(), which is called a couple of lines
below, already asserts internally (not even via on_internal_error).

> Better to make it thread_local. I still have dreams of running multiple
> nodes in a process (each node is a seastar::smp instance) and mutable
> statics make it impossible. Even better put it in the storage_service
> class near what it protects.

Note that this is all legacy; the code is not called with raft topology.
OK. This is not performance-critical code, though.


> >      }
> >  
> >      if (existing && *existing != endpoint) {
> > --
> > 2.47.1
> >
>

--
Gleb.

Avi Kivity <avi@scylladb.com>
Mar 31, 2025, 7:01:57 AM
to Gleb Natapov, scylladb-dev@googlegroups.com
Of course, but it still hurts my eyes.
