Hi Wilmer,
I played some more with graph sync on light nodes that are connected to hub peers and wanted to share the results here.
I have noticed that channels are pruned based on expiry time and, when assumechannelvalid=true, also when the edge is disabled.
One thing I am not sure I understand is why we prune disabled edges instead of just waiting for them to expire. As far as I know, they won't be considered in path finding even if they are in the graph.
Observing the status of a well-connected hub, it turns out that it has ~1500 zombies. I also noticed that when the mobile client asks for these zombies in "query_short_channel_ids" during historical sync, the hub doesn't include them in the response even when asked (probably because they don't exist in its live graph view).
So according to this, the client can safely query for its zombie channels, assuming that only those that are not zombies on the hub will be included in the response.
On some mobile devices the zombie bucket grew to as many as 40,000 channels. Part of these were really closed channels, but at least 15,000 turned out to be regular live channels that were moved to the zombie bucket and never got out due to the short uptime of the light node.
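To illustrate the behaviour I observed, here is a minimal sketch in Python (not lnd code). The function name hub_reply and the integer channel IDs are made up for the example, and it assumes the hub simply skips any queried channel that is missing from its live graph:

```python
def hub_reply(hub_live_graph: set, queried_scids: list) -> list:
    """Return only the queried channels the hub still has in its live graph.

    Hypothetical model: channels that are zombies (or unknown) on the hub
    are silently left out of the response.
    """
    return [scid for scid in queried_scids if scid in hub_live_graph]


# The client asks for everything in its zombie bucket.
hub_live = {101, 102, 103}
client_zombies = [102, 103, 900, 901]

# Only channels that are still live on the hub come back.
print(hub_reply(hub_live, client_zombies))  # [102, 103]
```

In this model the channels that never come back (900 and 901 here) are the ones that are zombies on the hub as well.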
I was testing the following changes on top of master on such devices:
1. Don't prune disabled channels
2. Include the zombie channels in the query_short_channel_ids submitted from the client.
The results were:
1. The first sync was a bit long as it needed to fetch half of the graph.
2. After the first sync the ~15,000 channels moved from the zombie storage into the live graph, which made the graph a lot more up to date.
3. From that moment on there were only ~1500 channels that the client doesn't know about, and these are the ones included in the "query_short_channel_ids" of the historical sync. These channels are the real zombies on the hub.
4. The additional overhead for each subsequent historical sync was only the inclusion of these ~1500 channels in the "query_short_channel_ids", as the hub didn't respond with updates/announcements for them due to their zombiness.
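As a rough back-of-the-envelope estimate of that overhead (I did not measure bytes on the wire): a short channel ID is 8 bytes, so assuming the IDs are sent uncompressed and ignoring message headers and any chunking of the query, the extra payload can be sketched as:

```python
SCID_BYTES = 8  # a short channel ID is an 8-byte value


def query_payload_bytes(num_scids: int) -> int:
    """Bytes needed to list num_scids channel IDs, assuming uncompressed encoding."""
    return num_scids * SCID_BYTES


# ~1500 real zombies re-queried on every historical sync.
print(query_payload_bytes(1500))  # 12000
```

So under these assumptions it is on the order of 12 KB of extra query data per historical sync.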
So it looks like the lifecycle of a channel in the graph of a light node after these changes is:
- Start by entering the live graph view.
- Get pruned to the zombie bucket on expiration.
- Then either return to the live graph when a newer update arrives,
- or stay in the zombie bucket as a real closed channel.
Given that, we can safely delete the zombie channels on light nodes periodically, as a maintenance procedure that gets rid of the real closed channels.
I think these changes maintain a more consistent view of the graph for light nodes at a relatively small price.
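The lifecycle and the maintenance step can be sketched as a small state machine. This is only an illustration in Python, not how lnd stores the graph: the LightGraph class, its method names, and the expiry parameter are all hypothetical, and a channel is reduced to an integer ID with the timestamp of its last update.

```python
from dataclasses import dataclass, field


@dataclass
class LightGraph:
    # Both maps go from channel ID to the timestamp of the last known update.
    live: dict = field(default_factory=dict)
    zombies: dict = field(default_factory=dict)

    def on_update(self, scid: int, timestamp: int) -> None:
        """Apply a channel update; a newer update revives a zombie."""
        last = self.zombies.get(scid, self.live.get(scid, -1))
        if timestamp <= last:
            return  # not newer than what we already have, ignore it
        self.zombies.pop(scid, None)
        self.live[scid] = timestamp

    def prune_expired(self, now: int, expiry: int) -> None:
        """Move channels without an update in the last `expiry` units to zombies."""
        for scid, ts in list(self.live.items()):
            if now - ts > expiry:
                self.zombies[scid] = self.live.pop(scid)

    def purge_zombies(self) -> int:
        """Periodic maintenance: drop the whole zombie bucket, return how many were dropped."""
        dropped = len(self.zombies)
        self.zombies.clear()
        return dropped


g = LightGraph()
g.on_update(1, 100)
g.on_update(2, 100)
g.prune_expired(now=200, expiry=50)  # both channels expire and become zombies
g.on_update(1, 190)                  # a newer update brings channel 1 back
print(g.live, g.zombies)             # {1: 190} {2: 100}
print(g.purge_zombies())             # 1
```

In this sketch a purged channel that is in fact still alive would simply re-enter the live graph the next time an update for it is received.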
Roei