Some things you need to remember about how this estimate works:
1. size_estimates is node-local. If, for example, you have N=10 nodes and RF=3, each node holds around RF/N = 3/10 = 30% of the partitions, so the number you get by querying one node will be about 30% of the total partition count.
2. If the data is not fully repaired, the count you get from one node may include fewer or more partitions than the count from some other node. Hopefully the error here is relatively small.
3. Even on one node and one shard, the data is split into multiple separate sstables. Scylla can't just add up the number of partitions in the different sstables, because those may or may not overlap (contain the same partitions). So Scylla can use a "cardinality estimator" which, to make a long story short, samples the different sstables in a way that lets us easily figure out how much they overlap, and uses that to improve the estimate.
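To make point 1 concrete, here is a minimal sketch of scaling a node-local count back up to a cluster-wide estimate. The function name and its parameters are hypothetical (they are not part of Scylla), and it assumes data is spread evenly across nodes and ignores the repair and overlap errors from points 2 and 3:

```python
def estimate_total_partitions(node_local_count: int,
                              num_nodes: int,
                              replication_factor: int) -> float:
    """Scale a count read from one node up to a cluster-wide estimate.

    Hypothetical helper. Assumes even data distribution, so each node
    holds about RF/N of all partitions.
    """
    if num_nodes <= 0 or replication_factor <= 0:
        raise ValueError("num_nodes and replication_factor must be positive")
    # A partition cannot have more replicas than there are nodes.
    rf = min(replication_factor, num_nodes)
    # node_local_count ~= total * rf / num_nodes, so invert that.
    return node_local_count * num_nodes / rf


# One node out of N=10 with RF=3 reports 300,000 partitions.
total = estimate_total_partitions(300_000, num_nodes=10, replication_factor=3)
print(total)
```

With those numbers the node holds about 30% of the data, so the estimate comes out to roughly a million partitions in total.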
After writing the above, I went to the actual code, db/size_estimates_virtual_reader.cc, and it *seems* that the third part is missing in the code! It seems we just sum the number of keys in the separate sstables. Maybe someone else remembers why? This naive "sum" thing will be fairly accurate (an error of up to 10%) in LCS, but can be wildly inaccurate in STCS, depending on your disk space amplification (how many copies of the same partition appear in different sstables). I'll open an issue about this.
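To illustrate why the naive sum overcounts, here is a toy sketch that models each sstable as a set of partition keys. This is not Scylla's code: the names are made up, and an exact set union stands in for the cardinality estimator only to show the size of the gap (a real estimator would be approximate and would not keep all keys in memory):

```python
def naive_sum(sstables: list[set[int]]) -> int:
    """Add up per-sstable key counts, counting overlapping keys repeatedly."""
    return sum(len(keys) for keys in sstables)


def distinct_count(sstables: list[set[int]]) -> int:
    """Exact number of distinct partition keys across all sstables."""
    return len(set().union(*sstables))


# Toy STCS-like layout: three sstables that overlap heavily.
sstables = [
    set(range(0, 1000)),     # keys 0..999
    set(range(500, 1500)),   # keys 500..1499, half of them rewrites
    set(range(0, 1500)),     # a larger sstable covering every key
]

print(naive_sum(sstables))       # 1000 + 1000 + 1500
print(distinct_count(sstables))  # only keys 0..1499 exist
```

In this made-up layout the sum reports 3500 keys while only 1500 distinct partitions exist, so the naive estimate is off by more than 2x.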