Hi All,
Although a lot of hard work has been applied to this problem, I
don't believe the current approach to thinblocks is scalable in the
long run, even if it could work. The basic reason is that the number
of round trips needed to get all the transaction data is too high to
achieve any kind of useful performance. Furthermore, when tx rates are
in the hundreds or thousands of transactions per second, mempools will
be much further out of sync, and I don't believe you could get
everything you needed in any reasonable amount of time, if at all,
under those conditions. On top of that, the merkleblock is
problematic. While compatibility is a valuable goal, I think part of
the problem here is trying to make this work with clients that are
already out there, when what is really needed is a break from the
past: a new protocol version and new classes of transactions.
Reference implementation:
https://github.com/ptschip/bitcoinxt/tree/thinblocks
* still using Mike Hearn's block re-assembly code, along with a few
other snippets.
* incorporates dagurval's suggestion to add the coinbase tx to every
thinblock.
(If you want to test, you'll need two nodes running, with one
connected to the other, as the implementation uses its own protocol
version.)
The reference implementation works in the following way. (Only one
round trip is required).
Node A is behind by one block and makes a thinblock request to Node B in
the following way:
1) Node A creates a bloom filter seeded with the contents of its memory
pool.
2) Node A sends the bloom filter along with a getdata to Node B.
3) Node B sends back a "thinblock" message (class CThinBlock) which
contains the block header information, all the transaction hashes
in the block, and any transactions that do not match the bloom filter
that Node A has sent (a sketch of this structure follows the steps).
4) Node A receives the "thinblock" and reconstructs the block.
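To make step 3 concrete, here is a rough sketch of how the thinblock
response might be assembled on Node B, using Bitcoin Core's
CBloomFilter, CBlock and CTransaction types. The field names and
constructor below are my own illustration, not the exact layout of the
reference implementation's CThinBlock:

    #include "bloom.h"
    #include "primitives/block.h"
    #include <vector>

    // Illustrative sketch only; the actual CThinBlock in the
    // reference implementation may differ.
    class CThinBlock
    {
    public:
        CBlockHeader header;                  // block header information
        std::vector<uint256> vTxHashes;       // every tx hash in the block, in order
        std::vector<CTransaction> vMissingTx; // txs that did not match Node A's filter

        CThinBlock(const CBlock& block, const CBloomFilter& filter)
        {
            header = block.GetBlockHeader();
            for (const CTransaction& tx : block.vtx) {
                const uint256 hash = tx.GetHash();
                vTxHashes.push_back(hash);
                // Ship the full transaction only if Node A's mempool
                // filter indicates it probably doesn't have it.
                if (!filter.contains(hash))
                    vMissingTx.push_back(tx);
            }
        }
    };

Node A then rebuilds the block by walking vTxHashes, pulling matching
transactions from its own mempool and filling the remaining slots from
vMissingTx.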
Although I have only tested this for a few days, I have yet to see it
fail or to see a re-request for transactions; however, if the need
should arise for re-requesting transactions, the following will occur.
5) If there are still any transactions missing, then a "CThinBlockTx"
message is sent from Node A. This contains a map of the
missing transaction hashes seeded with null txs.
6) Node B, upon receiving the CThinBlockTx request, takes the object
and fills in the transaction blanks, getting the transactions from the
block on disk rather than memory (in this way we can be sure the txs
are there, as they may already have been purged from memory). Once the
blanks are filled in, the object is sent back to Node A and the block
is reconstructed (see the sketch below). As I've said, I've tested that
this works but have yet to see it needed, and again, just one round
trip is required to get all re-requested transactions.
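Here is a minimal sketch of that fill-in step on Node B, assuming the
ReadBlockFromDisk helper from main.h (the 0.12-era signature) and a
simple hash-to-tx map; the class name matches the reference
implementation, but the fields and method are hypothetical:

    #include "main.h"
    #include "primitives/block.h"
    #include <map>

    // Illustrative sketch only; fields and method names are hypothetical.
    class CThinBlockTx
    {
    public:
        uint256 blockhash;                     // which block the txs belong to
        std::map<uint256, CTransaction> mapTx; // missing hashes -> null txs

        // Node B: fill in the blanks from the block on disk, since the
        // transactions may already have been purged from its mempool.
        bool FillFromDisk(const CBlockIndex* pindex, const Consensus::Params& params)
        {
            CBlock block;
            if (!ReadBlockFromDisk(block, pindex, params))
                return false;
            for (const CTransaction& tx : block.vtx) {
                std::map<uint256, CTransaction>::iterator it = mapTx.find(tx.GetHash());
                if (it != mapTx.end())
                    it->second = tx; // replace the null tx with the real one
            }
            return true;
        }
    };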
In the long run, IMO this is a much more scalable solution, as we only
need one round trip and we avoid the problematic use of a merkleblock.
Also, generating the bloom filter is fast and the structure is not
overly large: a casual observation showed a 50MB memory pool yielding a
bloom filter of roughly 10KB (some tuning may be possible to get that
lower; a sketch of the filter construction follows). Thinblocks coming
back are typically 20 to 50KB in size and are missing 1 to 10
transactions.
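For reference, a minimal sketch of the filter construction in step 1,
assuming CTxMemPool::queryHashes and the standard CBloomFilter
constructor from bloom.h; the false-positive rate and tweak below are
illustrative, not tuned values:

    #include "bloom.h"
    #include "txmempool.h"
    #include <algorithm>
    #include <vector>

    // Sketch only: seed a bloom filter with every tx hash in the mempool.
    CBloomFilter CreateMempoolFilter(CTxMemPool& pool)
    {
        std::vector<uint256> vHashes;
        pool.queryHashes(vHashes); // all tx hashes currently in the mempool

        // Size the filter to the mempool; a lower FP rate means fewer
        // unnecessary full txs in the response, at the cost of a
        // bigger filter.
        CBloomFilter filter(std::max((size_t)1, vHashes.size()), 0.0001,
                            0 /*nTweak*/, BLOOM_UPDATE_ALL);
        for (const uint256& hash : vHashes)
            filter.insert(hash);
        return filter;
    }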
If you find this interesting and worthwhile I'll be happy to submit a PR.
And here are some typical numbers after the mempool has warmed up...
2016-01-01 09:33:47 Sending thinblock - thinblock size: 43020 vs block size: 605766 => tx hashes: 1337 transactions: 1
2016-01-01 09:55:34 Sending thinblock - thinblock size: 50044 vs block size: 763098 => tx hashes: 1539 transactions: 2
2016-01-01 09:57:30 Sending thinblock - thinblock size: 14623 vs block size: 180801 => tx hashes: 355 transactions: 2
2016-01-01 10:11:52 Sending thinblock - thinblock size: 32877 vs block size: 484331 => tx hashes: 1018 transactions: 1
2016-01-01 10:18:02 Sending thinblock - thinblock size: 22159 vs block size: 285710 => tx hashes: 505 transactions: 5
2016-01-01 10:41:55 Sending thinblock - thinblock size: 64589 vs block size: 952085 => tx hashes: 2011 transactions: 1
2016-01-01 10:45:06 Sending thinblock - thinblock size: 14005 vs block size: 192031 => tx hashes: 349 transactions: 4
2016-01-01 11:11:51 Sending thinblock - thinblock size: 68531 vs block size: 932644 => tx hashes: 2134 transactions: 1
2016-01-01 11:17:44 Sending thinblock - thinblock size: 24239 vs block size: 542876 => tx hashes: 719 transactions: 2
2016-01-01 11:21:50 Sending thinblock - thinblock size: 44481 vs block size: 178405 => tx hashes: 355 transactions: 21
2016-01-01 11:27:48 Sending thinblock - thinblock size: 18532 vs block size: 252596 => tx hashes: 460 transactions: 5
2016-01-01 11:31:52 Sending thinblock - thinblock size: 21316 vs block size: 260582 => tx hashes: 458 transactions: 6
2016-01-01 11:51:50 Sending thinblock - thinblock size: 47692 vs block size: 792155 => tx hashes: 1483 transactions: 1
2016-01-01 12:03:24 Sending thinblock - thinblock size: 49549 vs block size: 385895 => tx hashes: 890 transactions: 11
2016-01-01 12:12:29 Sending thinblock - thinblock size: 28845 vs block size: 668592 => tx hashes: 892 transactions: 1
2016-01-01 12:14:41 Sending thinblock - thinblock size: 28585 vs block size: 142093 => tx hashes: 232 transactions: 10
2016-01-01 12:42:55 Sending thinblock - thinblock size: 67693 vs block size: 999851 => tx hashes: 2106 transactions: 1
2016-01-01 12:46:31 Sending thinblock - thinblock size: 16815 vs block size: 381712 => tx hashes: 518 transactions: 1
2016-01-01 12:48:21 Sending thinblock - thinblock size: 71075 vs block size: 249546 => tx hashes: 350 transactions: 6
2016-01-01 12:49:00 Sending thinblock - thinblock size: 49443 vs block size: 248150 => tx hashes: 31 transactions: 6
2016-01-01 12:49:55 Sending thinblock - thinblock size: 3211 vs block size: 239966 => tx hashes: 91 transactions: 1