A somewhat contrived but helpful example: suppose you have a very large text file. You store it in Perkeep. Then you find a typo near the beginning, fix it (say by inserting a missing letter, which shifts every byte after it), and store the new version. Although the files are almost identical, with a Levenshtein distance of 1, a fixed-width chunking approach will encode them into completely disjoint sets of chunks.
By using subregions of the contents themselves to determine the boundary positions, identical stretches of data will be split at the same boundaries even when they sit at different offsets in the two files.
In this example, the first chunk will differ between the two files, but (with high probability) all subsequent chunks will be identical, allowing the data to be deduplicated. The tradeoff is the added complexity of variable-sized chunks and of computing hashes over a sliding window. A further nice property is that the result is convergent, independent of the order in which the data was inserted (unlike, say, Git, which depends on commit ordering when deduplicating similar-but-not-identical files while constructing pack files). There are many variants of this technique; I believe the idea originated as Rabin fingerprinting, and that rsync was the first prominent application aimed specifically at a distributed setting (don't quote me on that though, I did not verify).
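To make the idea concrete, here is a toy content-defined chunker. This is a sketch of the technique, not Perkeep's actual algorithm (which uses a rolling checksum for speed); the window size, mask, and hash choice are all made up for the demo, and hashing a fresh window at every position is O(n·w) rather than O(n).

```python
import hashlib
import random

WINDOW = 16            # bytes of context that decide a boundary (toy value)
MASK = (1 << 5) - 1    # ~1 in 32 positions is a boundary => ~32-byte chunks

def boundary(window: bytes) -> bool:
    # A position is a boundary when the hash of the preceding WINDOW bytes
    # has its low bits all zero. The decision depends only on local content.
    h = int.from_bytes(hashlib.sha1(window).digest()[:4], "big")
    return h & MASK == 0

def chunk(data: bytes) -> list:
    chunks, start = [], 0
    for i in range(WINDOW, len(data)):
        if boundary(data[i - WINDOW:i]):
            chunks.append(data[start:i])
            start = i
    chunks.append(data[start:])  # trailing chunk
    return chunks

# A one-byte insertion at the front shifts every later offset, so fixed-width
# chunking would produce disjoint chunk sets; here the boundaries re-align
# because they depend only on the surrounding bytes.
random.seed(0)
original = bytes(random.getrandbits(8) for _ in range(4000))
edited = b"X" + original  # the "typo fix"

shared = set(chunk(original)) & set(chunk(edited))
```

With the toy parameters above, only the chunks near the inserted byte differ; everything after the first re-aligned boundary is byte-for-byte shared between the two versions.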
A less contrived example might be large documents with embedded resources (e.g. a slideshow with images), or binary files with editable metadata (although I believe such formats tend to place the metadata at the end).
Having stared at the code for a bit, I'm not sure exactly why this tree structure is computed the way that it is.
The rationale for the Bytes schema object being recursive is that if the chunks of a file/bytes object were encoded as one flat linear sequence, there would be no possibility of reusing intermediate nodes between versions.
If I'm not mistaken, this logic organizes the tree structure based on the bit score precisely to retain the same kind of order insensitivity as the chunking itself. Using the bit score is a stochastic way to get a structure that is roughly balanced: nodes with progressively rarer bit scores are promoted to higher levels of the tree (though equally some other metric for grouping the chunks could be used, c.f. prolly trees in noms). I'm sure someone will correct me if I've missed something.
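My reading of the bit-score promotion can be sketched as a toy prolly-tree-style grouping. To be clear, this is not Perkeep's actual code: the node encoding, threshold, and grouping rule below are all invented for illustration; only the "promote chunks with rarer hash scores" idea comes from the discussion above.

```python
import hashlib

def node_hash(node) -> bytes:
    # Leaves are raw chunk bytes; internal nodes are tuples of children,
    # hashed over the concatenation of child hashes (a made-up canonical form).
    if isinstance(node, bytes):
        return hashlib.sha1(node).digest()
    return hashlib.sha1(b"".join(node_hash(c) for c in node)).digest()

def score(node) -> int:
    # "Bit score": number of trailing zero bits in the node's hash.
    # Higher scores are exponentially rarer, so they make natural
    # higher-level split points.
    h = int.from_bytes(node_hash(node), "big")
    return (h & -h).bit_length() - 1 if h else 160

def build(nodes, threshold=2):
    # Close a group whenever a node's score clears the threshold, then
    # recurse on the groups. The grouping depends only on content hashes,
    # never on the order in which edits were made.
    if len(nodes) <= 1:
        return nodes[0]
    parents, group = [], []
    for node in nodes:
        group.append(node)
        if score(node) >= threshold:
            parents.append(tuple(group))
            group = []
    if group:
        parents.append(tuple(group))
    if len(parents) >= len(nodes):  # defensive: guarantee progress
        return tuple(nodes)
    return build(parents, threshold)

def leaves(node):
    # Flatten the tree back to the original chunk sequence.
    if isinstance(node, bytes):
        yield node
    else:
        for child in node:
            yield from leaves(child)

chunks = [bytes([i]) * 8 for i in range(40)]
tree = build(chunks)
```

Because split points are chosen by content hash alone, editing one chunk only rewrites the nodes on the path from that chunk to the root; sibling subtrees keep their hashes and can be shared between versions, which is the same spirit as the prolly trees in noms.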