On 13.07.21 at 21:18, Rainer Weikusat wrote:
> I'm currently working on something supposed to enable matching names in
> DNS queries against a large "list of bad domains you don't want to be
> talking to". The test file I'm working with has 1,118,228
> entries. Reading all of the information in it into memory needs a little
> over 34M of RAM.
That is too much to make good use of the CPU caches. You should consider
more efficient data structures.
I wrote something like that about 25 years ago (domain & URL blacklists
& whitelists for security-critical environments).
Trie (not tree!) structures could significantly reduce the memory
footprint and avoid excessive random access to memory. You could use
single characters or even whole domain components such as the TLD as the
atoms of the trie.
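Roughly what I have in mind, as an untested C sketch (all the names like
find_or_add/insert/match are made up, and a real implementation would use
sorted arrays or small B trees instead of linked child lists): labels are
the atoms, and names are stored back to front so common suffixes share nodes.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* One node per domain label; children kept in a linked list
   (a sorted array or small B tree would be better). */
struct node {
    char *label;         /* e.g. "com", "evil" */
    struct node *child;  /* next label to the left */
    struct node *next;   /* sibling */
    int terminal;        /* a blacklist entry ends here */
};

/* Split "www.evil.com" into its labels (sketch: names < 256 bytes). */
static int split(const char *name, char *buf, char **labels)
{
    int n = 0;
    strcpy(buf, name);
    for (char *p = strtok(buf, "."); p; p = strtok(NULL, "."))
        labels[n++] = p;
    return n;
}

static struct node *find_or_add(struct node *parent, const char *label)
{
    struct node **p;
    for (p = &parent->child; *p; p = &(*p)->next)
        if (!strcmp((*p)->label, label))
            return *p;
    *p = calloc(1, sizeof **p);
    (*p)->label = strdup(label);
    return *p;
}

static void insert(struct node *root, const char *name)
{
    char buf[256], *labels[128];
    int n = split(name, buf, labels);
    while (n--)                       /* TLD first */
        root = find_or_add(root, labels[n]);
    root->terminal = 1;
}

/* A name is blocked if it or any parent domain is listed. */
static int match(struct node *root, const char *name)
{
    char buf[256], *labels[128];
    int n = split(name, buf, labels);
    while (n--) {
        struct node *c = root->child;
        while (c && strcmp(c->label, labels[n]))
            c = c->next;
        if (!c)
            return 0;
        if (c->terminal)
            return 1;  /* "evil.com" also blocks "www.evil.com" */
        root = c;
    }
    return 0;
}

int main(void)
{
    struct node root = {0};
    insert(&root, "evil.com");
    insert(&root, "ads.example.net");
    printf("%d %d\n", match(&root, "www.evil.com"),  /* 1 */
                      match(&root, "example.net"));  /* 0 */
    return 0;
}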
Furthermore, tree lookups have good cache efficiency compared to hash
tables, since every search starts at the same root, so the upper levels
stay cached.
In practice you need hybrid structures that decay to ordinary B trees at
the deeper levels, to avoid a large per-node overhead on long names.
Trie and B tree structures can also be combined. E.g. you could use
chunks of 4 bytes as trie atoms, e.g. ".com" or "n.de" at the top level.
These 32-bit values can be stored directly in the B tree nodes used as
dispatch tables. With a node size of 63 or 127, the overhead is only a
few bits per B tree entry, since an 8-bit value is sufficient to encode
node size and node type (inner node/leaf). And redundant trailing domain
parts are still deduplicated by the trie structure. Each next level then
contains the preceding 4 bytes of the domain name.
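Just to illustrate the chunking step (chunk_keys is an invented name; the
resulting 32-bit atoms would be the keys of the B tree dispatch nodes):

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Split a domain name, taken back to front, into 4-byte atoms that
   can be stored directly as 32-bit B tree keys. "www.evil.com"
   yields ".com", "evil", "www."; a final partial chunk is padded. */
static size_t chunk_keys(const char *name, uint32_t *keys, size_t max)
{
    size_t len = strlen(name), n = 0;
    while (len && n < max) {
        size_t take = len >= 4 ? 4 : len;
        unsigned char b[4] = {0};
        memcpy(b, name + len - take, take);
        keys[n++] = (uint32_t)b[0] | b[1] << 8 | b[2] << 16
                  | (uint32_t)b[3] << 24;
        len -= take;
    }
    return n;
}

int main(void)
{
    uint32_t keys[64];
    size_t n = chunk_keys("www.evil.com", keys, 64);
    for (size_t i = 0; i < n; i++)
        printf("%08x\n", (unsigned)keys[i]);  /* ".com", "evil", "www." */
    return 0;
}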
> A program using calloc to allocate the necessary
> storage ends up with an >49M heap due to memory wasted by GNU malloc on
> $something.
Did you allocate the storage as a single chunk? (not recommended)
Or did you allocate every string individually? In that case you get
considerable per-allocation overhead.
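If every string goes through malloc individually, glibc adds a chunk
header plus alignment padding per allocation, typically 16 or more bytes
each on a 64-bit system, which over a million short strings is roughly
the waste you are seeing. Packing the strings into one large region, as
you effectively did with mmap, avoids that. A sketch of such a string
pool (no growing, no freeing of individual strings):

#include <stdlib.h>
#include <string.h>

/* Many short strings in one big block instead of one malloc each:
   no per-chunk header or alignment padding. */
struct pool {
    char *base;
    size_t used, size;
};

static char *pool_strdup(struct pool *p, const char *s)
{
    size_t len = strlen(s) + 1;
    if (p->used + len > p->size)
        return NULL;               /* sketch: pool does not grow */
    char *dst = p->base + p->used;
    memcpy(dst, s, len);
    p->used += len;
    return dst;
}

int main(void)
{
    struct pool p = { malloc(1 << 20), 0, 1 << 20 };
    if (!p.base)
        return 1;
    char *a = pool_strdup(&p, "evil.com");
    char *b = pool_strdup(&p, "ads.example.net");
    /* b == a + 9: the strings sit back to back, zero overhead */
    (void)a; (void)b;
    free(p.base);
    return 0;
}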
> In contrast to this, using mmap to allocate an area of a suitable size
> and putting the structures and domain names there just wastes 2400
> bytes.
mmap operates directly at the MMU level, typically allocating 4 KiB
pages. It is efficient as long as it is not called too often or for
chunks that are too small.
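I.e. one large anonymous mapping for the whole data set is fine, many
tiny ones are not. A minimal sketch (assumes MAP_ANONYMOUS, i.e.
Linux/BSD):

#include <sys/mman.h>
#include <stdio.h>

int main(void)
{
    size_t size = 40 << 20;   /* one region sized for the whole list */
    void *mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    /* Pages are allocated lazily on first touch, so rounding the
       request up generously costs nothing until the memory is used. */
    munmap(mem, size);
    return 0;
}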
> The code will possibly also run on a modern "embedded system"/ SOHO
> router.
Then you should /really/ optimize the data structures. The CPUs of such
systems are usually not that fast, and even small delays in answering
DNS queries are annoying.
> While these are pretty powerful nowadays, 15M of RAM getting
> sucked into malloc for no particular purpose seems excessive to
> me.
34 MB for a domain blacklist on an embedded device sounds unreasonable to
me too. At the very least you should ensure that only a small part of
this memory is "hot", i.e. used for most of the comparisons, for
performance reasons.
Marcel