This is definitely a useful thing to have, and I've wanted it myself before. However, I'm generally of the opinion that we should avoid adding more collection manipulation functions that are unnecessarily specialized to one type of collection. I'd like to see functions that operate on all sequences rather than on hash tables alone. Here are some usages of hash-filter compared with some existing alternatives:
(hash-filter ht #:predicate pred)
==
; Sequence composition
(make-immutable-hash
 (sequence->list
  (sequence-filter (lambda (pair) (pred (cdr pair)))
                   (sequence-map cons ht))))
==
; For macros
(for/hash ([(k v) (in-hash ht)] #:when (pred v)) (values k v))
The for macro is probably what I'd reach for in standard Racket code. It has some drawbacks, though: the first time I tried to write it, I forgot to write (values k v) and instead just wrote v. Keeping track of pulling apart the key and value, naming them with variables, and putting them back together is a little annoying.
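To make that pitfall concrete, here's a minimal sketch. The table contents and the choice of even? as the predicate are made up for illustration. The second loop repeats my mistake: its body returns a single value where for/hash expects two, and as far as I can tell that only surfaces as a run-time error.

```racket
#lang racket/base

;; Illustrative data only; any hash table and predicate would do.
(define ht (hash 'a 1 'b 2 'c 3 'd 4))
(define pred even?)

;; Correct: the body returns two values, the key and the value.
(define filtered
  (for/hash ([(k v) (in-hash ht)]
             #:when (pred v))
    (values k v)))

;; The mistake: the body returns only v, so for/hash receives one
;; value where it expects two and raises an exception at run time.
(define mistake-raised?
  (with-handlers ([exn:fail? (lambda (e) #t)])
    (for/hash ([(k v) (in-hash ht)]
               #:when (pred v))
      v)
    #f))
```

Here filtered keeps only the entries whose values are even, and mistake-raised? records that the second loop failed instead of producing a table.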
The sequence composition approach is, in my humble opinion, completely unreadable and difficult to write to boot. The sequence-map function works totally fine on multivalued sequences, but sequence-filter only works on single-value sequences, so we have to do a dance with cons, car, and cdr to turn the multivalued sequence of keys and values into a single-valued sequence of key-value cons pairs. Furthermore, the only built-in function for building a hash table from a sequence is overly specialized to lists, so you have to copy the sequence into a list solely so you can turn that list into a hash table with make-immutable-hash.
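Spelled out step by step, the dance looks roughly like this. It's only a sketch: the table contents, the even? predicate, and the intermediate names (pairs, kept, as-list, result) are mine, introduced to label each stage.

```racket
#lang racket/base
(require racket/sequence)

;; Illustrative data only.
(define ht (hash 'a 1 'b 2 'c 3 'd 4))
(define pred even?)

;; Step 1: collapse the two-valued sequence of keys and values into a
;; single-valued sequence of (key . value) pairs.
(define pairs (sequence-map cons ht))

;; Step 2: filter the pairs, pulling the value back out with cdr.
(define kept
  (sequence-filter (lambda (pair) (pred (cdr pair))) pairs))

;; Step 3: copy the sequence into a list, because make-immutable-hash
;; only accepts a list of pairs.
(define as-list (sequence->list kept))

;; Step 4: finally build the hash table.
(define result (make-immutable-hash as-list))
```

Four steps and three intermediate shapes (pairs, a filtered sequence, a list) for what is conceptually a single filter.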
Instead of adding hash-filter to racket/hash, I think there are a few improvements we could make to the sequence composition side of things:
- Make sequence-filter allow multivalued sequences, provided the arity of the predicate is consistent with the arity of the sequence.
- Add a sequence->hash function that accepts a multivalued sequence of keys and values (exactly like what in-hash produces) and copies them into a hash table.
For the filter-by-value, filter-by-key, and filter-by-key-and-value cases, this lets us write:
(sequence->hash (sequence-filter (lambda (k v) (pred v)) ht)) ; Filter values
(sequence->hash (sequence-filter (lambda (k v) (pred k)) ht)) ; Filter keys
(sequence->hash (sequence-filter pred ht)) ; Filter key-value pairs
This is close enough to hash-filter to satisfy my desire for conciseness, while still being general enough to work with other kinds of key-value sequences.
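For anyone who wants to experiment with that shape before anything changes in racket/sequence, here's a rough sketch of how the two pieces could be approximated in user code. Both definitions are hypothetical: sequence->hash does not exist in Racket today, and sequence-filter/kv is a name I made up as a stand-in for a multivalued sequence-filter. I'm using in-generator from racket/generator to keep the filter lazy, which is just one possible implementation strategy and yields a single-use sequence.

```racket
#lang racket/base
(require racket/generator)

;; Hypothetical: copies a two-valued sequence of keys and values
;; (like the one in-hash produces) into an immutable hash table.
(define (sequence->hash seq)
  (for/hash ([(k v) seq])
    (values k v)))

;; Hypothetical stand-in for a sequence-filter that accepts two-valued
;; sequences. pred takes a key and a value. The result is a lazy,
;; single-use, two-valued sequence.
(define (sequence-filter/kv pred seq)
  (in-generator #:arity 2
    (for ([(k v) seq]
          #:when (pred k v))
      (yield k v))))

;; Illustrative data only.
(define ht (hash 'a 1 'b 2 'c 3 'd 4))

(define by-value
  (sequence->hash (sequence-filter/kv (lambda (k v) (even? v)) ht)))

(define by-key
  (sequence->hash (sequence-filter/kv (lambda (k v) (memq k '(a b))) ht)))
```

Since neither function mentions hash tables in its input, the same two definitions should work on any two-valued sequence, for example one built with in-parallel.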
Shameless self plug: you might be interested in Rebellion's take on this, which uses transducers and key-value structs called entries:
(transduce (in-hash-entries ht) (filtering-values pred) #:into into-hash) ; Filter values
(transduce (in-hash-entries ht) (filtering-keys pred) #:into into-hash) ; Filter keys
(transduce (in-hash-entries ht) (filtering (lambda (e) (pred (entry-key e) (entry-value e)))) #:into into-hash) ; Filter key-value entries