What you're planning makes sense to me, and that article is indeed good advice. Since you're in Python, I think you can just use % to mod your hash values into a given range when you need to, without worrying about the number of bits involved (and this should be faster than converting to a float/double and back). Something like:
def in_range(hash_value, lower_bound, upper_bound):
    # % binds tighter than +, so this maps the hash into [lower_bound, upper_bound)
    return lower_bound + hash_value % (upper_bound - lower_bound)
If you're using a high-quality hash function like the xxHash mentioned in that article, the chance that this modulus brings out a pattern in the results should be pretty low, and you can always run some tests or look at the source if you're worried about that. In that case you could also use something like this to select from a list:
person['surname'] = surname_list[in_range(<hash computation>, 0, len(surname_list))]
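To make that concrete, here's a minimal sketch of the whole lookup. I'm using the standard-library hashlib as a stand-in for xxHash, and stable_hash, seed, and the field names are just placeholder assumptions for whatever stable IDs and data you actually have:

import hashlib

def stable_hash(*parts):
    # Join the parts into one string and hash it deterministically;
    # any strong hash (xxHash, SHA-256, ...) works the same way here.
    data = '|'.join(str(p) for p in parts).encode('utf-8')
    return int.from_bytes(hashlib.sha256(data).digest()[:8], 'big')

surname_list = ['Smith', 'Jones', 'Garcia']  # placeholder data
seed = 42                                    # whatever per-world seed you use
person = {'id': 12345}

person['surname'] = surname_list[
    in_range(stable_hash(seed, person['id'], 'surname'), 0, len(surname_list))
]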
Another technique that might be useful at some point, if name collisions are too frequent for your liking, would be to use a hash value to seed a shuffle of your lists and then index directly by ID, which makes collisions rare or even impossible depending on how long your name lists are and whether they all have the same length.
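Here's a rough sketch of that idea, reusing the placeholder stable_hash, seed, and surname_list from above: seed a random.Random with a hash value, shuffle a copy of the list once per group, and then index by ID:

import random

def shuffled_copy(items, seed_value):
    # Deterministic shuffle: the same seed always produces the same order.
    rng = random.Random(seed_value)
    result = list(items)
    rng.shuffle(result)
    return result

# One shuffled list per group; if the list is at least as long as the range
# of IDs within a group, direct indexing gives collision-free names, and
# otherwise the % just wraps around as before.
group_surnames = shuffled_copy(surname_list, stable_hash(seed, 'group-7'))
person['surname'] = group_surnames[person['id'] % len(group_surnames)]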
Just an implementation note: I tend to write functions like "def surname(person_id, group_id):" to wrap the logic up in one place where I can re-use it across contexts, and then just call that function in the for loop. That way it's easy to change internal hash values or re-compute a person's name in another part of the code (as long as the correct seeds are available) without running into copy/paste bugs. Writing things this way also forces you to define things as pure functions and makes it obvious exactly which inputs are needed, so you won't accidentally rely on a global variable that later causes weird bugs when you decide to change it.
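As a sketch of what I mean (again reusing the hypothetical stable_hash, in_range, and surname_list from above, with 'id', 'group', and people standing in for however your records are actually structured):

def surname(person_id, group_id, seed=42):
    # Everything the name depends on is an explicit argument, so the same
    # name can be recomputed anywhere the right seeds are available.
    idx = in_range(stable_hash(seed, group_id, person_id, 'surname'),
                   0, len(surname_list))
    return surname_list[idx]

for person in people:
    person['surname'] = surname(person['id'], person['group'])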
-Peter Mawhorter