Recommended data structure / field types for handling multiple/historical email addresses


Matt Chambers

Apr 18, 2018, 1:46:32 PM
to open source deduplication
Hey gang,

I'm looking to add matching capability for multiple/historic email addresses, and I'm curious about best practices around field types and data structure for handling this. I've been experimenting with the "Set" variable type, as my intuition is that this should allow for sensible matching between records like:
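A rough sketch of the nested shape in question, assuming each record keeps all of its known emails in one tuple under a single field. The record contents, the "emails" field name, and the jaccard helper are all made up for illustration. The dict-style field definition is an assumption about how the Set type would be declared and may differ by library version. The Jaccard overlap is only a stand-in to show why set-valued fields seem like a natural fit; it is not a claim about what the Set comparator actually computes.

```python
# Hypothetical records: every email seen for a person is kept in one tuple.
records = {
    "a": {"name": "Pat Example",
          "emails": ("pat@old.example", "pat@new.example")},
    "b": {"name": "Patricia Example",
          "emails": ("pat@new.example", "p.example@work.example")},
    "c": {"name": "Sam Other",
          "emails": ("sam@other.example",)},
}

# Assumed dict-style field definition for a set-valued field
# (the exact form depends on the library version in use).
fields = [{"field": "emails", "type": "Set"}]


def jaccard(left, right):
    """Fraction of distinct values shared by both collections (0.0 to 1.0)."""
    left, right = set(left), set(right)
    if not left and not right:
        return 0.0
    return len(left & right) / len(left | right)


# "a" and "b" share one of three distinct addresses; "a" and "c" share none.
overlap_ab = jaccard(records["a"]["emails"], records["b"]["emails"])
overlap_ac = jaccard(records["a"]["emails"], records["c"]["emails"])
```

With this shape, two records that share a single current or historical address get a non-zero overlap without adding any rows to the dataset.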


But I'm not entirely clear on the impact of using the Set variable type, or on whether it is the ideal matching method. Alternatively, I've considered un-nesting emails like below, but given that there are also historical phone numbers, addresses, etc., I'm concerned about the impact on dataset size with this approach:
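To put a number on the dataset-size concern, here is a minimal sketch that assumes un-nesting means one flat row per combination of values across all multi-valued fields. The person record, its field names, and the unnest helper are hypothetical. A different un-nesting scheme, such as one row per value per field, would grow the data more slowly.

```python
from itertools import product

# Hypothetical person with several historical values per field.
person = {
    "id": 1,
    "emails": ("pat@old.example", "pat@new.example"),
    "phones": ("555-0100", "555-0101", "555-0102"),
    "addresses": ("1 First St", "2 Second Ave"),
}


def unnest(record, multi_fields):
    """Yield one flat row per combination of values in the multi-valued fields."""
    # Single-valued fields (here just "id") are copied onto every row.
    base = {k: v for k, v in record.items() if k not in multi_fields}
    for combo in product(*(record[f] for f in multi_fields)):
        yield {**base, **dict(zip(multi_fields, combo))}


multi = ("emails", "phones", "addresses")
rows = list(unnest(person, multi))
row_count = len(rows)  # 2 emails * 3 phones * 2 addresses
```

Under that assumption the row count is the product of the value counts, so one person with 2 emails, 3 phones, and 2 addresses becomes 12 rows, and every additional historical field multiplies it again.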



Any thoughts or guidance would be sincerely appreciated.