From Slack: https://puppetcommunity.slack.com/archives/CFD8Z9A4T/p1552586549522900 I wrote a Puppet language function that, given an array of hashes, builds a hash counting how many times each value of one key occurs, and then returns all of the values where the count is > 1. This was the code:
# Find count of records with duplicates:
$id_count = $system_info.reduce({}) |$results, $rec| {
  $id = $rec['system_id']
  $current = $results[$id].lest || { 0 }
  $results + {$id => $current + 1}
}

# Find IDs with count > 1:
$duplicate_ids = $id_count.filter |$id, $count| { $count > 1 }.keys

On my set of records, which was around 1800 or so I think, it took more than two minutes to run, per the profile data. I converted it to Ruby and it was, of course, super fast. The Ruby I ended up with was:
def find_dupes_system_info(system_info)
  res = Hash.new(0)
  system_info.each { |sys| res[sys['system_id']] += 1 }
  res.reject { |_id, count| count == 1 }.keys
end

155.4891 seconds (Puppet) vs 0.0137 seconds (Ruby)
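For anyone who wants to try the Ruby version outside of Puppet, here is a rough sketch of how it could be exercised and timed. The sample records, the `hostname` key, and the synthetic `sys-N` IDs are all made up for illustration; only the `system_id` key and the function itself come from the message above. Timings will differ from the numbers quoted, since those came from Puppet's profile data on the real records.

```ruby
require 'benchmark'

# Same function as in the message above, repeated so this sketch runs on its own.
def find_dupes_system_info(system_info)
  res = Hash.new(0)                                       # missing keys start at 0
  system_info.each { |sys| res[sys['system_id']] += 1 }   # count records per ID
  res.reject { |_id, count| count == 1 }.keys             # keep IDs seen more than once
end

# Hypothetical sample records; only 'system_id' matters to the function.
sample = [
  { 'system_id' => 'a1', 'hostname' => 'web01' },
  { 'system_id' => 'b2', 'hostname' => 'web02' },
  { 'system_id' => 'a1', 'hostname' => 'web03' },
  { 'system_id' => 'c3', 'hostname' => 'db01' },
  { 'system_id' => 'c3', 'hostname' => 'db02' },
]
dupes = find_dupes_system_info(sample)
puts dupes.inspect   # prints ["a1", "c3"]

# Synthetic data set of roughly the size mentioned above (1800 records).
# With i % 1700, IDs 0..99 appear twice and the rest appear once.
records = Array.new(1800) { |i| { 'system_id' => "sys-#{i % 1700}" } }
big_dupes = nil
elapsed = Benchmark.realtime { big_dupes = find_dupes_system_info(records) }
puts "#{big_dupes.size} duplicate ids found in #{elapsed.round(4)} seconds"
```

The Ruby version mutates a single `Hash.new(0)` in place, whereas the Puppet version builds a new hash with `$results + {...}` on every iteration. That difference may account for part of the gap, though the thread does not confirm where the time actually went.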