
Searching without retaining all results


Chris Ridd

Aug 3, 2017, 12:15:02 PM
to perl...@perl.org
I had a requirement to extract 30+ million entries from an LDAP server, and naively thought that using callbacks would be useful.

While callbacks do hand me each result quickly, I didn't realise that Search still stores every result in the search object itself. So after reading several million entries, Linux's OOM killer kindly killed a) the LDAP server and then b) my script to try and get back some memory!

I ended up patching Search.pm to avoid capturing each result when using callbacks. Is there a better way to do this?

Cheers,

Chris

Frank Swasey

Aug 3, 2017, 1:00:06 PM
to Chris Ridd, perl...@perl.org
Was there a reason not to use ldapsearch or slapcat to extract the entries of interest or the entire database into an LDIF file and then process with Net::LDAP::LDIF?
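
For reference, a minimal sketch of the LDIF route, assuming the export already exists as a file. Net::LDAP::LDIF reads one entry per read_entry call, so only the current entry needs to be held in memory. The helper name process_ldif and the file name passed on the command line are hypothetical, chosen for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Net::LDAP::LDIF;

# Read an LDIF file one entry at a time and hand each entry to $handler.
# Returns the number of entries read. (process_ldif is a hypothetical helper.)
sub process_ldif {
    my ($path, $handler) = @_;

    # onerror => 'die' makes parse errors fatal instead of silent.
    my $ldif  = Net::LDAP::LDIF->new($path, 'r', onerror => 'die');
    my $count = 0;

    while (not $ldif->eof) {
        my $entry = $ldif->read_entry or next;
        $handler->($entry);    # $entry is a Net::LDAP::Entry
        $count++;
    }
    $ldif->done;
    return $count;
}

# Usage: perl this_script.pl export.ldif
if (@ARGV) {
    my $n = process_ldif($ARGV[0], sub { print $_[0]->dn, "\n" });
    print STDERR "$n entries read\n";
}
```

Whether this is fast enough for tens of millions of entries is not something the sketch answers; it only shows that memory use does not have to grow with the file size.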

Brandon Hume

Aug 3, 2017, 1:00:06 PM
to perl...@perl.org
On 8/3/2017 12:59 PM, Chris Ridd wrote:
> I ended up patching Search.pm to avoid capturing each result when using callbacks. Is there a better way to do this?

Were you using paged results? I've certainly never had to manage 30M+
results, but switching to paged changed a script that checks over ~300k
results from a multi-gig RSS process to something that uses barely over
100M.
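
A sketch of what a paged search can look like with Net::LDAP::Control::Paged, following the loop shown in that module's documentation. It assumes the server supports the paged results control. The helper name paged_search, the page size of 1000, the anonymous bind, and the filter are illustrative assumptions, not details from this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Net::LDAP;
use Net::LDAP::Control::Paged;
use Net::LDAP::Constant qw(LDAP_CONTROL_PAGED);

# Run a search in pages of $page_size entries, handing each entry to $handler.
# Returns the number of entries seen. (paged_search is a hypothetical helper.)
sub paged_search {
    my ($ldap, $page_size, $handler, %search_args) = @_;

    my $page  = Net::LDAP::Control::Paged->new(size => $page_size);
    my $count = 0;

    while (1) {
        my $mesg = $ldap->search(%search_args, control => [$page]);
        die 'search failed: ' . $mesg->error . "\n" if $mesg->code;

        # Each page is its own search object, so at most one page of
        # entries is held at a time.
        for my $entry ($mesg->entries) {
            $handler->($entry);
            $count++;
        }

        # The response control carries the cookie for the next page.
        my ($resp) = $mesg->control(LDAP_CONTROL_PAGED) or last;
        my $cookie = $resp->cookie;
        last unless defined $cookie && length $cookie;    # no more pages
        $page->cookie($cookie);
    }
    return $count;
}

# Usage: perl this_script.pl HOST BASE_DN
if (@ARGV == 2) {
    my ($host, $base) = @ARGV;
    my $ldap = Net::LDAP->new($host) or die "$@\n";
    my $bind = $ldap->bind;    # anonymous bind (assumption)
    die 'bind failed: ' . $bind->error . "\n" if $bind->code;

    my $n = paged_search($ldap, 1000, sub { print $_[0]->dn, "\n" },
        base => $base, scope => 'sub', filter => '(objectClass=*)');
    print STDERR "$n entries\n";
    $ldap->unbind;
}
```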

Graham Barr

Aug 3, 2017, 1:00:07 PM
to Chris Ridd, perl...@perl.org
Did you call $mesg->pop_entry; in your callback? See http://search.cpan.org/~marschap/perl-ldap-0.65/lib/Net/LDAP/FAQ.pod#USING_THE_CALLBACK_SUBROUTINE_APPROACH for an example.

Graham.
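
A minimal sketch of the callback approach with pop_entry, along the lines of the FAQ entry linked above rather than a copy of it. The search object appends each entry to its internal list before invoking the callback, so popping inside the callback removes the entry that was just delivered. The helper name streaming_search, the anonymous bind, and the filter are assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Net::LDAP;

# Stream search results through $handler without letting the search object
# accumulate them. (streaming_search is a hypothetical helper.)
sub streaming_search {
    my ($ldap, $handler, %search_args) = @_;
    my $count = 0;

    my $mesg = $ldap->search(
        %search_args,
        callback => sub {
            my ($search, $entry) = @_;

            # The callback is also invoked without an entry when the search completes.
            return unless defined $entry;
            # References arrive as Net::LDAP::Reference objects; this sketch skips them.
            return unless $entry->isa('Net::LDAP::Entry');

            $handler->($entry);
            $count++;

            # Drop the entry just added to the search object so it can be freed.
            $search->pop_entry;
        },
    );
    die 'search failed: ' . $mesg->error . "\n" if $mesg->code;
    return $count;
}

# Usage: perl this_script.pl HOST BASE_DN
if (@ARGV == 2) {
    my ($host, $base) = @ARGV;
    my $ldap = Net::LDAP->new($host) or die "$@\n";
    my $bind = $ldap->bind;    # anonymous bind (assumption)
    die 'bind failed: ' . $bind->error . "\n" if $bind->code;

    my $n = streaming_search($ldap, sub { print $_[0]->dn, "\n" },
        base => $base, scope => 'sub', filter => '(objectClass=*)');
    print STDERR "$n entries\n";
    $ldap->unbind;
}
```

With this pattern the search object returned at the end holds the result code but no entries, so anything needed from an entry has to be handled inside the callback.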

Chris Ridd

Aug 3, 2017, 1:45:02 PM
to Brandon Hume, perl...@perl.org
Good idea!

Chris Ridd

Aug 3, 2017, 1:45:02 PM
to Frank Swasey, perl...@perl.org


> On 3 Aug 2017, at 17:35, Frank Swasey <Frank....@uvm.edu> wrote:
>
> Was there a reason not to use ldapsearch or slapcat to extract the entries of interest or the entire database into an LDIF file and then process with Net::LDAP::LDIF?

I was wondering whether at some point exporting the database would make more sense (NB: not OpenLDAP), but parsing an enormous LDIF might not prove very performant either.

Chris

Chris Ridd

Aug 3, 2017, 1:45:02 PM
to Graham Barr, perl...@perl.org
Aha, that's exactly what I was missing. Thanks Graham!

Chris