Proper way to clean/purge the cache?


Peter Valdemar Mørch

Sep 23, 2016, 1:03:19 AM
to Perl-Cache Discuss, Peter Valdemar Mørch
I have a cache that looks like this:

    {
         "driver" : "File",
         "root_dir": "/some/dir",
         "expires_in" : "5 minutes",
         "namespace" : "MyNamespace"
    }

Now, 24 hours later, it takes up 18 GB, most of it older than 5 minutes.

So how do I purge the cache efficiently?

The documentation says:

purge( )

Remove all entries that have expired from the namespace associated with this cache instance. Warning: May be very inefficient, depending on the number of keys and the driver.


OK, so what is the efficient or intended way to keep the file system tidy by removing the keys and values for expired keys? I didn't see any pointers in CHI::Driver::File either. Manually searching the file system for files older than 5 minutes means doing part of the job myself that I was hoping CHI would do. It also wouldn't handle the case where somebody passes a per-key $expires_in, as in $cache->set($key, $data, $expires_in).

I don't see any API method to continuously and periodically maintain/clean the cache of expired data, since it clearly looks like this needs doing. The only candidate looks like the "very inefficient" purge(). I'm hoping for a solution that has the same API for all drivers, so that the clean up code doesn't need to know the details of the CHI driver configuration.

Should I just use $cache->purge() and ignore the "inefficient" warning? If so, won't that be bad for e.g. memcached-based caches, especially since memcached already does efficient purging itself?

Or did I misunderstand something? What is the intended way to keep the cache "clean"?

Peter

Swartz, Jonathan

Sep 23, 2016, 1:08:39 AM
to perl-cach...@googlegroups.com, Peter Valdemar Mørch
Just traverse the cache file hierarchy and remove files based on their last modification date. You can do this with Unix find or File::Find.
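
A minimal sketch of that approach with File::Find, for illustration only. The name purge_old_files is made up here, and the sketch assumes that every regular file under the given directory is a cache entry that may be deleted once its modification time is older than the chosen age:

```perl
use strict;
use warnings;
use File::Find ();

# Remove regular files under $root whose mtime is more than $max_age
# seconds in the past. Returns the number of files removed.
# ASSUMPTION: every regular file under $root is a cache entry that is
# safe to delete once it is older than $max_age.
sub purge_old_files {
    my ( $root, $max_age ) = @_;
    my $cutoff  = time() - $max_age;
    my $removed = 0;
    File::Find::find(
        {
            no_chdir => 1,
            wanted   => sub {
                my $path = $File::Find::name;
                my @st = lstat($path) or return;    # file vanished meanwhile
                return unless -f _;                 # skip directories and symlinks
                return if $st[9] >= $cutoff;        # mtime is still fresh
                $removed++ if unlink $path;
            },
        },
        $root
    );
    return $removed;
}
```

With the configuration from the original post this would be called as purge_old_files('/some/dir', 5 * 60), for example from a cron job. It only looks at modification times, so it knows nothing about per-key expiry values.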

--
You received this message because you are subscribed to the Google Groups "Perl-Cache Discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to perl-cache-disc...@googlegroups.com.
To post to this group, send email to perl-cach...@googlegroups.com.
Visit this group at https://groups.google.com/group/perl-cache-discuss.
For more options, visit https://groups.google.com/d/optout.

Peter V. Mørch

Sep 23, 2016, 8:10:09 AM
to Perl-Cache Discuss, pe...@morch.com
On Friday, September 23, 2016 at 7:08:39 AM UTC+2, Jonathan Swartz wrote:
> Just traverse the cache file hierarchy and remove files based on their last modification date. You can do this with Unix find or File::Find.

I can see from the base-class implementation of CHI::Driver::purge(), that it calls $self->get_keys and removes those where $obj->is_expired.

CHI::Driver::File::get_keys uses File::Find. $fileCache->purge() has the added advantage over raw/simple File::Find that it also handles specific values of $expires_in given to $cache->set($key, $data, $expires_in), which a raw File::Find wouldn't be able to handle.

I can also see that e.g. CHI::Driver::Memcached::Base has:

    __PACKAGE__->declare_unsupported_methods(
        qw(dump_as_hash get_keys get_namespaces is_empty clear purge));

So it seems that purge() isn't actually so inefficient after all. Since these two cases are my use cases for now, what I'm doing now is:

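    # purge() dies on drivers that declare it unsupported (e.g. memcached),
    # so trap that one error and rethrow anything else.
    eval {
        $cache->purge();
    };
    if (my $err = $@) {
        if ($err !~ /^method 'purge' not supported/) {
            die $err;
        }
    }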

But it would be nice to remove the warning in the documentation about inefficiency, since the code is as efficient as possible, and to document that purge() throws an exception in pathological cases such as memcached.

So for now I can scratch my own itch by using a method documented as inefficient, but which in reality is either OK or throws an exception.

Swartz, Jonathan

Sep 23, 2016, 9:51:34 AM
to perl-cach...@googlegroups.com, pe...@morch.com
It is still far less efficient than simply looking at the file modification time - you have to open each file and convert it to a cache entry. But if it works for you, great! I’d still keep the warning.

Thanks
Jon

Peter V. Mørch

Sep 23, 2016, 11:00:29 AM
to perl-cach...@googlegroups.com
Damn! I missed that. Yikes! It doesn't just look at the file names/paths but has to open each file and read_file its entire contents. I now see that the warning is entirely warranted! ;-)

I guess all this together means that CHI itself doesn't implement a way to keep a file cache from eventually using the entire disk, other than the inefficient $cache->purge(), which has to read the entire cache contents.

The recommendation specifically for CHI::Driver::File is to implement your own with File::Find, but you'll have to know details about your application. E.g. if it uses the $expires_in parameter for $cache->set() calls, the custom purge code needs to know the maximum $expires_in in use, and then you can only purge entries older than that if you want it to be efficient. (Or, I guess since this is a *cache* after all, one could just wipe the odd entry with a longer $expires_in and force it to be recalculated.) For other drivers, other custom purge code.
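
To make the "maximum $expires_in" rule concrete, here is a rough sketch. The helper names safe_max_age and is_certainly_expired are hypothetical, and it assumes that the application knows every TTL it uses (in seconds) and that a file's modification time is the time of the last set() for that key:

```perl
use strict;
use warnings;
use List::Util qw(max);

# Largest TTL (in seconds) the application ever uses, including the
# cache-wide default. ASSUMPTION: the caller supplies this list; nothing
# here discovers it from the cache itself.
sub safe_max_age {
    my (@ttls_in_seconds) = @_;
    die "need at least one TTL\n" unless @ttls_in_seconds;
    return max(@ttls_in_seconds);
}

# True if a file last written at $mtime must have expired by $now,
# whichever of the known TTLs it was stored with.
sub is_certainly_expired {
    my ( $mtime, $now, $max_age ) = @_;
    return ( $now - $mtime ) > $max_age ? 1 : 0;
}
```

Entries stored with a shorter TTL linger until they pass the longest one, which is the price of not opening the files.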

I have to say that this surprises me. I would have thought that every single user of CHI would need efficient garbage collection / a way to continuously keep the total cache size in check; including handling custom $expire_in and whatnot.

I'll think about what to do now.

Thanks for your input, Jon! I also want to say that I'm very grateful for CHI - it is such an improvement over what we used before!

Peter

--
You received this message because you are subscribed to a topic in the Google Groups "Perl-Cache Discuss" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/perl-cache-discuss/6pATPgkySAM/unsubscribe.
To unsubscribe from this group and all its topics, send an email to perl-cache-discuss+unsub...@googlegroups.com.
To post to this group, send email to perl-cache-discuss@googlegroups.com.



--
Peter Valdemar Mørch

Head of Development
Mobile: +45 4062 6296
CapMon A/S
Lyskær 9, 1.th., 2730 Herlev
Web: www.capmon.dk

Perrin Harkins

Sep 23, 2016, 11:16:14 AM
to perl-cach...@googlegroups.com
I suspect most users are not caching 18GB every 24 hours. And many are using drivers other than File.

Breaking the encapsulation (by writing your own file-expiring cron job) seems like a pretty common tradeoff to make when you need unusual performance. An alternative would be to write your own hybrid driver that keeps the metadata in a more efficient storage system, like BerkeleyDB or MySQL, while using files for the actual cached data.

Aristotle Pagaltzis

Sep 23, 2016, 4:08:49 PM
to perl-cach...@googlegroups.com
* Peter V. Mørch <p...@capmon.dk> [2016-09-23 14:12]:
> eval {
> $cache->purge();
> };
> if (my $err = $@) {
> if ($err !~ /^method 'purge' not supported/) {
> die $err;
> }
> }

Please replace with this:

    if ( my $sub = $cache->can('purge') ) { $cache->$sub }

Peter Valdemar Mørch

Sep 24, 2016, 6:18:58 AM
to perl-cach...@googlegroups.com

Will do!



Peter V. Mørch

Oct 4, 2016, 6:09:02 AM
to Perl-Cache Discuss

On Friday, September 23, 2016 at 10:08:49 PM UTC+2, Aristotle Pagaltzis wrote:
> Please replace with this:
>
>   if ( my $sub = $cache->can('purge') ) { $cache->$sub }

That would've been nice, actually, but it seems that a method declared via __PACKAGE__->declare_unsupported_methods() still makes $cache->can("method") return true.

I tried this:

    my $cache = CHI->new(%$cacheConfig);
    if ($cache->can('purge')) {
        eval {
            $cache->purge();
        };
        if ($@) {
            print "Error from purge: $@";
        }
    }

and got:

    Error from purge: method 'purge' not supported by 'CHI::Driver::Memcached::Base' at /bla/Bla.pm line X.

So I'll stick with what I had.
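
That result is consistent with the unsupported method existing as a real method that simply dies when called, which is why can() finds it. For illustration, here is a small self-contained sketch of the trap-and-rethrow pattern. My::NoPurgeCache and My::PurgingCache are hypothetical stand-ins rather than CHI classes, and try_purge is just a name chosen here:

```perl
use strict;
use warnings;

# HYPOTHETICAL stand-in for a driver that declares purge() unsupported:
# the method exists (so can() is true) but dies when called.
package My::NoPurgeCache;
sub new   { return bless {}, shift }
sub purge { die "method 'purge' not supported by '" . ref(shift) . "'\n" }

# HYPOTHETICAL stand-in for a driver where purge() works.
package My::PurgingCache;
sub new   { return bless { purged => 0 }, shift }
sub purge { $_[0]{purged}++; return }

package main;

# Call purge(); return 1 if it ran, 0 if the driver does not support it.
# Any other error is rethrown unchanged.
sub try_purge {
    my ($cache) = @_;
    my $ok = eval { $cache->purge(); 1 };
    return 1 if $ok;
    my $err = $@;
    return 0 if $err =~ /^method 'purge' not supported/;
    die $err;
}
```

The return value lets a periodic cleanup job log which caches were actually purged and which rely on the backend's own expiry.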