Alternative Caching Strategies

Elliot Ali

Apr 19, 2020, 5:57:42 AM
to htmlpurifier
It looks possible, but how are alternative caching strategies used in practice? I have written a Redis-backed implementation that works, but I can't work out how to inject it. It's a shame that the config requires a "short name" for a previously registered class rather than just the class name, which would be much easier all round!

Keith Davis

Apr 19, 2020, 6:50:52 AM
to htmlpurifier
We just created a wrapper class and cache the output using a hash of the config and the input string as the key.
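
The wrapper approach described above might look roughly like the following. This is only a sketch under stated assumptions, not Keith's actual code: the class name CachedPurifier, the $configId string, and the in-memory array (standing in for a Redis client) are all hypothetical, and strip_tags is used purely as a runnable placeholder for a real purifier call such as [$purifier, 'purify'].

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical output-caching wrapper.
 * $purify stands in for a callable such as [$htmlPurifier, 'purify'];
 * the $store array stands in for a Redis (or other) key-value client.
 */
final class CachedPurifier
{
    /** @var callable */
    private $purify;
    private string $configId;
    /** @var array<string, string> */
    private array $store = [];
    public int $misses = 0;

    public function __construct(callable $purify, string $configId)
    {
        $this->purify = $purify;
        // Any string that changes whenever the purifier configuration
        // changes, e.g. a hash of the serialized settings (assumption).
        $this->configId = $configId;
    }

    public function purify(string $html): string
    {
        // The key covers both the configuration and the input, so a config
        // change never serves output produced under the old settings.
        $key = hash('sha256', $this->configId . "\0" . $html);

        if (!array_key_exists($key, $this->store)) {
            $this->misses++;
            $this->store[$key] = ($this->purify)($html);
        }
        return $this->store[$key];
    }
}

// Placeholder purifier for demonstration only; strip_tags is NOT a sanitizer.
$cached = new CachedPurifier('strip_tags', 'config-v1');
$first  = $cached->purify('<b>hello</b>');
$second = $cached->purify('<b>hello</b>'); // served from the cache
```

Hashing the configuration together with the input keeps entries from different configurations apart, at the cost of one cache entry per distinct input string.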

Elliot Ali

Apr 19, 2020, 10:13:25 AM
to htmlpurifier
I don't believe that it's that simple? Although I will probably do that too.

My understanding is that htmlpurifier caches the definitions, as this speeds up processing for every purification. So where, in the config, you normally specify "Serializer" or null for the definition cache implementation (Cache.DefinitionImpl, documented at http://htmlpurifier.org/live/configdoc/plain.html#Cache.DefinitionImpl), it implies you can specify something else (by first making your own implementation). I think I've achieved the latter part, but can't see how to set it up.

Maybe you're right, and caching the output is all that is required (thinking about it you might be spot on) but still - the functionality is there, so there must be a way to use it!

Edward Z. Yang

Apr 19, 2020, 11:31:26 PM
to Elliot Ali, htmlpurifier
Assuming that you're actually interested in caching definitions in Redis
(and not just the full end-to-end output; if that's what you want, just
write a wrapper class, as Keith suggested), there's just one extra
step you have to do. Given a subclass of HTMLPurifier_DefinitionCache
(let's call it MyDefinitionCache), you write:

$factory = HTMLPurifier_DefinitionCacheFactory::instance();
$factory->register('MyCache', 'MyDefinitionCache'); // short name, class name

and now set Cache.DefinitionImpl to 'MyCache'.
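
As for where this goes: the registration presumably belongs in application bootstrap code, before the HTMLPurifier object is created. A minimal sketch, assuming the library is loaded through HTMLPurifier.auto.php, that MyDefinitionCache lives in a hypothetical MyDefinitionCache.php, and that the factory class is HTMLPurifier_DefinitionCacheFactory as it appears in the HTML Purifier source tree (worth checking against your installed version):

```php
<?php
// Paths are assumptions; adjust to your installation.
require_once 'HTMLPurifier.auto.php';
require_once 'MyDefinitionCache.php'; // hypothetical file defining MyDefinitionCache

// 1. Register the custom cache under a short name.
$factory = HTMLPurifier_DefinitionCacheFactory::instance();
$factory->register('MyCache', 'MyDefinitionCache'); // short name, class name

// 2. Point the configuration at that short name.
$config = HTMLPurifier_Config::createDefault();
$config->set('Cache.DefinitionImpl', 'MyCache');

// 3. Build the purifier from that configuration and use it as usual.
$purifier  = new HTMLPurifier($config);
$dirtyHtml = '<p onclick="x()">Hello</p>'; // example input
$cleanHtml = $purifier->purify($dirtyHtml);
```

The factory appears to look the class up without triggering autoloading, so loading MyDefinitionCache explicitly before the first purify() call is likely the safer choice.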

Edward

Elliot Ali

Apr 20, 2020, 12:17:53 AM
to htmlpurifier
Hi Edward,

Thank you for that. I think Keith may be right, but it's context dependent - the source doesn't change much. If it did I might still need to implement definition caching properly.

What I can't see is where and how to use the code you've proposed?

Elliot.

Keith Davis

Apr 21, 2020, 8:39:44 AM
to htmlpurifier
Do you mean my caching code, or are you referring to Edward's comments?

Elliot Ali

Apr 21, 2020, 10:15:12 AM
to htmlpu...@googlegroups.com
Sorry Edward - your code.
