Using Ehcache for streaming access to large media


Martin Petzold

Nov 20, 2019, 5:56:23 AM
to ehcache-dev
Dear all.

We are using Ehcache in production already. We also use it to cache media (with disk tier only!). However, with larger media (videos) we run into memory problems.

We have implemented our own Serializer for our media types. This works well, but every time we get an entry it seems to be loaded fully into memory. Is this correct?

I'm not sure whether a mapped file channel is used internally for persistent data. However, we would like to get access to some handle (a file channel) for this entry so that we can load (map) only segments into memory, using FileChannel.map(FileChannel.MapMode mode, long position, long size).
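
For reference, mapping only a segment of a plain file looks roughly like the sketch below. SegmentReader and mapSegment are illustrative names, and the file here is an ordinary file on disk, not a handle obtained from Ehcache.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of Ehcache.
final class SegmentReader {
    private SegmentReader() {}

    /**
     * Maps only the bytes in [position, position + size) of the file,
     * clamped to the end of the file, instead of reading the whole file.
     */
    static MappedByteBuffer mapSegment(Path file, long position, long size) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long fileSize = channel.size();
            if (position < 0 || size < 0 || position > fileSize) {
                throw new IllegalArgumentException("segment outside of file");
            }
            long length = Math.min(size, fileSize - position);
            // The mapping remains valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, position, length);
        }
    }
}
```

Only the requested region is mapped, so a large video can be consumed window by window without holding all of it on the heap.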

Is there any internal API we could use in order to get access to the underlying file or file channel?

Is there any other way to prevent Ehcache from loading the entry fully into memory?

Thanks and kind regards,

Martin

Chris Dennis

Nov 20, 2019, 10:27:15 AM
to ehcac...@googlegroups.com

When you read from the disk store it will call your Serializer to recover the in-memory representation of your cached value. So yes, with a simple Serializer implementation it will load the entry fully into memory. This is mainly because adopting a lazy operating mode would require implementing a closeable entry object, so that Ehcache would know when it was safe to dispose of (and/or reuse) the storage used by the entry. By detaching the value under lock within the entry read path we avoid this complexity.

 

There are a number of approaches I can think of that might help you here, but some of them are going to be pretty invasive within Ehcache. It might, however, be possible to do this by storing the large files in a lookaside structure and only storing a reference to this secondary structure in the cache.

 

What you would do here is build a class that is both a CacheEventListener and a Serializer. When asked to serialize an entry, it stores the large payload in a single file on the filesystem and returns a marker object containing the path to that file, which is what gets stored in the cache. When asked to deserialize, it reads the path from the cache and returns whatever access object is needed onto that file (a FileChannel, for example). You then rely on the CacheEventListener to trap removals from the cache and clean up unused files. Depending on your tolerance for corner cases, you may need to implement some kind of reference tracking for these returned objects/cache entries so that you can delay deleting the file until everyone has dropped their references (both the cache itself and your returned handles). You might be able to lean on the OS to do this for you, though.
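
A minimal sketch of that lookaside idea follows, kept free of Ehcache imports so it stands alone. MediaRef, LookasideMediaStore and onRemoved are hypothetical names. The serialize and read methods only mirror the shape of a Serializer, and onRemoved stands in for whatever the CacheEventListener would call on removal, eviction or expiry. It also assumes serialize is invoked once per stored entry; if it is called more often, the extra copies would need their own cleanup.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.util.UUID;

/** Hypothetical value type: the cache only ever sees a pointer to the media file. */
final class MediaRef {
    final Path file;

    MediaRef(Path file) {
        this.file = file;
    }

    /** Callers open a channel on demand and map or read only what they need. */
    FileChannel open() throws IOException {
        return FileChannel.open(file, StandardOpenOption.READ);
    }
}

/** Hypothetical lookaside store; not an Ehcache class. */
final class LookasideMediaStore {
    private final Path directory;

    LookasideMediaStore(Path directory) {
        this.directory = directory;
    }

    /** Serialize step: copy the payload aside and return only its path as the marker. */
    ByteBuffer serialize(MediaRef value) {
        try {
            Path target = directory.resolve(UUID.randomUUID() + ".media");
            Files.copy(value.file, target);
            return ByteBuffer.wrap(target.toString().getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Deserialize step: rebuild the reference from the marker without touching the payload. */
    MediaRef read(ByteBuffer marker) {
        String path = StandardCharsets.UTF_8.decode(marker.duplicate()).toString();
        return new MediaRef(Paths.get(path));
    }

    /** Cleanup step, to be called when the cache reports that the entry is gone. */
    void onRemoved(MediaRef removed) {
        try {
            Files.deleteIfExists(removed.file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

This sketch deletes the file immediately and does no reference tracking. On POSIX systems a channel that is already open keeps working after the file is unlinked, which is one way the OS can help; other platforms may behave differently.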

 

Chris


martin.rich...@googlemail.com

Nov 20, 2019, 11:30:51 AM
to ehcache-dev

Dear Chris,


Thanks for your suggestion. This could be a solution, and I had it in mind. However, with this approach the size reported by the cache will not match the actual size of the cached data. I am using statistics and also need to know the real cache size.

Couldn't it be a valid feature to get some sort of streaming and/or selective (region) access to the underlying data?
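
To illustrate the size concern: with a lookaside directory, the bytes held outside the cache would have to be counted separately and added to whatever the cache statistics report. A rough sketch, assuming all lookaside files sit directly in one directory (LookasideSize and bytesOnDisk are hypothetical names):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper, not part of Ehcache.
final class LookasideSize {
    private LookasideSize() {}

    /** Sums the sizes of the regular files directly inside the lookaside directory. */
    static long bytesOnDisk(Path directory) throws IOException {
        try (Stream<Path> files = Files.list(directory)) {
            return files.filter(Files::isRegularFile)
                        .mapToLong(p -> p.toFile().length())
                        .sum();
        }
    }
}
```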


Kind regards,


Martin
