JPA L1 Cache Confusion

Daryl Stultz

Jun 18, 2010, 3:00:11 PM

to Ebean ORM

Hey,

I got tripped up by this in my OpenJPA project. I've got object "A"
with child collection "B". The B's are loaded but don't necessarily
represent what's in the database (possibly stale). Within the
lifecycle of one EM (one HTTP Request) I change a property on A and
"save" it (merge). I keep working with the original instance of A, not
the one returned from the merge. After saving A I need to check some
business rules on the B collection. Rather than use the loaded
collection on A, I run a new query to get the B's supposedly fresh
from the database. It seems the query to get the B's is actually
getting them from the A collection of B's. Not exactly, but that's
what it seems like. If I call EM.clear() after the save, before
querying the B collection, it works fine. I think what's happening is
that the merge on A is putting a copy of A and its child B's in the L1
cache. Then the query to load the B's is getting them from the L1
cache instead of the database. I've simplified things a bit and could
probably design my data model better, but this all leads me to Ebean.
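
A minimal sketch of the suspected mechanism, assuming the persistence context behaves as an identity map: a query still reads the row, but when an instance with that id is already managed, the managed instance is returned without being refreshed. `ToyEntityManager`, `B`, and the in-memory "database" map are hypothetical stand-ins written against the JDK only. This is not the OpenJPA API, and whether merge is really what populated the cache in the case above remains a guess.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: ToyEntityManager is a hypothetical stand-in, not the JPA API.
class L1CacheSketch {

    /** A child row "B" reduced to an id and one value. */
    static final class B {
        final int id;
        String value;
        B(int id, String value) { this.id = id; this.value = value; }
    }

    static final class ToyEntityManager {
        private final Map<Integer, String> database;         // simulated table of B rows
        private final Map<Integer, B> l1 = new HashMap<>();  // persistence context (identity map)

        ToyEntityManager(Map<Integer, String> database) { this.database = database; }

        /** "Query" by id: the row is read, but an already-managed instance wins. */
        B find(int id) {
            String row = database.get(id);       // the simulated SQL still runs
            if (row == null) return null;
            B managed = l1.get(id);
            if (managed != null) return managed; // returned as-is, state not refreshed
            B loaded = new B(id, row);
            l1.put(id, loaded);
            return loaded;
        }

        /** Detach everything, so the next find() builds a fresh instance. */
        void clear() { l1.clear(); }
    }

    public static void main(String[] args) {
        Map<Integer, String> db = new HashMap<>();
        db.put(1, "old");
        ToyEntityManager em = new ToyEntityManager(db);

        em.find(1);        // B is now managed
        db.put(1, "new");  // the row changes behind the persistence context

        System.out.println(em.find(1).value);  // old
        em.clear();
        System.out.println(em.find(1).value);  // new
    }
}
```

In this model the stale value only disappears once the context is cleared, which matches the observation that calling EM.clear() before the query fixes the problem.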

I'm wondering if/how I might run into a similar problem with Ebean.
Does Ebean have an L1 cache? Even if not, with the L2 cache on,
perhaps the same thing might happen? What sort of scenarios might lead
Ebean to pull from the cache in such a way that the cache does not
represent what's in the database? (Aside from modifications via JDBC
or some external process.)

Thanks.

/Daryl

Rob Bygrave

Jun 18, 2010, 7:22:54 PM

to eb...@googlegroups.com

> Does Ebean have an L1 cache?

Yes. It is also known as the "Persistence Context".

> What sort of scenarios might lead
> Ebean to pull from the cache in such a way that the cache does not
> represent what's in the database?

The Persistence Context (L1 Cache) is transaction scoped... so as long
as the query used a different transaction it could not use the
original L1 cache. Technically the Persistence Context does live
beyond the end of a transaction as it is used for lazy loading - but
it is not used for any 'new queries'.

Advanced Note: We could actually make the Persistence Context longer
lived than transaction scoped ("JPA Extended Persistence Context") but
there has been no need to date and I don't foresee one. (JPA on the
other hand does need this).

> Even if not, with the L2 cache on

Yes. The L2 cache is not strictly 'read consistent' with the DB. That
is, the L2 cache is updated after the successful DB commit so there is
a small time lag between those 2 events where a different thread can
hit the cache and get a 'stale' bean from the cache.
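
A minimal sketch of that lag window, written as a single-threaded toy so the ordering is deterministic. The class and method names (`L2LagSketch`, `commit`, `updateCacheAfterCommit`, `read`) are hypothetical and use only the JDK; this is not how Ebean's cache is implemented, it only models "DB commit first, cache update second".

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: a hypothetical two-step write, not Ebean's cache implementation.
class L2LagSketch {

    final Map<Integer, String> database = new HashMap<>();
    final Map<Integer, String> l2Cache = new HashMap<>();

    /** Step 1: the database commit succeeds. */
    void commit(int id, String value) { database.put(id, value); }

    /** Step 2: only afterwards is the L2 cache brought up to date. */
    void updateCacheAfterCommit(int id) { l2Cache.put(id, database.get(id)); }

    /** A reader that trusts the cache and falls back to the database on a miss. */
    String read(int id) {
        String cached = l2Cache.get(id);
        return cached != null ? cached : database.get(id);
    }

    public static void main(String[] args) {
        L2LagSketch s = new L2LagSketch();
        s.commit(1, "v1");
        s.updateCacheAfterCommit(1);

        s.commit(1, "v2");                      // committed, cache not yet updated
        System.out.println(s.read(1));          // v1 (stale: inside the lag window)
        System.out.println(s.database.get(1));  // v2 (reading the DB is consistent)

        s.updateCacheAfterCommit(1);
        System.out.println(s.read(1));          // v2
    }
}
```

In a real system the second step runs on the committing thread while other threads read, so the stale read is a matter of timing rather than something that happens on every write.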

With the Lucene Integration coming ... you can think of the L2 cache
being backed by Lucene - and at this stage I have made this
non-transactional (like SOLR and unlike Compass).

The net net is that if you need to guarantee read consistency you
should be hitting the DB and NOT using the L2 cache (L2 bean cache or
Lucene indexes). Alternatively you could use a 'transactional cache'
with the extra costs of 2PC etc., but personally I'd rather lean on the
DB for read consistency and live with the non-strict L2 cache.

That said, there are a lot of cases where we don't need strict read
consistency and want to use the L2 bean cache or Lucene indexes for
the benefits of performance and horizontal scalability.

This is a big subject ... hopefully that made some sense.

Cheers, Rob.

Daryl Stultz

Jun 18, 2010, 8:00:38 PM

to Ebean ORM

On Jun 18, 7:22 pm, Rob Bygrave <robin.bygr...@gmail.com> wrote:
> > Does Ebean have an L1 cache?
>
> Yes. It is also known as the "Persistence Context".

> The Persistence Context (L1 Cache) is transaction scoped... so as long
> as the query used a different transaction it could not use the
> original L1 cache. Technically the Persistence Context does live

Ok, so saving a bean doesn't create a new persistence context that the
saved bean would be added to. I think this is what's happening in my
JPA case.

> Yes. The L2 cache is not strictly 'read consistent' with the DB. That
> is, the L2 cache is updated after the successful DB commit so there is
> a small time lag between those 2 events where a different thread can
> hit the cache and get a 'stale' bean from the cache.

Yes, that makes sense, that's behavior I would expect.
It looks like there are some handy methods for clearing the cache so
that would help in unusual situations.

Out of curiosity (it will be a long time before I'm using Ebean with
L2 cache), do you run the root SQL query against the database to get
primary keys, then either pull a bean from the L2 cache, if present,
or populate from result sets? What about children? Suppose you have
object A which is read consistent with the DB and in the L2 cache.
Your query would return this object so it's fetched from the cache.
But the children of A (B's) are stale. Do you assume the B's are
current or reload them? I'm not sure this is really the question I
want to ask. I might be thinking about inverse relation management. If
object A is in the cache with B's loaded. I instantiate a new B and
set A as the parent and save it - but I don't add it to the collection
of B's under A. If A comes back from a query and gets pulled from the
L2 cache, will it contain the new B in its collection? (Really the
same goes for 1to1 parent-child relationships as well.)

Thanks.

/Daryl

Rob Bygrave

Jun 19, 2010, 8:37:51 PM

to eb...@googlegroups.com

> so saving a bean doesn't create a new persistence context

Correct.

> Out of curiosity (it will be a long time before I'm using Ebean with
> L2 cache), do you run the root SQL query against the database to get
> primary keys, then either pull a bean from the L2 cache, if present,
> or populate from result sets?

The L2 cache is a bunch of bean caches (1 Map per type with the Id as
the key) ... and a bunch of query caches (1 Map per type with the
query hash as the key).
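
A minimal sketch of that layout, assuming nothing beyond what is described above: one map per bean type keyed by Id, and one map per bean type keyed by a query hash. All names here (`L2LayoutSketch`, `putBean`, `putQuery`, the use of `String.hashCode()` as the query hash) are hypothetical and are not Ebean's internal classes.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical names, not Ebean's internal classes.
class L2LayoutSketch {

    static final class EntityA {
        final int id;
        EntityA(int id) { this.id = id; }
    }

    // One bean cache per type: Id -> bean.
    final Map<Class<?>, Map<Object, Object>> beanCaches = new HashMap<>();

    // One query cache per type: query hash -> cached result list.
    final Map<Class<?>, Map<Integer, List<Object>>> queryCaches = new HashMap<>();

    void putBean(Class<?> type, Object id, Object bean) {
        beanCaches.computeIfAbsent(type, t -> new HashMap<>()).put(id, bean);
    }

    Object getBean(Class<?> type, Object id) {
        Map<Object, Object> cache = beanCaches.get(type);
        return cache == null ? null : cache.get(id);
    }

    void putQuery(Class<?> type, int queryHash, List<Object> result) {
        queryCaches.computeIfAbsent(type, t -> new HashMap<>()).put(queryHash, result);
    }

    List<Object> getQuery(Class<?> type, int queryHash) {
        Map<Integer, List<Object>> cache = queryCaches.get(type);
        return cache == null ? null : cache.get(queryHash);
    }

    public static void main(String[] args) {
        L2LayoutSketch l2 = new L2LayoutSketch();
        EntityA a = new EntityA(42);
        int queryHash = "name = ?".hashCode();  // stand-in for a real query hash

        l2.putBean(EntityA.class, 42, a);
        l2.putQuery(EntityA.class, queryHash, Collections.singletonList(a));

        System.out.println(l2.getBean(EntityA.class, 42) == a);         // true
        System.out.println(l2.getQuery(EntityA.class, queryHash).size()); // 1
        System.out.println(l2.getBean(String.class, 42));               // null: caches are per type
    }
}
```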

The bean cache currently is used for find by Id, getReference and
joins. In the "joins" case the final object graphs are a mix of
data from the resultset and the L2 cache.

> But the children of A (B's) are stale.

If we hit the cache with read-only=false then the beans returned from
the L2 cache are flat so their children will be loaded on demand. If
you hit the cache with read-only=true then Ebean can give you shared
instances and the cached bean could have children loaded (that are all
marked as 'shared instances' and immutable).

With Lucene it will be a bit different in that we can effectively have
the local Lucene index act like a DB materialised view for many
queries (depending on the query and the expressions used) and we can
either use the "bean cache" or not.

I don't think this fully answers your question though...

Daryl Stultz

Jun 19, 2010, 9:01:41 PM

to Ebean ORM

On Jun 19, 8:37 pm, Rob Bygrave <robin.bygr...@gmail.com> wrote:
> The bean cache currently is used for find by Id, getReference and
> joins. In the "joins" case the final object graphs are a mix of
> data from the resultset and the L2 cache.

Hmm, not quite sure I understand. When you say "joins" do you mean
"fetch"?

> > But the children of A (B's) are stale.
>
> If we hit the cache with read-only=false then the beans returned from
> the L2 cache are flat so their children will be loaded on demand.

Ok, that's good to know. Does that apply to single associations as
well?

> I don't think this fully answers your question though...

That's ok, like I said, I'm a long way from using L2.

/Daryl

Rob Bygrave

Jun 20, 2010, 5:38:16 AM

to eb...@googlegroups.com

> When you say "joins" do you mean "fetch"?

Yes.

> Does that apply to single associations as well?

Yes. Both *ToOne and *ToMany are loaded on demand.

Daryl Stultz

Jun 28, 2010, 12:28:37 PM

to Ebean ORM

On Jun 20, 5:38 am, Rob Bygrave <robin.bygr...@gmail.com> wrote:
>
> Yes. Both *ToOne and *ToMany are loaded on demand.

"loaded on demand" but still possible to load from L2 cache, right?

So there's no "object graph" in the L2 cache, just flat lists of
objects that can be re-assembled?

/Daryl

Rob Bygrave

Jun 28, 2010, 4:29:42 PM

to eb...@googlegroups.com

The Objects in the L2 cache can be object graphs.

All the beans making up those object graphs are marked as "sharedInstance"s, meaning they are read-only and any children are read-only. So if you lazy load a *ToMany relationship of a 'sharedInstance' then that object is no longer flat, and all the children are also marked as 'sharedInstance'.

This is only for the case where you are hitting the L2 bean cache with read-only. If you hit the L2 cache wanting writable/updatable beans, then the bean returned to you is not a 'sharedInstance'; a (flat) copy is made and returned instead.
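
A minimal sketch of the two cases, assuming a cache that hands out one shared, unmodifiable graph for read-only hits and a flat copy (children not loaded) otherwise. The classes `SharedInstanceSketch` and `A`, the `readOnly` flag, and the use of `null` to model "not loaded yet" are hypothetical; this is not Ebean's bean cache.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical classes, not Ebean's bean cache.
class SharedInstanceSketch {

    static final class A {
        final int id;
        final boolean sharedInstance;  // true: read-only, may be handed to many callers
        final List<String> children;   // null models "not loaded yet" (lazy)

        A(int id, boolean sharedInstance, List<String> children) {
            this.id = id;
            this.sharedInstance = sharedInstance;
            this.children = children;
        }
    }

    private final Map<Integer, A> beanCache = new HashMap<>();

    /** Cache a bean graph as a shared, unmodifiable instance. */
    void put(int id, List<String> children) {
        List<String> frozen = Collections.unmodifiableList(new ArrayList<>(children));
        beanCache.put(id, new A(id, true, frozen));
    }

    /** Read-only hits get the shared graph; otherwise a flat, writable copy. */
    A find(int id, boolean readOnly) {
        A cached = beanCache.get(id);
        if (cached == null) return null;
        return readOnly ? cached : new A(cached.id, false, null);
    }

    public static void main(String[] args) {
        SharedInstanceSketch cache = new SharedInstanceSketch();
        cache.put(1, Arrays.asList("b1", "b2"));

        A shared = cache.find(1, true);
        A copy = cache.find(1, false);
        System.out.println(shared.sharedInstance + " " + shared.children);  // true [b1, b2]
        System.out.println(copy.sharedInstance + " " + copy.children);      // false null
    }
}
```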

Daryl Stultz

Jul 17, 2010, 10:44:48 AM

to Ebean ORM

On Jun 18, 7:22 pm, Rob Bygrave <robin.bygr...@gmail.com> wrote:
> > Does Ebean have an L1 cache?
>
> Yes. It is also known as the "Persistence Context".

Given this JPA scenario:

List<EntityA> list1 = em.createQuery(q1).getResultList();
List<EntityA> list2 = em.createQuery(q2).getResultList();

There is a single persistence context across the 2 queries, so any
common entities (root EntityA or any associations) will come from the
L1 cache.

In Ebean:

List<EntityA> list1 =
    Ebean.find(EntityA.class).where()...q1...findList();
List<EntityA> list2 =
    Ebean.find(EntityA.class).where()...q2...findList();

There is no "entity manager" to tie the 2 queries to the same
persistence context. These 2 queries will each have their own PC, yes?
Therefore no L1 cache benefit, correct?

/Daryl

Rob Bygrave

Jul 17, 2010, 7:36:04 PM

to eb...@googlegroups.com

Depends on the transaction demarcation.

When do your transaction(s) start and end? I ask because the Persistence Context is by default transaction scoped.
I assume you do not want to use the "Query Cache"?

Daryl Stultz

Jul 19, 2010, 9:13:30 AM

to Ebean ORM

On Jul 17, 7:36 pm, Rob Bygrave <robin.bygr...@gmail.com> wrote:
> When do your transaction(s) start and end?

I don't open transactions around a series of select statements, but I
think I see what you are saying: if I do open a transaction, the
second query can benefit from there being only one PC for the 2
queries.

> I assume you do not want to use the "Query Cache"?

My app will be in transition from JDBC to Ebean for quite some time.
The JDBC code modifies the database, so no caching until it's all
Ebean.

/Daryl

Rob Bygrave

Jul 20, 2010, 3:56:36 AM

to eb...@googlegroups.com

>> second query can benefit from there being only
>> one PC for the 2 queries.

Yes.

So just to expand ... Ebean DOES have a Persistence Context and it is "Transaction Scoped". There has not yet been a need to actually enable "public" access to the Persistence Context and "share" it across multiple transactions (if you cast to a SpiTransaction you can get/set the PersistenceContext but that is not "public" API yet).
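
A minimal sketch of what "Transaction Scoped" means for the two-query example earlier in the thread, assuming a query run outside an explicit transaction gets its own implicit transaction and therefore its own context. `ToyServer`, `ToyTransaction`, `begin()` and `end()` are hypothetical JDK-only stand-ins, not Ebean's API.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: these classes are hypothetical and are not Ebean's API.
class TxScopedContextSketch {

    static final class Bean {
        final int id;
        Bean(int id) { this.id = id; }
    }

    /** Each transaction owns exactly one persistence context. */
    static final class ToyTransaction {
        final Map<Integer, Bean> persistenceContext = new HashMap<>();
    }

    static final class ToyServer {
        private ToyTransaction current;  // null when no explicit transaction is open

        void begin() { current = new ToyTransaction(); }
        void end() { current = null; }

        /** A query only ever sees the context of the transaction it runs in. */
        Bean find(int id) {
            // Assumption: without an explicit transaction, each query gets its own.
            ToyTransaction tx = (current != null) ? current : new ToyTransaction();
            return tx.persistenceContext.computeIfAbsent(id, key -> new Bean(key));
        }
    }

    public static void main(String[] args) {
        ToyServer server = new ToyServer();

        // Two queries, no explicit transaction: two contexts, two instances.
        System.out.println(server.find(1) == server.find(1));  // false

        // Two queries inside one transaction: one context, one instance.
        server.begin();
        System.out.println(server.find(1) == server.find(1));  // true
        server.end();
    }
}
```

Wrapping both queries in one transaction is what lets the second query reuse beans loaded by the first in this model.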
