How to use an ExecutorService with a JpaPersistModule?


jbl

May 13, 2015, 5:58:10 AM5/13/15
to google...@googlegroups.com
Hello, 


I'm relatively new to Guice, and I've got a problem that I don't quite understand.

I have a webapp based on Guice, EclipseLink and MySQL that runs on Tomcat 7. The app uses a com.google.inject.persist.jpa.JpaPersistModule and a com.google.inject.persist.PersistFilter.

For some requests the webapp creates a set of java.lang.Runnable, passes them to a java.util.concurrent.ExecutorService, and returns. The runnables are numerous time-consuming jobs that must be executed in the background to avoid blocking the request. Also, each job has to access the EntityManager. However, the current implementation is quite naive:

    @Inject private Provider<MyRunnable> myRunnableProvider;
    @Inject private ExecutorService executorService;

    MyRunnable runnable = myRunnableProvider.get();
    executorService.submit(runnable);

This code runs fine on my development computer, but I get this exception when I run it in a production environment:

java.lang.IllegalStateException: Work already begun on this thread. Looks like you have called UnitOfWork.begin() twice without a balancing call to end() in between.
    at com.google.common.base.Preconditions.checkState(Preconditions.java:150) ~[guava-15.0.jar:na]
    at com.google.inject.persist.jpa.JpaPersistService.begin(JpaPersistService.java:73) ~[guice-persist-4.0.jar:na]
    at com.jbl.MyRunnable.run(MyRunnable.java:107)

I think that this has something to do with the Scope, but I don't quite understand how to correct this.  

Thank you for any insight on this!

Laszlo Ferenczi

May 13, 2015, 7:16:57 AM5/13/15
to google...@googlegroups.com
Hi,

I think the issue will be in the MyRunnable class.
You create the MyRunnable instance in the HTTP worker thread. If you simply inject an EntityManager there, it will be the one bound to the current thread (the HTTP worker). After submitting the runnable to a different thread you essentially leak the transaction, which can have many side effects. For example, when the same HTTP worker is selected again, the PersistFilter will try to start a new transaction, but one is already open for that thread, held by the long-running background job.

The correct way to handle this is to inject a Provider<EntityManager> into MyRunnable and annotate the methods that use the database with @Transactional.

--
L


--
You received this message because you are subscribed to the Google Groups "google-guice" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-guice...@googlegroups.com.
To post to this group, send email to google...@googlegroups.com.
Visit this group at http://groups.google.com/group/google-guice.
To view this discussion on the web visit https://groups.google.com/d/msgid/google-guice/141ea4f0-3f34-45c9-a8cf-c28779859a2d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

JB Lézoray

May 13, 2015, 8:35:45 AM5/13/15
to google...@googlegroups.com
Hi Laszlo,

Thank you for your answer.

I think you pinpointed the problem, and I'm one step closer to resolving it.

Here is the MyRunnable implementation; it already relies on two providers, for EntityManager and UnitOfWork.
But now my question is: how do I make sure that the Provider<EntityManager> in the MyRunnable worker returns a different EntityManager from the one in MyServletWorker?




// A time consuming job.
class MyRunnable implements Runnable {

  @Inject Provider<EntityManager> entityManagerProvider;

  @Inject Provider<UnitOfWork> unitOfWorkProvider;

  @Override
  public void run()
  {
    UnitOfWork unitOfWork = null;
    EntityManager em = null;
    try {
      unitOfWork = unitOfWorkProvider.get();
      unitOfWork.begin();
      em = entityManagerProvider.get();
      this.doTimeConsumingStuff(em);
    } finally {
      if (unitOfWork != null) unitOfWork.end();
      if (em != null && em.isOpen()) em.close(); // should it be explicitly closed?
    }
  }

  @Transactional
  private void doTimeConsumingStuff(EntityManager em) {/* ... */}
}



// Executed within the HTTP worker thread.
// It creates and submits the jobs.
class MyServletWorker {

  @Inject Provider<EntityManager> entityManagerProvider;

  @Inject ExecutorService executorService;

  public void createAndSubmitJobs(MyData data) {
    EntityManager em = entityManagerProvider.get();
    Set<MyRunnable> jobs = createJobs(em, data);
    for (MyRunnable job : jobs)
      executorService.submit(job);
    // ....
  }

  @Transactional
  public Set<MyRunnable> createJobs(EntityManager em, MyData data) {/* ... */}
}




Stephan Classen

May 13, 2015, 8:41:19 AM5/13/15
to google...@googlegroups.com
First of all: guice-persist has some known bugs that haven't been fixed for a long time. It looks like the extension is no longer maintained. Have a look at
http://onami.apache.org/persist/
It is conceptually built on guice-persist, and migrating existing code should be very easy.

Regarding your problem:
The persist filter follows the J2EE recommendation, which states that you should not start a new thread in a web request. Therefore it assumes that only the HTTP thread is used.
If you start your own worker threads, there are two possible scenarios.

1. Each worker has a task that is independent of the others. It is OK if some workers commit their work and others roll back.

2. All or some of the tasks are related, and either everything is committed or everything is rolled back.

Scenario 1 is the simple one. All you need to do is wrap your worker code in the following snippet:
try {
  unitOfWork.begin();
  doActualJob();
} finally {
  unitOfWork.end();
}
This is required since every thread that uses the entity manager must be in an active unit of work. The persist filter does exactly the above for the HTTP thread.
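
The wrapping for scenario 1 can be factored into a small decorator so that every submitted job gets the same begin/end handling. This is only a minimal sketch: the `UnitOfWork` interface below is a hypothetical stand-in that mirrors just the `begin()` and `end()` methods named in this thread (in a real application you would inject `com.google.inject.persist.UnitOfWork` instead), and `UnitOfWorkRunnable`, `RecordingUnitOfWork` and `UnitOfWorkRunnableDemo` are illustrative names, not part of guice-persist.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in exposing only the two methods discussed in this thread.
interface UnitOfWork {
    void begin();
    void end();
}

// Decorator: runs the delegate inside a unit of work on whichever thread executes it.
final class UnitOfWorkRunnable implements Runnable {
    private final UnitOfWork unitOfWork;
    private final Runnable delegate;

    UnitOfWorkRunnable(UnitOfWork unitOfWork, Runnable delegate) {
        this.unitOfWork = unitOfWork;
        this.delegate = delegate;
    }

    @Override
    public void run() {
        try {
            unitOfWork.begin();
            delegate.run();
        } finally {
            // Always end, even if the job throws, so the pooled thread is left clean.
            unitOfWork.end();
        }
    }
}

// Illustrative fake that records the order of calls.
final class RecordingUnitOfWork implements UnitOfWork {
    final List<String> calls = new ArrayList<>();
    @Override public void begin() { calls.add("begin"); }
    @Override public void end() { calls.add("end"); }
}

final class UnitOfWorkRunnableDemo {
    // Returns the recorded call order for a job that fails midway.
    static List<String> callsWhenJobFails() {
        RecordingUnitOfWork uow = new RecordingUnitOfWork();
        Runnable failing = () -> {
            uow.calls.add("job");
            throw new IllegalStateException("job failed");
        };
        try {
            new UnitOfWorkRunnable(uow, failing).run();
        } catch (IllegalStateException expected) {
            uow.calls.add("caught");
        }
        return uow.calls;
    }

    public static void main(String[] args) {
        System.out.println(callsWhenJobFails()); // prints [begin, job, end, caught]
    }
}
```

With something like this in place, the submitting side would only need `executorService.submit(new UnitOfWorkRunnable(unitOfWork, job))`, and the job itself would not have to know about the unit of work.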

Scenario 2 is much more complicated, because guice-persist and onami-persist both expect only one thread to participate in a unit of work. Therefore I will not go into the details of how to implement this here.
Ask on this mailing list if you need it and I can help you.

Stephan Classen

May 13, 2015, 8:51:34 AM5/13/15
to google...@googlegroups.com
The persist framework takes care that every thread has its own entity manager.
So you don't need to do this yourself.
Calling unitOfWork.end() will close the entity manager for you, so there is also no need to close it manually.

The UnitOfWork is a singleton, so you can inject it directly and don't need to use the provider.
You must use the provider for the EntityManager, as you do.

Also important:
@Transactional works using interception at runtime. Therefore it can only be used on non-private, non-final and non-static methods. Guice implements interception by generating a subclass that overrides the annotated method, and private, final or static methods cannot be overridden, so calls to them cannot be intercepted at runtime.
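
To see why only overridable methods can be intercepted, it may help to write by hand the kind of subclass that Guice generates at runtime. The sketch below is plain Java with no Guice involved; `Job`, `InterceptedJob` and `InterceptionDemo` are hypothetical names, and the "tx begin"/"tx commit" log entries merely stand in for whatever a transactional interceptor would do.

```java
import java.util.ArrayList;
import java.util.List;

class Job {
    final List<String> log = new ArrayList<>();

    // Overridable: a subclass can wrap this call.
    public void publicWork() { log.add("publicWork"); }

    // Not overridable: private methods are invisible to subclasses.
    private void privateWork() { log.add("privateWork"); }

    void runBoth() {
        publicWork();   // virtual call, dispatches to the subclass override
        privateWork();  // always bound to Job.privateWork
    }
}

// Hand-written equivalent of an interception subclass (illustration only).
class InterceptedJob extends Job {
    @Override
    public void publicWork() {
        log.add("tx begin");
        try {
            super.publicWork();
        } finally {
            log.add("tx commit");
        }
    }

    // A new, unrelated method rather than an override; runBoth() never calls it.
    private void privateWork() { log.add("never intercepted"); }
}

final class InterceptionDemo {
    static List<String> run() {
        Job job = new InterceptedJob();
        job.runBoth();
        return job.log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [tx begin, publicWork, tx commit, privateWork]
    }
}
```

The private method runs without any surrounding "transaction", which is the same reason a private @Transactional method is silently left unwrapped.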

JB Lézoray

May 13, 2015, 11:11:22 AM5/13/15
to Stephan Classen, google...@googlegroups.com
Hi Stephan,

I followed your advice, and it seems that I no longer get the error when I inject a UnitOfWork instead of a Provider<UnitOfWork>.

As far as I understand it, if a MyRunnable gets executed by the ExecutorService before the end of the MyServletWorker, then it receives a copy of the existing UnitOfWork instead of a new instance.
Is it a bug or did I misunderstand something? I relied on a Provider<UnitOfWork> because I wanted the UnitOfWork to be created when the Runnable is executed rather than when the Runnable is instantiated.

Thank you for your answers, it has saved me *a lot* of debugging time.


// A time consuming job.
class MyRunnable implements Runnable {

  @Inject Provider<EntityManager> entityManagerProvider;

  @Inject UnitOfWork unitOfWork;

  @Override
  public void run()
  {
    try {
      unitOfWork.begin();
      EntityManager em = entityManagerProvider.get();
      this.doTimeConsumingStuff(em);
    } finally {
      unitOfWork.end();
    }
  }

  @Transactional
  public void doTimeConsumingStuff(EntityManager em) {/* ... */}
}




Stephan Classen

May 13, 2015, 5:59:42 PM5/13/15
to google...@googlegroups.com
Almost.

UnitOfWork is in the singleton scope. This means exactly one instance exists for the entire injector. So it does not matter whether you inject it directly or retrieve it from a provider: you will always receive the very same (and only) instance.

The EntityManager, on the other hand, has a life cycle. Therefore it is very important to control the time of its creation. This can be done using the provider.
The life cycle of the entity manager is controlled by the unit of work. Because the unit of work is a singleton and controls many entity managers, it must somehow map a call to begin() or end() to a single entity manager. This is done via the current thread: every call to the unit of work only affects the entity manager of the calling thread.

An entity manager must only be retrieved after the unit of work has been started and may no longer be used after the unit of work has ended. One of the open issues with guice-persist is that retrieving an entity manager outside of a unit of work does not throw an exception but implicitly starts the unit of work. This will almost always lead to the "Work already begun on this thread" exception later on. But by the time the exception occurs, it is no longer possible to determine where the unit of work was started.
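
The delayed failure can be reproduced with nothing but a ThreadLocal. The sketch below is a simplified stand-in for a thread-bound persist service, written from the behaviour described in this thread rather than from guice-persist's source; `ThreadBoundPersistService` and `ImplicitBeginDemo` are hypothetical names, and a plain `Object` stands in for the EntityManager.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Simplified stand-in for a thread-bound persist service (assumption, not library code).
final class ThreadBoundPersistService {
    // One "entity manager" slot per thread.
    private final ThreadLocal<Object> current = new ThreadLocal<>();

    void begin() {
        if (current.get() != null) {
            throw new IllegalStateException("Work already begun on this thread.");
        }
        current.set(new Object());
    }

    // Fetching outside a unit of work silently starts one instead of failing.
    Object get() {
        if (current.get() == null) {
            begin();
        }
        return current.get();
    }

    void end() {
        current.remove();
    }
}

final class ImplicitBeginDemo {
    // Runs two jobs in sequence on the same pooled thread and reports how the second one fares.
    static String run() {
        ThreadBoundPersistService service = new ThreadBoundPersistService();
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Job 1 fetches the entity manager without begin() and never calls end().
            pool.submit(() -> { service.get(); }).get();
            // Job 2 is well behaved, but the thread still carries job 1's unit of work.
            Future<?> second = pool.submit(() -> {
                try {
                    service.begin();
                } finally {
                    service.end();
                }
            });
            second.get();
            return "ok";
        } catch (ExecutionException e) {
            return e.getCause().getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints Work already begun on this thread.
    }
}
```

Note that the exception surfaces in the second, correctly written job, while the first job that caused it has long since finished. That is why the stack trace alone does not reveal where the unit of work was started.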

So once again: consider onami-persist, since it is based on guice-persist but improved with the lessons learned from it.

And one piece of advice on the code snippet you posted. As a rule of thumb: never store the entity manager in a field, and never pass the entity manager to anything (this helps avoid storing it in a field). If you need to pass it to another class, pass the provider instead. Or even better, have the provider injected into the other class. Calling the provider is fast, and it prevents you from running into a closed entity manager.
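
The difference between holding an instance and holding a provider can be shown without any persistence library. This is a minimal sketch under stated assumptions: `java.util.function.Supplier` stands in for an injected Provider<EntityManager>, and `FakeEntityManager`, `FakeUnitOfWork`, `StoresInstance`, `StoresProvider` and `ProviderDemo` are hypothetical classes invented for the illustration.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for an EntityManager: it only knows whether it is open.
final class FakeEntityManager {
    private boolean open = true;
    boolean isOpen() { return open; }
    void close() { open = false; }
}

// Hypothetical stand-in for the unit of work: hands out the current manager
// and replaces it with a fresh one when the unit of work ends.
final class FakeUnitOfWork implements Supplier<FakeEntityManager> {
    private FakeEntityManager current = new FakeEntityManager();

    @Override
    public FakeEntityManager get() { return current; }

    void endAndBeginNext() {
        current.close();
        current = new FakeEntityManager();
    }
}

// Discouraged: keeps the instance it was handed.
final class StoresInstance {
    private final FakeEntityManager em;
    StoresInstance(FakeEntityManager em) { this.em = em; }
    boolean canWork() { return em.isOpen(); }
}

// Recommended: keeps the provider and asks for the manager at the point of use.
final class StoresProvider {
    private final Supplier<FakeEntityManager> provider;
    StoresProvider(Supplier<FakeEntityManager> provider) { this.provider = provider; }
    boolean canWork() { return provider.get().isOpen(); }
}

final class ProviderDemo {
    // Returns { instance holder still usable?, provider holder still usable? }.
    static boolean[] run() {
        FakeUnitOfWork uow = new FakeUnitOfWork();
        StoresInstance holdsInstance = new StoresInstance(uow.get());
        StoresProvider holdsProvider = new StoresProvider(uow);
        uow.endAndBeginNext(); // the first unit of work ends, a new one starts
        return new boolean[] { holdsInstance.canWork(), holdsProvider.canWork() };
    }

    public static void main(String[] args) {
        boolean[] result = run();
        System.out.println(result[0] + " " + result[1]); // prints false true
    }
}
```

The class that stored the instance is left with a closed manager after the first unit of work ends, while the class that stored the provider picks up the current one.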