Problem publishing measurements

25 views
Skip to first unread message

Adam Todorski

unread,
Oct 14, 2012, 3:13:14 PM10/14/12
to reef-...@googlegroups.com
I have an application that publishes measurements into a Reef installation as often as every few seconds and as infrequently as a couple of minutes.  There are between one and a couple of dozen values published per batch and the process is threaded such that multiple batches may be published by different instances of the MeasurementService in different threads simultaneously.  This process has been in production for some time and published millions of batches of measurements without issue, until now:

Error publishing collection to REEF: Couldn't get tree for entity: b9f07316-6de9-4942-a436-093f82d3448f relation: owns depth: 1 - Select based filtering included items that should have been filtered: List(Denied(Entity(ENDPOINT.Equipment),No permission (0) matched entity:read. Assuming deny *))
        at org.totalgrid.reef.services.framework.HasRead$class.performRead(ServiceCrud.scala:84)[78:org.totalgrid.reef.services-framework:0.4.8]
        at org.totalgrid.reef.services.core.EntityService.performRead(EntityService.scala:65)[79:org.totalgrid.reef.services-core:0.4.8]
        at org.totalgrid.reef.services.framework.HasRead$class.read(ServiceCrud.scala:70)[78:org.totalgrid.reef.services-framework:0.4.8]
        at org.totalgrid.reef.services.core.EntityService.read(EntityService.scala:65)[79:org.totalgrid.reef.services-core:0.4.8]
        at org.totalgrid.reef.services.framework.ServiceBehaviors$GetEnabled$$anonfun$get$1.apply(ServiceBehaviors.scala:47)[78:org.totalgrid.reef.services-framework:0.4.8]
        at org.totalgrid.reef.services.framework.ServiceBehaviors$GetEnabled$$anonfun$get$1.apply(ServiceBehaviors.scala:45)[78:org.totalgrid.reef.services-framework:0.4.8]
        at org.totalgrid.reef.services.metrics.DatabaseActionsCounter$$anonfun$transaction$1.apply(DatabaseActionsCounter.scala:31)[78:org.totalgrid.reef.services-framework:0.4.8]
        at org.totalgrid.reef.services.metrics.DatabaseActionsCounter$$anonfun$transaction$1.apply(DatabaseActionsCounter.scala:29)[78:org.totalgrid.reef.services-framework:0.4.8]
        at org.totalgrid.reef.services.framework.RequestContextSourceWithHeaders$$anonfun$transaction$1.apply(RequestContext.scala:113)[78:org.totalgrid.reef.services-framework:0.4.8]
        at org.totalgrid.reef.services.framework.RequestContextSourceWithHeaders$$anonfun$transaction$1.apply(RequestContext.scala:110)[78:org.totalgrid.reef.services-framework:0.4.8]
        at org.totalgrid.reef.services.DependenciesSource$$anonfun$transaction$1.apply(ServiceDependencies.scala:77)[79:org.totalgrid.reef.services-core:0.4.8]
        at org.totalgrid.reef.services.DependenciesSource$$anonfun$transaction$1.apply(ServiceDependencies.scala:77)[79:org.totalgrid.reef.services-core:0.4.8]
        at org.totalgrid.reef.services.framework.ServiceTransactable$$anonfun$1.apply(ServiceTransactable.scala:28)[78:org.totalgrid.reef.services-framework:0.4.8]
        at org.squeryl.dsl.QueryDsl$class._using(QueryDsl.scala:46)[54:org.totalgrid.reef.dependencies:0.4.8]
        at org.squeryl.dsl.QueryDsl$class._executeTransactionWithin(QueryDsl.scala:105)[54:org.totalgrid.reef.dependencies:0.4.8]
        at org.squeryl.dsl.QueryDsl$class.transaction(QueryDsl.scala:60)[54:org.totalgrid.reef.dependencies:0.4.8]
        at org.squeryl.PrimitiveTypeMode$.transaction(PrimitiveTypeMode.scala:40)[54:org.totalgrid.reef.dependencies:0.4.8]
        at org.totalgrid.reef.persistence.squeryl.SessionDbConnection$$anonfun$1.apply(DbConnection.scala:59)[70:org.totalgrid.reef.persistence-api:0.4.8]
        at org.squeryl.dsl.QueryDsl$class._using(QueryDsl.scala:46)[54:org.totalgrid.reef.dependencies:0.4.8]
        at org.squeryl.dsl.QueryDsl$class.using(QueryDsl.scala:38)[54:org.totalgrid.reef.dependencies:0.4.8]
        at org.squeryl.PrimitiveTypeMode$.using(PrimitiveTypeMode.scala:40)[54:org.totalgrid.reef.dependencies:0.4.8]
        at org.totalgrid.reef.persistence.squeryl.SessionDbConnection.transaction(DbConnection.scala:58)[70:org.totalgrid.reef.persistence-api:0.4.8]
        at org.totalgrid.reef.persistence.squeryl.SessionDbConnection.inTransaction(DbConnection.scala:69)[70:org.totalgrid.reef.persistence-api:0.4.8]
        at org.totalgrid.reef.services.framework.ServiceTransactable$.doTransaction(ServiceTransactable.scala:26)[78:org.totalgrid.reef.services-framework:0.4.8]
        at org.totalgrid.reef.services.DependenciesSource.transaction(ServiceDependencies.scala:77)[79:org.totalgrid.reef.services-core:0.4.8]
        at org.totalgrid.reef.services.framework.RequestContextSourceWithHeaders.transaction(RequestContext.scala:110)[78:org.totalgrid.reef.services-framework:0.4.8]
        at org.totalgrid.reef.services.metrics.DatabaseActionsCounter.transaction(DatabaseActionsCounter.scala:29)[78:org.totalgrid.reef.services-framework:0.4.8]
        at org.totalgrid.reef.services.framework.ServiceBehaviors$GetEnabled$class.get(ServiceBehaviors.scala:45)[78:org.totalgrid.reef.services-framework:0.4.8]
        at org.totalgrid.reef.services.core.EntityService.get(EntityService.scala:65)[79:org.totalgrid.reef.services-core:0.4.8]
        at org.totalgrid.reef.services.framework.ServiceBehaviors$GetEnabled$class.getAsync(ServiceBehaviors.scala:52)[78:org.totalgrid.reef.services-framework:0.4.8]
        at org.totalgrid.reef.services.core.EntityService.getAsync(EntityService.scala:65)[79:org.totalgrid.reef.services-core:0.4.8]
        at org.totalgrid.reef.services.framework.AsyncContextRestService$class.respondAsync(ServiceEntryPoint.scala:50)[78:org.totalgrid.reef.services-framework:0.4.8]
        at org.totalgrid.reef.services.core.EntityService.respondAsync(EntityService.scala:65)[79:org.totalgrid.reef.services-core:0.4.8]
        at org.totalgrid.reef.services.metrics.ServiceMetricsInstrumenter.respondAsync(ServiceMetricsInstrumenter.scala:53)[78:org.totalgrid.reef.services-framework:0.4.8]
        at org.totalgrid.reef.services.framework.ServiceMiddleware.handleRequest(ServiceMiddleware.scala:64)[78:org.totalgrid.reef.services-framework:0.4.8]
        at org.totalgrid.reef.services.framework.ServiceMiddleware.respond(ServiceMiddleware.scala:41)[78:org.totalgrid.reef.services-framework:0.4.8]
        at org.totalgrid.reef.client.sapi.service.AsyncService$class.respond(AsyncService.scala:35)[62:org.totalgrid.reef.client:0.4.8]
        at org.totalgrid.reef.services.framework.ServiceMiddleware.respond(ServiceMiddleware.scala:33)[78:org.totalgrid.reef.services-framework:0.4.8]
        at org.totalgrid.reef.client.impl.DefaultServiceBinding$$anon$1$$anonfun$onMessage$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(DefaultServiceBinding.scala:53)[62:org.totalgrid.reef.client:0.4.8]
        at org.totalgrid.reef.client.impl.DefaultServiceBinding.org$totalgrid$reef$client$impl$DefaultServiceBinding$$safely(DefaultServiceBinding.scala:72)[62:org.totalgrid.reef.client:0.4.8]
        at org.totalgrid.reef.client.impl.DefaultServiceBinding$$anon$1$$anonfun$onMessage$1.apply$mcV$sp(DefaultServiceBinding.scala:46)[62:org.totalgrid.reef.client:0.4.8]
        at net.agileautomata.executor4s.impl.FunRun.run(DecoratedExecutor.scala:28)[53:net.agile-automata.executor4s.2.9.0-1:0.1.10]
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)[:1.6.0_26]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)[:1.6.0_26]
        at java.lang.Thread.run(Thread.java:662)[:1.6.0_26]

If I am misusing the service (concurrency issue?) then I would like to know how, if this is a bug then I would like to report it.

-Adam

Daniel Evans

unread,
Oct 16, 2012, 3:05:43 PM10/16/12
to reef-...@googlegroups.com
Adam,

Client objects are designed to be used by a single thread only, as they contain their own state, including the auth token. You can use the spawn() method on the Client interface to copy a logged-in client for use on another thread.

I can't tell for certain if a threading bug is responsible from the information you gave. Have you made updates to your model? Does this error happen repeatedly, or just occasionally, or just once? Does restarting fix it?

--

Daniel Evans

Software Engineer

Green Energy Corp

(919) 836-9916

dev...@greenenergycorp.com

 

Transforming the Smart Grid

www.greenenergycorp.com

 


Adam Todorski

unread,
Oct 16, 2012, 3:15:50 PM10/16/12
to reef-...@googlegroups.com
Thanks Daniel.  I'm not sharing Clients between threads, but I am sharing a connection.  I achieve this using a ThreadLocal like so:

    private static final ThreadLocal<Client> threadLocal = new ThreadLocal<Client>() {
    @Override
    public Client get() {
    try {
return connection.login(user);
} catch (ReefServiceException e) {
throw new RuntimeException(e);
}
    }
    };

Where connection is a Connection object instantiated early in the app's lifecycle and threadLocal.get() is called on demand in each thread doing business.  The intent was to reuse Clients within the same thread but ensure that each thread is using a different Client at any one time.  Is this valid or could this be the source of my problem (ie: should I pin the life cycle of each Connection and Client to each worker thread in my pool)?

This has happened a few times seemingly at random over the course of a great deal of production use of the application, which is why it seems like a threading bug to me.

Thanks,
Adam

Daniel Evans

unread,
Oct 17, 2012, 2:21:36 PM10/17/12
to reef-...@googlegroups.com
It looks like you're not reusing Clients on different threads, though the implementation you pasted would seem to cause a full login (auth request roundtrip) every time you call ThreadLocal.get(). This will have an unnecessary performance cost on both the client and server. You could either login once for every ThreadLocal (by overriding initialValue() instead) or login earlier in the application, extract your auth token, and call the "createClient" method on the Connection interface.

As for the error itself, it appears there's a race condition or some other intermittent failure somewhere between the client Entity request and the server-side auth system. I opened a ticket here:


We'll need to find a way to reproduce it and inspect the state of the system/database. In the meantime, only logging in once (getting a single auth token you then reuse) might help, depending on where the problem actually lies.

Thanks,

Adam Todorski

unread,
Oct 19, 2012, 5:13:17 PM10/19/12
to reef-...@googlegroups.com
Good advice on the Clients coming out of the ThreadLocal, thank you.  Is there any chance that the client will somehow expire or become otherwise unusable in the case of a long running thread that uses the Client infrequently?  Is there a light weight test that can be applied to clients coming out of the TL's get() method to avoid returning a Client that is unusable by the caller?

Thanks for looking at this issue and logging the bug.

--Adam

Adam Todorski

unread,
Oct 21, 2012, 10:42:52 AM10/21/12
to reef-...@googlegroups.com
I have observed a second variation of the same error:

Error publishing collection to REEF: Couldn't get tree for entity: 4fa00b58-f7cc-4e19-8e9b-05f6bf4fa10e relation: owns depth: 1 - key not found: 00000000-0000-0000-0000-000000000000
scala.collection.MapLike$class.default(MapLike.scala:224)
scala.collection.immutable.Map$Map1.default(Map.scala:106)
scala.collection.MapLike$class.apply(MapLike.scala:135)
scala.collection.immutable.Map$Map1.apply(Map.scala:106)
org.totalgrid.reef.models.EntityTreeQuery$QueryNode$$anonfun$7.apply(EntityTreeQuery.scala:183)
org.totalgrid.reef.models.EntityTreeQuery$QueryNode$$anonfun$7.apply(EntityTreeQuery.scala:178)
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:194)
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:194)
scala.collection.Iterator$class.foreach(Iterator.scala:652)
org.squeryl.dsl.AbstractQuery$$anon$1.foreach(AbstractQuery.scala:156)
scala.collection.IterableLike$class.foreach(IterableLike.scala:73)
org.squeryl.dsl.AbstractQuery.foreach(AbstractQuery.scala:27)
scala.collection.TraversableLike$class.map(TraversableLike.scala:194)
org.squeryl.dsl.AbstractQuery.map(AbstractQuery.scala:27)
org.totalgrid.reef.models.EntityTreeQuery$QueryNode.fillChildren(EntityTreeQuery.scala:178)
org.totalgrid.reef.models.EntityTreeQuery$$anonfun$resultsForQuery$1.apply(EntityTreeQuery.scala:240)
org.totalgrid.reef.models.EntityTreeQuery$$anonfun$resultsForQuery$1.apply(EntityTreeQuery.scala:240)
scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:59)
scala.collection.immutable.List.foreach(List.scala:45)
org.totalgrid.reef.models.EntityTreeQuery$.resultsForQuery(EntityTreeQuery.scala:240)
org.totalgrid.reef.models.EntityTreeQuery$.resultsForQuery(EntityTreeQuery.scala:227)
org.totalgrid.reef.models.EntityTreeQuery$.protoTreeQuery(EntityTreeQuery.scala:302)
org.totalgrid.reef.services.core.EntityServiceModel.findRecords(EntityService.scala:148)
org.totalgrid.reef.services.core.EntityServiceModel.findRecords(EntityService.scala:72)
org.totalgrid.reef.services.framework.HasRead$class.performRead(ServiceCrud.scala:75)
org.totalgrid.reef.services.core.EntityService.performRead(EntityService.scala:65)
org.totalgrid.reef.services.framework.HasRead$class.read(ServiceCrud.scala:70)
org.totalgrid.reef.services.core.EntityService.read(EntityService.scala:65)
org.totalgrid.reef.services.framework.ServiceBehaviors$GetEnabled$$anonfun$get$1.apply(ServiceBehaviors.scala:47)
org.totalgrid.reef.services.framework.ServiceBehaviors$GetEnabled$$anonfun$get$1.apply(ServiceBehaviors.scala:45)
org.totalgrid.reef.services.metrics.DatabaseActionsCounter$$anonfun$transaction$1.apply(DatabaseActionsCounter.scala:31)
org.totalgrid.reef.services.metrics.DatabaseActionsCounter$$anonfun$transaction$1.apply(DatabaseActionsCounter.scala:29)
org.totalgrid.reef.services.framework.RequestContextSourceWithHeaders$$anonfun$transaction$1.apply(RequestContext.scala:113)
org.totalgrid.reef.services.framework.RequestContextSourceWithHeaders$$anonfun$transaction$1.apply(RequestContext.scala:110)
org.totalgrid.reef.services.DependenciesSource$$anonfun$transaction$1.apply(ServiceDependencies.scala:77)
org.totalgrid.reef.services.DependenciesSource$$anonfun$transaction$1.apply(ServiceDependencies.scala:77)
org.totalgrid.reef.services.framework.ServiceTransactable$$anonfun$1.apply(ServiceTransactable.scala:28)
org.squeryl.dsl.QueryDsl$class._using(QueryDsl.scala:46)
org.squeryl.dsl.QueryDsl$class._executeTransactionWithin(QueryDsl.scala:105)
org.squeryl.dsl.QueryDsl$class.transaction(QueryDsl.scala:60)
org.squeryl.PrimitiveTypeMode$.transaction(PrimitiveTypeMode.scala:40)
org.totalgrid.reef.persistence.squeryl.SessionDbConnection$$anonfun$1.apply(DbConnection.scala:59)
org.squeryl.dsl.QueryDsl$class._using(QueryDsl.scala:46)
org.squeryl.dsl.QueryDsl$class.using(QueryDsl.scala:38)
org.squeryl.PrimitiveTypeMode$.using(PrimitiveTypeMode.scala:40)
org.totalgrid.reef.persistence.squeryl.SessionDbConnection.transaction(DbConnection.scala:58)
org.totalgrid.reef.persistence.squeryl.SessionDbConnection.inTransaction(DbConnection.scala:69)
org.totalgrid.reef.services.framework.ServiceTransactable$.doTransaction(ServiceTransactable.scala:26)
org.totalgrid.reef.services.DependenciesSource.transaction(ServiceDependencies.scala:77)
org.totalgrid.reef.services.framework.RequestContextSourceWithHeaders.transaction(RequestContext.scala:110)
org.totalgrid.reef.services.metrics.DatabaseActionsCounter.transaction(DatabaseActionsCounter.scala:29)
org.totalgrid.reef.services.framework.ServiceBehaviors$GetEnabled$class.get(ServiceBehaviors.scala:45)
org.totalgrid.reef.services.core.EntityService.get(EntityService.scala:65)
org.totalgrid.reef.services.framework.ServiceBehaviors$GetEnabled$class.getAsync(ServiceBehaviors.scala:52)
org.totalgrid.reef.services.core.EntityService.getAsync(EntityService.scala:65)
org.totalgrid.reef.services.framework.AsyncContextRestService$class.respondAsync(ServiceEntryPoint.scala:50)
org.totalgrid.reef.services.core.EntityService.respondAsync(EntityService.scala:65)
org.totalgrid.reef.services.metrics.ServiceMetricsInstrumenter.respondAsync(ServiceMetricsInstrumenter.scala:53)
org.totalgrid.reef.services.framework.ServiceMiddleware.handleRequest(ServiceMiddleware.scala:64)
org.totalgrid.reef.services.framework.ServiceMiddleware.respond(ServiceMiddleware.scala:41)
org.totalgrid.reef.client.sapi.service.AsyncService$class.respond(AsyncService.scala:35)
org.totalgrid.reef.services.framework.ServiceMiddleware.respond(ServiceMiddleware.scala:33)
org.totalgrid.reef.client.impl.DefaultServiceBinding$$anon$1$$anonfun$onMessage$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(DefaultServiceBinding.scala:53)
org.totalgrid.reef.client.impl.DefaultServiceBinding.org$totalgrid$reef$client$impl$DefaultServiceBinding$$safely(DefaultServiceBinding.scala:72)
org.totalgrid.reef.client.impl.DefaultServiceBinding$$anon$1$$anonfun$onMessage$1.apply$mcV$sp(DefaultServiceBinding.scala:46)
net.agileautomata.executor4s.impl.FunRun.run(DecoratedExecutor.scala:28)
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
java.lang.Thread.run(Thread.java:662)

Daniel Evans

unread,
Oct 22, 2012, 2:45:19 PM10/22/12
to reef-...@googlegroups.com
Adam,

The Client objects themselves can be thought of as really a wrapper around the message headers it contains. The most important header is the auth token, which is what gets created when you call login() on connection and what gets copied when you call createClient() on Connection or spawn() on Client. The bottom line is the Client will keep working as long as the Connection is up and your auth token is good. Auth tokens do have timeouts, and the default is about a month.

The exception you posted seems to indicate there's something wrong with your database model or a bug in the model querying. For reference, what version of Reef are you running?

Thanks,

Adam Todorski

unread,
Oct 22, 2012, 2:51:12 PM10/22/12
to reef-...@googlegroups.com
Daniel,

Thanks for the explanation, I'm running Reef 0.4.8.

Adam
Reply all
Reply to author
Forward
0 new messages