Get connection problem after Orient Server restart

409 views
Skip to first unread message

Fabrizio Fortino

unread,
Aug 8, 2012, 5:42:14 AM8/8/12
to orient-...@googlegroups.com
Hi guys,

My application runs on JBoss AS 7.1.1 and talks with a remote OrientDB Server 1.1.0.
To acquire a connection I use

db0 = ODatabaseDocumentPool.global().acquire(dbName, username, password);

The CLIENT_CHANNEL_MAX_POOL value is 20.

If I restart the OrientDB Server the get connections become very slow. Looking at the client log I see a lot of:

WARNING [com.orientechnologies.orient.client.remote.OStorageRemote] (http--10.42.43.11-8080-12) Connection re-acquired transparently after 503ms and 1 retries: no errors will be thrown at application level

Moreover sometimes I have the following exceptions:

com.orientechnologies.orient.core.exception.OStorageException: Cannot create a connection to remote server address(es): [orient.lan:2424]
at com.orientechnologies.orient.client.remote.OStorageRemote.openRemoteDatabase(OStorageRemote.java:1306) [orientdb-client-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.client.remote.OStorageRemote.open(OStorageRemote.java:174) [orientdb-client-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.client.remote.OStorageRemoteThread.open(OStorageRemoteThread.java:64) [orientdb-client-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.raw.ODatabaseRaw.open(ODatabaseRaw.java:88) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.open(ODatabaseWrapperAbstract.java:47) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.record.ODatabaseRecordAbstract.open(ODatabaseRecordAbstract.java:108) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.open(ODatabaseWrapperAbstract.java:47) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTxPooled.<init>(ODatabaseDocumentTxPooled.java:42) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentPool.createResource(ODatabaseDocumentPool.java:39) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentPool.createResource(ODatabaseDocumentPool.java:20) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.ODatabasePoolBase$1.createNewResource(ODatabasePoolBase.java:67) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.ODatabasePoolBase$1.createNewResource(ODatabasePoolBase.java:56) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.common.concur.resource.OResourcePool.getResource(OResourcePool.java:66) [orient-commons-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.ODatabasePoolAbstract.acquire(ODatabasePoolAbstract.java:64) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.ODatabasePoolAbstract.acquire(ODatabasePoolAbstract.java:52) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.ODatabasePoolBase.acquire(ODatabasePoolBase.java:114) [orientdb-core-1.1.0.jar:1.1.0]

AND

com.orientechnologies.orient.core.exception.ORecordNotFoundException: The record with id '#2:0' not found
at com.orientechnologies.orient.core.record.ORecordAbstract.reload(ORecordAbstract.java:264) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.record.impl.ODocument.reload(ODocument.java:362) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.type.ODocumentWrapper.reload(ODocumentWrapper.java:82) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.type.ODocumentWrapperNoClass.reload(ODocumentWrapperNoClass.java:67) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared$1.call(OSchemaShared.java:389) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared$1.call(OSchemaShared.java:384) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.storage.OStorageAbstract.callInLock(OStorageAbstract.java:162) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.client.remote.OStorageRemoteThread.callInLock(OStorageRemoteThread.java:355) [orientdb-client-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.reload(OSchemaShared.java:384) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.metadata.schema.OSchemaProxy.reload(OSchemaProxy.java:140) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.metadata.OMetadata.reload(OMetadata.java:136) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTxPooled.reuse(ODatabaseDocumentTxPooled.java:48) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.ODatabasePoolBase$1.reuseResource(ODatabasePoolBase.java:72) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.ODatabasePoolBase$1.reuseResource(ODatabasePoolBase.java:56) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.common.concur.resource.OResourcePool.getResource(OResourcePool.java:56) [orient-commons-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.ODatabasePoolAbstract.acquire(ODatabasePoolAbstract.java:64) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.ODatabasePoolAbstract.acquire(ODatabasePoolAbstract.java:52) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.ODatabasePoolBase.acquire(ODatabasePoolBase.java:114) [orientdb-core-1.1.0.jar:1.1.0]

Any idea?

Thanks,
Fabrizio

Luca Garulli

unread,
Aug 8, 2012, 5:48:15 AM8/8/12
to orient-...@googlegroups.com
Hi Fabrizio,
is that exception on client or server side/log? After a restart of your client app everything works as expected?

Lvc@


--
 
 
 

Fabrizio Fortino

unread,
Aug 8, 2012, 5:55:44 AM8/8/12
to orient-...@googlegroups.com
Hi Luca,

The exceptions are on client side.
Right, after a client restart everything works well.

Fabrizio

Fabrizio Fortino

unread,
Aug 8, 2012, 5:58:07 AM8/8/12
to orient-...@googlegroups.com
Hi Luca,

The exceptions are on the client side.
Right, after a client app restart everything works well.

Fabrizio

On Wednesday, August 8, 2012 11:48:15 AM UTC+2, Lvc@ wrote:

Luca Garulli

unread,
Aug 8, 2012, 5:59:41 AM8/8/12
to orient-...@googlegroups.com
Ok,
can you write here the full stack trace? It's not clear if the record #2:0 not found is the cause or viceversa.

Lvc@

--
 
 
 

Fabrizio Fortino

unread,
Aug 8, 2012, 6:10:06 AM8/8/12
to orient-...@googlegroups.com
Luca,

Please notice that the 2 exceptions are not related between each other.

Here is the full stack trace of the second:

Caused by: com.orientechnologies.orient.core.exception.ORecordNotFoundException: The record with id '#2:0' not found
at com.orientechnologies.orient.core.record.ORecordAbstract.reload(ORecordAbstract.java:264) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.record.impl.ODocument.reload(ODocument.java:362) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.type.ODocumentWrapper.reload(ODocumentWrapper.java:82) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.type.ODocumentWrapperNoClass.reload(ODocumentWrapperNoClass.java:67) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared$1.call(OSchemaShared.java:389) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared$1.call(OSchemaShared.java:384) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.storage.OStorageAbstract.callInLock(OStorageAbstract.java:162) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.client.remote.OStorageRemoteThread.callInLock(OStorageRemoteThread.java:355) [orientdb-client-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.metadata.schema.OSchemaShared.reload(OSchemaShared.java:384) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.metadata.schema.OSchemaProxy.reload(OSchemaProxy.java:140) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.metadata.OMetadata.reload(OMetadata.java:136) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTxPooled.reuse(ODatabaseDocumentTxPooled.java:48) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.ODatabasePoolBase$1.reuseResource(ODatabasePoolBase.java:72) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.ODatabasePoolBase$1.reuseResource(ODatabasePoolBase.java:56) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.common.concur.resource.OResourcePool.getResource(OResourcePool.java:56) [orient-commons-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.ODatabasePoolAbstract.acquire(ODatabasePoolAbstract.java:64) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.ODatabasePoolAbstract.acquire(ODatabasePoolAbstract.java:52) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.ODatabasePoolBase.acquire(ODatabasePoolBase.java:114) [orientdb-core-1.1.0.jar:1.1.0]
at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source) [:1.6.0_24]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.6.0_24]
at java.lang.reflect.Method.invoke(Method.java:616) [rt.jar:1.6.0_24]
at org.jboss.as.ee.component.ManagedReferenceMethodInterceptorFactory$ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptorFactory.java:72) [jboss-as-ee-7.1.1.Final.jar:7.1.1.Final]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation-1.1.1.Final.jar:1.1.1.Final]
at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:374) [jboss-invocation-1.1.1.Final.jar:1.1.1.Final]
at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.doMethodInterception(Jsr299BindingsInterceptor.java:127) [jboss-as-weld-7.1.1.Final.jar:7.1.1.Final]
at org.jboss.as.weld.ejb.Jsr299BindingsInterceptor.processInvocation(Jsr299BindingsInterceptor.java:135) [jboss-as-weld-7.1.1.Final.jar:7.1.1.Final]
at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:36) [jboss-as-ee-7.1.1.Final.jar:7.1.1.Final]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation-1.1.1.Final.jar:1.1.1.Final]
at org.jboss.invocation.WeavedInterceptor.processInvocation(WeavedInterceptor.java:53) [jboss-invocation-1.1.1.Final.jar:1.1.1.Final]
at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:36) [jboss-as-ee-7.1.1.Final.jar:7.1.1.Final]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation-1.1.1.Final.jar:1.1.1.Final]
at org.jboss.as.jpa.interceptor.SBInvocationInterceptor.processInvocation(SBInvocationInterceptor.java:47) [jboss-as-jpa-7.1.1.Final.jar:7.1.1.Final]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation-1.1.1.Final.jar:1.1.1.Final]
at org.jboss.as.weld.ejb.EjbRequestScopeActivationInterceptor.processInvocation(EjbRequestScopeActivationInterceptor.java:82) [jboss-as-weld-7.1.1.Final.jar:7.1.1.Final]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation-1.1.1.Final.jar:1.1.1.Final]
at org.jboss.invocation.InitialInterceptor.processInvocation(InitialInterceptor.java:21) [jboss-invocation-1.1.1.Final.jar:1.1.1.Final]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation-1.1.1.Final.jar:1.1.1.Final]
at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation-1.1.1.Final.jar:1.1.1.Final]
at org.jboss.as.ee.component.interceptors.ComponentDispatcherInterceptor.processInvocation(ComponentDispatcherInterceptor.java:53) [jboss-as-ee-7.1.1.Final.jar:7.1.1.Final]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation-1.1.1.Final.jar:1.1.1.Final]
at org.jboss.as.ejb3.component.pool.PooledInstanceInterceptor.processInvocation(PooledInstanceInterceptor.java:51) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final]
at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation-1.1.1.Final.jar:1.1.1.Final]
at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInOurTx(CMTTxInterceptor.java:228) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final]
... 45 more
Caused by: com.orientechnologies.orient.core.exception.ODatabaseException: Error on retrieving record #2:0 (cluster: default)
at com.orientechnologies.orient.core.db.raw.ODatabaseRaw.read(ODatabaseRaw.java:234) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.record.ODatabaseRecordAbstract.executeReadRecord(ODatabaseRecordAbstract.java:566) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.record.ODatabaseRecordAbstract.reload(ODatabaseRecordAbstract.java:244) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.record.ODatabaseRecordAbstract.reload(ODatabaseRecordAbstract.java:69) [orientdb-core-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.record.ORecordAbstract.reload(ORecordAbstract.java:257) [orientdb-core-1.1.0.jar:1.1.0]
... 92 more
Caused by: com.orientechnologies.orient.core.exception.ODatabaseException: Connection is closed
at com.orientechnologies.orient.client.remote.OStorageRemote.checkConnection(OStorageRemote.java:1423) [orientdb-client-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.client.remote.OStorageRemote.readRecord(OStorageRemote.java:388) [orientdb-client-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.client.remote.OStorageRemoteThread.readRecord(OStorageRemoteThread.java:145) [orientdb-client-1.1.0.jar:1.1.0]
at com.orientechnologies.orient.core.db.raw.ODatabaseRaw.read(ODatabaseRaw.java:228) [orientdb-core-1.1.0.jar:1.1.0]
... 96 more

Luca Garulli

unread,
Aug 8, 2012, 6:57:23 AM8/8/12
to orient-...@googlegroups.com
By reading this stack the main exception "The record with id '#2:0' not found" is caused by "com.orientechnologies.orient.core.exception.ODatabaseException: Connection is closed".

So the problem is that it cannot open the connection. Now we've to find what is the reason...

Lvc@

--
 
 
 

Fabrizio Fortino

unread,
Aug 8, 2012, 10:22:09 AM8/8/12
to orient-...@googlegroups.com
Hi Luca,

I think it would not be so easy to find the reason since the exception is not thrown systematically.

I would pay more attention on the message

WARNING [com.orientechnologies.orient.client.remote.OStorageRemote] (http--10.42.43.11-8080-12) Connection re-acquired transparently after 503ms and 1 retries: no errors will be thrown at application level

Is that normal to see so many messages with this warning? Every time the Orient Server is restarted the client app starts to print those warnings without stopping. The client application works but it becomes very slow. Sometimes the re-acquire connection takes more than 20 seconds.

Cheers,
Fabrizio

Luca Garulli

unread,
Aug 8, 2012, 11:13:05 AM8/8/12
to orient-...@googlegroups.com
Hi Fabrizio,
unless you've network problems this happens when operations take more time than the configured timeouts.

What kind of operations are you doing?

Lvc@

--
 
 
 

Fabrizio Fortino

unread,
Aug 8, 2012, 12:07:31 PM8/8/12
to orient-...@googlegroups.com
Hi Luca,

The weird thing is that I don't have any network problem. The operations are simple queries that take few ms to complete.
The messages appear only after a restart of the Orient Server, never before.

As stated before, I use the connection pool (ODatabaseDocumentPool.global().acquire(dbName, username, password);)
Couldn't it be that the connections in the pool are not replaced with the re-acquired one after the server restart?

Fabrizio

Luca Garulli

unread,
Aug 8, 2012, 12:32:35 PM8/8/12
to orient-...@googlegroups.com
Hi,
in theory all is designed to reconnect automatically. If this doesn't happen it's a bug. But I would need a test case to fix in short time.

Lvc@

--
 
 
 

Ford Guo

unread,
Aug 9, 2012, 4:02:35 AM8/9/12
to orient-...@googlegroups.com
I got such log:
2012-8-9 15:51:53 com.orientechnologies.common.log.OLogManager log
警告(Warn): Connection re-acquired transparently after 500ms and 1 retries: no errors will be thrown at application level

I think it should be related with re-connect

-Ford

Luca Garulli

unread,
Aug 9, 2012, 4:16:16 AM8/9/12
to orient-...@googlegroups.com
Yes,
you can ignore it but if happens too often probably you have to increase timeouts.

Lvc@

--
 
 
 

Fabrizio Fortino

unread,
Oct 17, 2012, 6:32:21 AM10/17/12
to orient-...@googlegroups.com
Hi Luca,

I still have troubles with this behaviour when the server restarts. It is not easy for me to create a test case but I think I have found the way to reproduce it using the Orient console executing the following steps.

  1. Start the OrientDB Server (test against release 1.2.0)
  2. Open the OrientDB Console
  3. Connect to the demo db using the remote protocol
    • orientdb> connect remote:localhost/demo admin admin
    • Connecting to database [remote:localhost/demo] with user 'admin'...OK
  4. Execute a simple query
    • orientdb> select from actor

    • ---+---------+--------------------+--------------------
    •   #| RID     |name                |out                 
    • ---+---------+--------------------+--------------------
    •   0|    #91:0|Tom Cruise          |[4]                 
    •   1|    #91:1|Nicol Kidman        |null                |[1]                 
    •   2|    #91:2|Meg Ryan            |[2]                 |[1]                 
    • ---+---------+--------------------+--------------------+--------------------

    • 3 item(s) found. Query executed in 0.017 sec(s).
  5. Restarts the server
  6. Execute the query at step 4
    • orientdb> select from actor
    • Oct 17, 2012 12:22:13 PM com.orientechnologies.common.log.OLogManager log
    • WARNING: Connection re-acquired transparently after 1014ms and 2 retries: no errors will be thrown at application level

    • ---+---------+--------------------+--------------------
    •   #| RID     |name                |out                 
    • ---+---------+--------------------+--------------------
    •   0|    #91:0|Tom Cruise          |[4]                 
    •   1|    #91:1|Nicol Kidman        |null                |[1]                 
    •   2|    #91:2|Meg Ryan            |[2]                 |[1]                 
    • ---+---------+--------------------+--------------------+--------------------

    • 3 item(s) found. Query executed in 1.023 sec(s).
  7. Execute more queries
From now on, every query execution has to re-acquire the connection (I thought it should happen only at step 6, after the server restart). As you can see from the execution time the query at step 6 (and all the other subsequent queries) takes more than 1 second against 0.017 of the original one.

Is this the expected behaviour?

Cheers,
Fabrizio

Luca Garulli

unread,
Oct 17, 2012, 6:54:07 AM10/17/12
to orient-...@googlegroups.com
Hi,
I've executed your test and the connection is re-acquired only once:

$ ./console.sh
OrientDB console v.1.3.0-SNAPSHOT (build @BUILD@) www.orientechnologies.com
Type 'help' to display all the commands supported.

Installing extensions for GREMLIN language v.2.2.0-SNAPSHOT

orientdb> connect remote:localhost/tinkerpop admin admin
Connecting to database [remote:localhost/tinkerpop] with user 'admin'...OK

orientdb> set limit -1
orientdb>  select count(*) from v

---+---------+--------------------
  #| RID     |count
---+---------+--------------------
  0|         |814
---+---------+--------------------

1 item(s) found. Query executed in 1.487 sec(s).

orientdb>  select count(*) from v

---+---------+--------------------
  #| RID     |count
---+---------+--------------------
  0|         |814
---+---------+--------------------

1 item(s) found. Query executed in 0.143 sec(s).

-- RESTARTED THE SERVER --

orientdb>  select count(*) from v
ott 17, 2012 12:48:54 PM com.orientechnologies.common.log.OLogManager log
WARNING: Connection re-acquired transparently after 887ms and 1 retries: no errors will be thrown at application level

---+---------+--------------------
  #| RID     |count
---+---------+--------------------
  0|         |814
---+---------+--------------------

1 item(s) found. Query executed in 2.121 sec(s).

orientdb>  select count(*) from v

---+---------+--------------------
  #| RID     |count
---+---------+--------------------
  0|         |814
---+---------+--------------------

1 item(s) found. Query executed in 0.151 sec(s).

orientdb>  select count(*) from v

---+---------+--------------------
  #| RID     |count
---+---------+--------------------
  0|         |814
---+---------+--------------------

1 item(s) found. Query executed in 0.151 sec(s).

orientdb>

I'm using 1.3.0-SNAPSHOT but nothing has changed on this. Can you execute multiple times the same query after the restart of server to show times and log messages?

Lvc@


--
 
 
 

Fabrizio Fortino

unread,
Oct 17, 2012, 7:09:19 AM10/17/12
to orient-...@googlegroups.com
Hi Luca,

Here is the console output. In this case I got 2 connection re-acquired. From the third one the WARNING is not shown anymore.

OrientDB console v.1.2.0 (build 12659) www.orientechnologies.com
Type 'help' to display all the commands supported.

orientdb> connect remote:localhost/demo admin admin                   
Connecting to database [remote:localhost/demo] with user 'admin'...OK

orientdb> select from actor       

---+---------+--------------------+--------------------
  #| RID     |name                |out                 
---+---------+--------------------+--------------------
  0|    #91:0|Tom Cruise          |[4]                 
  1|    #91:1|Nicol Kidman        |null                |[1]                 
  2|    #91:2|Meg Ryan            |[2]                 |[1]                 
---+---------+--------------------+--------------------+--------------------

3 item(s) found. Query executed in 0.015 sec(s).

orientdb> select from actor
Oct 17, 2012 1:05:38 PM com.orientechnologies.common.log.OLogManager log
WARNING: Connection re-acquired transparently after 911ms and 1 retries: no errors will be thrown at application level

---+---------+--------------------+--------------------
  #| RID     |name                |out                 
---+---------+--------------------+--------------------
  0|    #91:0|Tom Cruise          |[4]                 
  1|    #91:1|Nicol Kidman        |null                |[1]                 
  2|    #91:2|Meg Ryan            |[2]                 |[1]                 
---+---------+--------------------+--------------------+--------------------

3 item(s) found. Query executed in 0.931 sec(s).

orientdb> select from actor
Oct 17, 2012 1:05:42 PM com.orientechnologies.common.log.OLogManager log
WARNING: Connection re-acquired transparently after 1010ms and 2 retries: no errors will be thrown at application level

---+---------+--------------------+--------------------
  #| RID     |name                |out                 
---+---------+--------------------+--------------------
  0|    #91:0|Tom Cruise          |[4]                 
  1|    #91:1|Nicol Kidman        |null                |[1]                 
  2|    #91:2|Meg Ryan            |[2]                 |[1]                 
---+---------+--------------------+--------------------+--------------------

3 item(s) found. Query executed in 1.02 sec(s).

orientdb> select from actor

---+---------+--------------------+--------------------
  #| RID     |name                |out                 
---+---------+--------------------+--------------------
  0|    #91:0|Tom Cruise          |[4]                 
  1|    #91:1|Nicol Kidman        |null                |[1]                 
  2|    #91:2|Meg Ryan            |[2]                 |[1]                 
---+---------+--------------------+--------------------+--------------------

3 item(s) found. Query executed in 0.0080 sec(s).

Luca Garulli

unread,
Oct 17, 2012, 7:33:05 AM10/17/12
to orient-...@googlegroups.com
Hi,
probably you used 2 connections of the pool. The important is that once re-acquired is not lost again.

Lvc@

--
 
 
 

Fabrizio Fortino

unread,
Oct 17, 2012, 8:56:27 AM10/17/12
to orient-...@googlegroups.com
Hi Luca,

Attached a test class for the connection problem. It basically starts 30 threads that run forever and execute queries against the demo db (CLIENT_CHANNEL_MAX_POOL = 20). Restart the server in the middle of the test execution to see if the reacquire connection works (the step to execute the test are listed in the comments of the class).

What I have noticed is:
  • - if the restart takes a long time (lets say 10 seconds) the reacquire connection mechanism works
  • - if the restart is almost immediate the reacquire does not work producing several
    • ERROR 17-10 14:35:44,937 - unable to execute query: Cannot acquire lock on requested resource: remote:localhost/demo

Moreover, when the Server restarts sometimes I see the following print in the server log:
2012-10-17 14:53:36:391 INFO Received shutdown command from the remote client /127.0.0.1:53636 [ONetworkProtocolBinary]

Hope it helps,
Fabrizio
_ReacquireConnectionTest.java

Peter Berkman

unread,
Jun 29, 2013, 3:49:39 PM6/29/13
to orient-...@googlegroups.com
I am trying to load a flat-file with 150k unique records....
 
Same problem here, but I can reproduce with only 1 client thread connected and NO server restart.
 
1) using Java only (no console).
 
2) using "remote:localhost/myDB"
 
3) add a custom vertex type
 
4) add about 15 edge types
 
5) add about 150k vertexes.
 
6) add 150k edges back-and-forth
 
7) add 150k different edgest back-and-forth
 
8) add about 1000 varous back-and-forth edgest
 
9) during about the 3rd or 4th itteration of #8, your client will start getting these warnings:
 
Jun 29, 2013 12:33:56 PM com.orientechnologies.common.log.OLogManager log
WARNING
: Connection re-acquired transparently after 500ms and 1 retries: no errors will be thrown at application level

 
and, it will re-acquire the connection after EVER Edge creation after that - waiting 500ms each time....
 
I can send you the code for all of this, but the data has restrictions so, I cannot...
 
this is using OrientDB 1.4.2-SNAPSHOT and Tinkerpop 2.4.0-SNAPSHOT.
 

Peter Berkman

unread,
Jun 30, 2013, 5:44:53 PM6/30/13
to orient-...@googlegroups.com
some further information that may help.  I let it continue even with the client:
 
Jun 30, 2013 2:38:10 PM com.orientechnologies.common.log.OLogManager log
WARNING
: Connection re-acquired transparently after 515ms and 1 retries: no errors will be thrown at application level

warnings and after about 10 hours, the server threw this error (non-fatal):
 
Error on unload the tree: index #1:4
null
-> java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100)
-> java.util.TreeMap$ValueIterator.next(TreeMap.java:1145)
-> com.orientechnologies.orient.core.type.tree.OMVRBTreePersistent.unload(OMVRBTreePersistent.java:251)
-> com.orientechnologies.orient.core.index.OIndexMVRBTreeAbstract.onClose(OIndexMVRBTreeAbstract.java:921)
-> com.orientechnologies.orient.core.db.raw.ODatabaseRaw.callOnCloseListeners(ODatabaseRaw.java:755)
-> com.orientechnologies.orient.core.db.raw.ODatabaseRaw.close(ODatabaseRaw.java:535)
-> com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.close(ODatabaseWrapperAbstract.java:69)
-> com.orientechnologies.orient.core.db.record.ODatabaseRecordAbstract.close(ODatabaseRecordAbstract.java:236)
-> com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.close(ODatabaseWrapperAbstract.java:69)
-> com.orientechnologies.orient.server.OClientConnection.close(OClientConnection.java:46)
-> com.orientechnologies.orient.server.OClientConnectionManager$1.run(OClientConnectionManager.java:72)
-> java.util.TimerThread.mainLoop(Timer.java:512)
-> java.util.TimerThread.run(Timer.java:462)com.orientechnologies.orient.core.exception.OStorageException: Error on unload the tree: index #1:4
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at com.orientechnologies.common.log.OLogManager.error(OLogManager.java:120)
        at com.orientechnologies.orient.core.type.tree.OMVRBTreePersistent.unload(OMVRBTreePersistent.java:270)
        at com.orientechnologies.orient.core.index.OIndexMVRBTreeAbstract.onClose(OIndexMVRBTreeAbstract.java:921)
        at com.orientechnologies.orient.core.db.raw.ODatabaseRaw.callOnCloseListeners(ODatabaseRaw.java:755)
        at com.orientechnologies.orient.core.db.raw.ODatabaseRaw.close(ODatabaseRaw.java:535)
        at com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.close(ODatabaseWrapperAbstract.java:69)
        at com.orientechnologies.orient.core.db.record.ODatabaseRecordAbstract.close(ODatabaseRecordAbstract.java:236)
        at com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.close(ODatabaseWrapperAbstract.java:69)
        at com.orientechnologies.orient.server.OClientConnection.close(OClientConnection.java:46)
        at com.orientechnologies.orient.server.OClientConnectionManager$1.run(OClientConnectionManager.java:72)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)
Caused by: java.util.ConcurrentModificationException
        at java.util.TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1100)
        at java.util.TreeMap$ValueIterator.next(TreeMap.java:1145)
        at com.orientechnologies.orient.core.type.tree.OMVRBTreePersistent.unload(OMVRBTreePersistent.java:251)
        ... 10 more
Exception in thread "Timer-0" java.lang.NullPointerException
        at com.orientechnologies.common.log.OLogManager.log(OLogManager.java:39)
        at com.orientechnologies.common.log.OLogManager.debug(OLogManager.java:55)
        at com.orientechnologies.orient.server.OClientConnectionManager$1.run(OClientConnectionManager.java:69)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)
com.orientechnologies.common.concur.OTimeoutException: Timeout on acquiring exclusive lock against resource: /127.0.0.1:59456
        at com.orientechnologies.common.concur.resource.OSharedResourceTimeout.acquireExclusiveLock(OSharedResourceTimeout.java:67)

 

Luca Garulli

unread,
Jul 1, 2013, 9:45:52 AM7/1/13
to orient-database
Hi,
I've fixed a bug in concurrency and pushed in hotfix-1.4.2 branch. About the disconnection problem maybe the timeouts are too low for your queries? Try to enlarge network.socketTimeout to more than 10 seconds (default is 10000 ms).

But this means you're executing commands that take more than 10 seconds? What query do you execute between 3 and 4? Do you look for vertex by your ID that is not indexed?

Lvc@



 

--
 
---
You received this message because you are subscribed to the Google Groups "OrientDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to orient-databa...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Peter Berkman

unread,
Jul 1, 2013, 11:16:56 AM7/1/13
to orient-...@googlegroups.com
Luca,
thanks - I'll try the patch.
just to be clear, this timeout occurs during some long commits, as well as, a creation of an index - both of which can take more than 10 seconds. I don't think the timeouts are working because I tried "0" and "220000" - no change in behavior.

Peter Berkman

unread,
Jul 1, 2013, 11:19:09 AM7/1/13
to orient-...@googlegroups.com
sorry, once the errors start happening, they never stop -- these happen during an insert of an Edge - takes less than 20 MS.
Reply all
Reply to author
Forward
0 new messages