Apparent Deadlock when inserting data


Patrick Hoeffel

Mar 30, 2015, 4:02:39 PM
to orient-...@googlegroups.com

OrientDB 2.0.5 on Windows Server 2012, client connecting from Win 7 Pro

I have a Java class that is inserting a vertex and then adding outbound edges. I'm doing queries to figure out what intermediate data needs to also be created in order for my edges to be built properly, but something is causing a deadlock, and I can only get it to clear by restarting OrientDB. Here is the stack trace from Eclipse when this happens:

If anyone has seen this before, please let me know.

Thanks,

Patrick

Colin

Mar 31, 2015, 12:52:59 PM
to orient-...@googlegroups.com
Hi Patrick,

Are you sharing the same graph db connection among your threads doing the querying?

What's your design look like for writing/reading the database?

Thanks,

-Colin

Orient Technologies

The Company behind OrientDB

Patrick Hoeffel

Mar 31, 2015, 1:16:53 PM
to orient-...@googlegroups.com
Colin,

It's only running on a single thread right now, so everything is running serially and synchronously.  The idea is that I get a JSON data document from a relational database (SQL Server), and that record has a number of FK values as fields.

1. Insert/Update the main record
2. Iterate over the FK fields. For each FK:
    a. Does the destination Class exist?
        If (yes) then does the Edge class exist?
           If (yes) then does the actual vertex instance of the destination class exist?
              If (no) then create vertex for destination end of the edge I'm about to create
              Does the instance of the Edge already exist?
                 If (no) then "CREATE EDGE RELATIONSHIP FROM (srcVertex) TO (destVertex)"
      return;
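
The steps above can be sketched roughly as follows. This is only a minimal illustration, not the code actually in use: the `store` object and all of its methods (`classExists`, `upsertVertex`, `findVertex`, `createVertex`, `edgeExists`, `createEdge`), the `fkFields` mapping, and the edge class names are hypothetical stand-ins for the real queries, and skipping empty FK values is an assumption.

```javascript
// Hypothetical in-memory stand-in for the database, so the flow can run as-is.
function createMemoryStore(classNames) {
  var vertices = {}; // "Class:Id" -> record
  var edges = {};    // "EdgeClass:src->dest" -> true
  function vKey(cls, id) { return cls + ":" + id; }
  function eKey(cls, src, dest) { return cls + ":" + src + "->" + dest; }
  return {
    classExists: function (cls) { return classNames.indexOf(cls) !== -1; },
    upsertVertex: function (cls, record) {
      vertices[vKey(cls, record.Id)] = record;
      return vKey(cls, record.Id);
    },
    findVertex: function (cls, id) {
      return Object.prototype.hasOwnProperty.call(vertices, vKey(cls, id)) ? vKey(cls, id) : null;
    },
    createVertex: function (cls, id) {
      vertices[vKey(cls, id)] = { Id: id };
      return vKey(cls, id);
    },
    edgeExists: function (cls, src, dest) { return edges[eKey(cls, src, dest)] === true; },
    createEdge: function (cls, src, dest) { edges[eKey(cls, src, dest)] = true; },
    vertexCount: function () { return Object.keys(vertices).length; },
    edgeCount: function () { return Object.keys(edges).length; }
  };
}

// fkFields maps a field name to { destClass, edgeClass } (hypothetical shape).
function loadRecord(store, srcClass, record, fkFields) {
  // 1. Insert/update the main record.
  var src = store.upsertVertex(srcClass, record);

  // 2. Iterate over the FK fields.
  Object.keys(fkFields).forEach(function (field) {
    var target = fkFields[field];
    var fkValue = record[field];
    if (fkValue === undefined || fkValue === null) return; // assumption: skip empty FKs
    if (!store.classExists(target.destClass)) return;      // destination class missing
    if (!store.classExists(target.edgeClass)) return;      // edge class missing

    // Create the destination vertex only if it does not exist yet.
    var dest = store.findVertex(target.destClass, fkValue);
    if (dest === null) dest = store.createVertex(target.destClass, fkValue);

    // Create the edge only if it does not exist yet.
    if (!store.edgeExists(target.edgeClass, src, dest)) {
      store.createEdge(target.edgeClass, src, dest);
    }
  });
  return src;
}
```

Written this way, running the same record through `loadRecord` twice should leave the same vertices and edges in place, which is the behavior the existence checks are meant to guarantee.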

The deadlock is occurring on the main record, interestingly.

At first I was using a Transaction so that I could roll back the whole operation in the event of a failure. I changed it to use NoTx(), but it didn't help.

I'm using UPDATE Account CONTENT { <json> } UPSERT WHERE Id = "xxxxx" as the mechanism for inserting/updating the main record. Since that statement only returns a count of the records modified (1), I then go back and immediately SELECT FROM Account WHERE Id = "xxxxx".

I had the best luck when I put a graph.commit() between the UPDATE and the SELECT statements.

Is there a better way to do this? I coded a "graph.addVertex()", but I would still need to pre-check for existence and then UPDATE, because I don't see a single-call mechanism from Java for doing that, other than the UPSERT.

My first implementation was in a JavaScript function which followed the same logic, but it was (and occasionally still is) deadlocking too.

Any/all thoughts are welcome.

Thanks very much,

Patrick


--

---
You received this message because you are subscribed to a topic in the Google Groups "OrientDB" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/orient-database/MMh_SxSdYKQ/unsubscribe.
To unsubscribe from this group and all its topics, send an email to orient-databa...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Patrick Hoeffel

Patrick Hoeffel

Mar 31, 2015, 4:07:28 PM
to orient-...@googlegroups.com
Is there any chance that I'm seeing a side effect of issue #3786, which is fixed in version 2.0.6? It *does* occur during an UPSERT, after which I am adding one or more edges.

The issue is described as, "In case of concurrency, the OConcurrentModificationException should be caught and managed properly by applying changes on loaded record."
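
As an illustration of what "caught and managed" could look like on the caller's side, here is a minimal retry sketch. It is not taken from the OrientDB code base: `withRetry` and `isConcurrentModification` are hypothetical helpers, and matching on the exception name in the error text is an assumption about how the failure surfaces in a script.

```javascript
// Hypothetical check: treats any error whose text mentions the exception
// name as a concurrent-modification failure.
function isConcurrentModification(err) {
  var text = err && err.message ? String(err.message) : String(err);
  return text.indexOf("OConcurrentModificationException") !== -1;
}

// Runs operation(attempt) until it succeeds or maxAttempts (>= 1) is used up.
// The operation is expected to reload the record itself on every attempt.
function withRetry(operation, maxAttempts) {
  var lastError = new Error("maxAttempts must be at least 1");
  for (var attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return operation(attempt);
    } catch (err) {
      if (!isConcurrentModification(err)) {
        throw err; // unrelated failure: do not retry
      }
      lastError = err; // remember the conflict and try again
    }
  }
  throw lastError;
}
```

The operation passed to `withRetry` would load the record, apply the changes, and save, so that each attempt works against the latest version.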

Thanks,

Patrick





Colin

Mar 31, 2015, 5:14:47 PM
to orient-...@googlegroups.com
I believe that just affects the document API.

2.0.6 has just been released.  You might try it and see if the deadlock still occurs.

Please let me know.

-Colin

Emanuel

Apr 1, 2015, 7:55:28 AM
to orient-...@googlegroups.com
Be aware that we have another thread that reads on the same channel to handle push requests. The main thread is parked there, waiting for the async client thread to get the data from the network, recognize that it is not its own data, and give it back to the main thread.

So no deadlock there ;)

Patrick Hoeffel

Apr 2, 2015, 2:53:11 PM
to orient-...@googlegroups.com
Well, the latest on this thread is that things work fine when I am INSERTing data that does not already exist, and it fails when I try to UPDATE data that does exist. I'm using a JavaScript function to do the work, and I'm providing a single string (a JSON data payload as a single record in string format) as an input parameter (param name = "data"). Here is the part of the function that is failing. Perhaps someone can tell me what I'm doing wrong. (By the way, I have a UNIQUE_HASH_INDEX on Id.)

  var cmd;
  var result;
  var dataJson = JSON.parse(data);
  var srcId = dataJson["Id"];

  devlogInfo("srcId=" + srcId);

  try {
    // Before doing anything else, persist the input document
    if (vertexExistsInClass("Account", srcId)) {
      // The vertex already exists, so this becomes an UPDATE operation
      cmd = "UPDATE Account CONTENT " + data + " WHERE Id = '" + srcId + "'";
      devlogInfo(cmd);
      db.begin();
      result = db.command(cmd);   // <== THIS OPERATION PRODUCES WHAT ACTS LIKE AN UNRECOVERABLE DEADLOCK, REQUIRING SERVER SHUTDOWN/RESTART
    } else {
      devlogInfo("CREATED VERTEX, Id = " + srcId);
      db.begin();
      result = db.save(data);    // <== THIS OPERATION WORKS GREAT
    }

    // Proceed with creation of Edges going out from this vertex, including creation of destination vertices if needed.


By replacing that code with the following, I am no longer seeing the deadlock.

    // Before doing anything else, persist the input document
    cmd = "UPDATE Client CONTENT " + data + " UPSERT WHERE Id = '" + srcId + "'";
    result = db.command(cmd);


Now, I have a result object, but because it is an UPDATE command, I can't get the RID out of the response object. Is there an easy way to do that in JavaScript? With the RID, I should be able to query for the RID of the destination vertex and then make a call to determine whether or not an edge between the two nodes exists.

Does anyone have JavaScript code handy that does that efficiently?
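
One possible approach, mirroring the UPDATE-then-SELECT sequence mentioned earlier in this thread, is to read the record back by Id right after the UPSERT and take its RID from that result. The following is only a sketch under assumptions: `upsertAndGetRid` and `assertGuidLike` are hypothetical helpers, and it assumes that `db.query()` returns an array of records exposing a `field()` accessor and that Ids are GUID-like as in the sample data. Both assumptions should be checked against the OrientDB version in use.

```javascript
// Assumption: Ids look like the GUIDs used in this thread. Anything else is
// rejected rather than concatenated into the SQL text.
function assertGuidLike(id) {
  if (!/^[0-9a-fA-F-]{1,64}$/.test(String(id))) {
    throw new Error("Unexpected Id format: " + id);
  }
  return String(id);
}

// Upserts the JSON payload, then reads the record back by Id to learn its RID.
// `db` is the database object available inside the function (assumed API).
function upsertAndGetRid(db, className, data) {
  var id = assertGuidLike(JSON.parse(data)["Id"]);

  db.command("UPDATE " + className + " CONTENT " + data +
             " UPSERT WHERE Id = '" + id + "'");

  var rows = db.query("SELECT @rid AS rid FROM " + className +
                      " WHERE Id = '" + id + "'");

  // Return the RID as a string, or null if the record was not found.
  return rows.length > 0 ? String(rows[0].field("rid")) : null;
}
```

The same lookup could then be repeated for the destination vertex before checking whether an edge between the two RIDs exists.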

Thanks,

Patrick

Patrick Hoeffel

Apr 3, 2015, 6:00:42 PM
to orient-...@googlegroups.com
Can anyone tell me what the OrientDB concurrent access policy looks like? And will this logical flow still work when multiple threads are running through this logic at the same time? Is there a clean and bullet-proof way to guarantee that I will not get a deadlock trying to:
  1. Create vertices?
  2. Resolve vertex Id keys to RIDs?
  3. Test for existence of vertices and edges by name/Id?
  4. Create edges?

Doing this in a function inside the database seemed like a good idea, and when it runs it is pretty fast for all that it does, but it seems to be colliding with itself sometimes, and I'm concerned about it scaling to multiple threads. I am feeding the functions via Mule.

If there is a better way, I would like very much to know it.


Thanks,

Patrick



Patrick Hoeffel

Apr 3, 2015, 6:04:38 PM
to orient-...@googlegroups.com
Also, upgrading to 2.0.6 does not make any difference. Still getting deadlocks, though fewer.

Patrick

Luca Garulli

Apr 4, 2015, 10:11:33 AM
to orient-database
Hi Patrick,
Could you please try the "develop" branch, or 2.1-SNAPSHOT? I know some concurrency problems have been resolved there, and before releasing 2.1-rc1 (next week) I'd like to be sure this kind of problem is resolved.

Lvc@



Patrick Hoeffel

Apr 14, 2015, 3:49:23 PM
to orient-...@googlegroups.com
Hi, Luca,

I pulled and built the "develop" branch from GitHub, and then re-ran my test. Same result. I have one specific record that will lock every time on update. Initial insert works fine (using the "UPDATE Class CONTENT { } UPSERT WHERE Id = 'xxx' " sql command syntax).

I wonder if it could be an issue involving updates across clusters, since the structure of the data is such that class "Base" extends V, and "Account" extends "Base". Base is the class (and therefore the cluster) that contains the Id field, and it is the base class's Id field that is indexed using the unique Hash index. Any chance there could be a collision in coordination across clusters/indexes in that scenario?

Thanks,

Patrick

Patrick Hoeffel

Apr 15, 2015, 1:01:35 AM
to orient-...@googlegroups.com
Update: I don't know if this necessarily "proves" anything, but I removed the inheritance from my Account class so that Account inherits directly from V. I added an Id property (type STRING) directly on the Account class and then created an index (UNIQUE_HASH_INDEX, SBTREE) directly on Account instead of on the superclass.

With these changes in place, I am no longer seeing the phenomenon that looked like a deadlock. This is not an ideal solution, but it is *a* solution, and I'll definitely take it for now.

Patrick

PS - the "develop" branch that I pulled and built was a 2.1 snapshot branch.

Patrick Hoeffel

Apr 15, 2015, 4:54:32 PM
to orient-...@googlegroups.com
I put back the class hierarchy that had been in place previously, and the apparent deadlock behavior returned. Here is the class structure:

CREATE CLASS Base EXTENDS V ABSTRACT
CREATE Property Base.Id STRING
CREATE Property Base.DateCreated DATETIME
CREATE Property Base.DateModified DATETIME

CREATE INDEX Id ON Base (Id collate ci) UNIQUE_HASH_INDEX METADATA {ignoreNullValues : false}

CREATE CLASS BaseMaster EXTENDS Base ABSTRACT
CREATE Property BaseMaster.Name STRING
CREATE Property BaseMaster.Code STRING

CREATE CLASS Account EXTENDS BaseMaster


With the classes created and the index in place, I begin inserting data over HTTP, calling a JavaScript function that actually inserts the record:

http://localhost:2480/function/Demo/loadData_Account/{   "@class":"Account"   , "Id":"11111111-5022-e411-8534-00155d016d32"   , "Timestamp":"13129987427877781504"   , "Company":"22222222-4c22-e411-8534-00155d016d32"   , "Code":"012345"   , "Name":"Some Name"   , "Type":"1"   , "SubsidiaryType":"0"   , "PostingType":"0"   , "Status":"0" }
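
For reference, a URL of this shape can be assembled on the client with the JSON argument percent-encoded, so that braces, quotes, and spaces survive the trip. This is a minimal sketch: `buildFunctionUrl` is a hypothetical helper, and it assumes the server decodes the path segment before handing it to the function.

```javascript
// Builds <base>/function/<database>/<functionName>/<encoded JSON argument>.
function buildFunctionUrl(baseUrl, database, functionName, record) {
  return [
    baseUrl.replace(/\/+$/, ""),                // drop any trailing slash
    "function",
    encodeURIComponent(database),
    encodeURIComponent(functionName),
    encodeURIComponent(JSON.stringify(record))  // JSON payload as one path segment
  ].join("/");
}

// Example with values taken from the request above.
var exampleUrl = buildFunctionUrl("http://localhost:2480", "Demo", "loadData_Account", {
  "@class": "Account",
  "Id": "11111111-5022-e411-8534-00155d016d32",
  "Code": "012345",
  "Name": "Some Name"
});
```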


The JavaScript function generates and calls the SQL command to perform an UPSERT:

UPDATE Account CONTENT {   "@class":"Account"   , "Id":"11111111-5022-e411-8534-00155d016d32"   , "Timestamp":"13129987427877781504"   , "Company":"22222222-4c22-e411-8534-00155d016d32"   , "Code":"012345"   , "Name":"Some Name"   , "Type":"1"   , "SubsidiaryType":"0"   , "PostingType":"0"   , "Status":"0" }
UPSERT WHERE Id = "11111111-5022-e411-8534-00155d016d32"

I have 100 sample records that I use for testing. When the Id does not exist and the operation resolves to an INSERT, then everything is fine. When the Id does exist and the operation resolves to an UPDATE, that is when I see the problem.

Clearly there is something about the way clusters work at the physical storage level that I don't understand. Could the problem be that I created properties and an index on an ABSTRACT class?

Any insights are appreciated.

Patrick