TinkerPop 2.1.0 -- an apology to our supporting vendors.


Marko Rodriguez

May 30, 2012, 9:27:05 AM
to gremli...@googlegroups.com
Hi everyone,

TinkerPop 2 has some great changes in it, but among all those changes there were some hasty moves that led to a suboptimal release. To rectify this unfortunate situation, TinkerPop needs to work on a 2.1.0 release that fixes the following issues. Luckily, for most everyday users, these are non-issues. However, for our vendors, it is important that we fix these sooner rather than later so they can confidently get TinkerPop 2 into their products.

1. A forced transactional semantics that renders Neo4jGraph suboptimal for index retrievals. We plan to support vendors switching on/off "Features" as they see fit so the semantics of their core database can function as need be and thus, not cause any performance issues.

2. A hiding of the constructors of the vendors' Vertex/Edge classes. OrientDB's GREMLIN function and Neo4j's Gremlin Neo4jServer need a way to move easily from their core ODocument/Node to a Blueprints Vertex. TinkerPop wants to hide all such Neo4jVertex, OrientEdge, etc. classes. However, in doing so, we removed the primary means by which the vendors drop into Blueprints and left them with a "hacky" way forward.

3. A faulty release of Rexster-Server, caused by having to release Rexster manually after CentralRepo rejected its .zip file because of its size.

4. Rectify some issues in the TestSuite that still assume old transactional semantics.
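
For readers who want item 1 in concrete terms, here is a minimal Java sketch of vendor-controlled feature flags. Everything in it is an assumption made for illustration: `VendorFeatures`, `wrapReadsInTransaction` and `FeatureAwareIndexReader` are hypothetical names, not the Blueprints API. The only point is that generic code checks a flag the vendor sets instead of forcing transactional semantics on every index read.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical flag holder for this sketch; not the Blueprints Features class.
final class VendorFeatures {
    final boolean wrapReadsInTransaction;

    VendorFeatures(boolean wrapReadsInTransaction) {
        this.wrapReadsInTransaction = wrapReadsInTransaction;
    }
}

// Generic code consults the vendor's flag instead of forcing one semantics.
final class FeatureAwareIndexReader {
    private final VendorFeatures features;

    FeatureAwareIndexReader(VendorFeatures features) {
        this.features = features;
    }

    // Records the steps an index lookup would take; no real database is involved.
    List<String> lookup(String key) {
        List<String> steps = new ArrayList<>();
        if (features.wrapReadsInTransaction) steps.add("begin");
        steps.add("read " + key);
        if (features.wrapReadsInTransaction) steps.add("commit");
        return steps;
    }
}
```

A vendor whose database reads safely outside a transaction would construct `new VendorFeatures(false)` and skip the begin/commit overhead on lookups.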

As you can see, for most users, these are not show stoppers. However, for our vendors, they are.

Just a heads up to the community -- I apologize.

Thanks,
Marko.

http://markorodriguez.com

Luca Garulli

May 30, 2012, 10:29:08 AM
to gremli...@googlegroups.com

Hi,
The ability to turn off transactions has probably also disappeared. This is especially needed for batch imports.

Lvc@

Sent from a touch phone: sorry for this fruit of crappy keyboard..

Marko Rodriguez

May 30, 2012, 6:47:17 PM
to gremli...@googlegroups.com
Hi Luca,

> The ability to turn off transactions has probably also disappeared. This is especially needed for batch imports.

I don't understand what you mean here. Can you explain, please?

Thanks,
Marko.

Luca Garulli

May 31, 2012, 6:53:59 AM
to gremli...@googlegroups.com

Hi,
With the previous release, if I set the buffer size to 0, no transaction was used, which is especially useful for massive insertions.

Lvc@

Sent from a touch phone: sorry for this fruit of crappy keyboard..

Marko Rodriguez

May 31, 2012, 8:40:46 AM
to gremli...@googlegroups.com
Hi,

There is no "transaction buffer" anymore. Thus, it is now as if the buffer size is always "0". So, to do a massive insert, you can either use BatchGraph or do:

g.startTransaction()
// lots of mutations
g.stopTransaction(SUCCESS)

One thing we do have, though, is that if you didn't start your transaction before a mutation, it is automagically started for you.
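
A minimal Java sketch of that auto-start behavior follows. `AutoStartGraph` is a hypothetical stand-in written for this illustration and does not reproduce the Blueprints classes; in particular, the boolean passed to `stopTransaction` is a simplification of the SUCCESS conclusion used above.

```java
// Hypothetical stand-in for a transactional graph; not the Blueprints API.
final class AutoStartGraph {
    private boolean inTransaction = false;
    private int pending = 0;    // mutations made inside the open transaction
    private int committed = 0;  // mutations that survived a successful stop

    void startTransaction() {
        if (inTransaction) throw new IllegalStateException("transaction already open");
        inTransaction = true;
    }

    void addVertex() {
        // The first mutation opens a transaction if the caller did not.
        if (!inTransaction) inTransaction = true;
        pending++;
    }

    void stopTransaction(boolean success) {
        if (success) committed += pending; // on failure the pending work is dropped
        pending = 0;
        inTransaction = false;
    }

    boolean isInTransaction() { return inTransaction; }
    int committedCount() { return committed; }
}
```

Calling `addVertex()` on a fresh `AutoStartGraph` leaves `isInTransaction()` true even though `startTransaction()` was never called, which is the "automagic" part.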

HTH,
Marko.

Luca Garulli

May 31, 2012, 12:01:29 PM
to gremli...@googlegroups.com
Hi Marko,
exactly. This is another undocumented change to the behavior of the API. In OrientDB you can work without transactions to speed everything up. I don't know about Neo4j or other implementations.

Can we agree on a position where 0 or even -1 means "no transaction"? IMHO 0 (zero) is much clearer, because if the user wants one transaction per operation the right value is 1 (one).

WDYT?
Lvc@

Matthias Broecheler

May 31, 2012, 1:13:07 PM
to gremli...@googlegroups.com
Hey Luca,

could this be a database level configuration? It seems that one would like to "disable" transaction for speed improvements - such as when loading a lot of data - but that would be a dedicated script or something. In other words, it seems unlikely that a user would start a transaction without transactional support, then start a transaction with such support in the same method or part of code.
So, when opening an OrientDB graph you could specify that you want it to be non-transactional because it's used for batch loading or whatever. When that is specified, OrientDB would essentially ignore transaction markers.

The benefit of this approach is that you don't extend the programmatic API and make the user have to think about what transactional level s/he wants for a particular transaction that is started. This is akin to relational databases where you specify the transactional isolation level in the config file and not when starting the transaction.
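
To illustrate the proposal, here is a minimal Java sketch of a graph whose transactional mode is fixed when it is opened. `ConfigurableGraph` and its constructor flag are hypothetical and say nothing about how OrientDB actually exposes such a setting. In non-transactional mode the start/stop markers are accepted but ignored, so calling code does not change.

```java
// Hypothetical graph whose transactional mode is chosen once, at open time.
final class ConfigurableGraph {
    private final boolean transactional;
    private int pending = 0; // mutations waiting for a successful stop
    private int stored = 0;  // mutations already written

    ConfigurableGraph(boolean transactional) {
        this.transactional = transactional;
    }

    void startTransaction() {
        // Only a marker in this sketch; nothing to set up in either mode.
    }

    void addVertex() {
        if (transactional) pending++; // held until the transaction ends
        else stored++;                // written straight through
    }

    void stopTransaction(boolean success) {
        if (!transactional) return;   // marker ignored in non-transactional mode
        if (success) stored += pending;
        pending = 0;
    }

    int storedCount() { return stored; }
}
```

The same loading script then runs against `new ConfigurableGraph(false)` for a bulk import and `new ConfigurableGraph(true)` in normal operation, with the difference confined to configuration.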

WDYT?
- Matthias
--
Matthias Broecheler, PhD
http://www.matthiasb.com
E-Mail: m...@matthiasb.com

Peter Neubauer

May 31, 2012, 1:23:28 PM
to gremli...@googlegroups.com
Mmh,
not sure. In Neo4j, you can specify the level of ACIDity per
transaction, which is pretty useful when dealing with small
transactions and deferring the write flush to the OS. It's somewhere
between the BatchInserter and full transactionality; it would be cool
to be able to configure that and even the batch size. I agree that
total non-transactionality might be a bit too much magic per TX and
is better as an upfront setup...

Cheers,

/peter neubauer

G:  neubauer.peter
S:  peter.neubauer
P:  +46 704 106975
L:   http://www.linkedin.com/in/neubauer
T:   @peterneubauer

If you can write, you can code - @coderdojomalmo
If you can sketch, you can use a graph database - @neo4j

Matthias Broecheler

May 31, 2012, 1:40:41 PM
to gremli...@googlegroups.com
Yes, that's a good point, Peter! I agree that there are some configuration settings that make sense on the transaction level. In addition to the acidity level, some people like to specify transactions to be read only to ensure that subsequent code cannot mutate the database but without having to declare the entire database instance as read only.

The question is: how much of that should be general Blueprints and how much should be implementation specific? What I like about the current model is that there is no assumption as to what the transactional characteristics are. All Blueprints is concerned with is that all operations on the database occur within a transaction and that a transaction has a start and an end. And it conveniently starts a transaction with the first operation and shuts down all transactions with shutdown (although the latter is a little dangerous when going from development to production). That makes the interface very easy and clean: start - stop. It is then up to the implementation to give meaning to the word "transaction" - does it mean read isolation, full consistency, or even no transactional support at all?

When you start configuring transactions, it seems that you would be making some assumption about how transactions are handled by the implementation, i.e. it becomes vendor specific pretty quickly. So, rather than having support for individual transaction configuration on the blueprints level, you could have:

Neo4jGraph g = new ...
g.startTransaction(Neo4jTransactionConfig config)

or some other special method in the particular implementation. 
WDYT?

Luca Garulli

May 31, 2012, 2:56:52 PM
to gremli...@googlegroups.com
Hi,
the implementations that implement the TransactionalGraph interface are transactional, right? This means that those that don't implement it have no transactions.

Well, what about using OrientDB without transactions? WDYT about adding a new method to the TransactionalGraph interface that tells the implementation the same thing you would tell an RDBMS?

TransactionalGraph.setAutoCommit( boolean )

Lvc@

Marko Rodriguez

May 31, 2012, 3:36:26 PM
to gremli...@googlegroups.com
Hey,

> exactly. This is another undocumented change to the behavior of the API.

We discuss Blueprints 2 transactions here:


> In OrientDB you can work without transactions to speed everything up. I don't know about Neo4j or other implementations.
> Can we agree on a position where 0 or even -1 means "no transaction"? IMHO 0 (zero) is much clearer, because if the user wants one transaction per operation the right value is 1 (one).

So when we had transaction buffer in Blueprints 1.x, you were doing something more "low level" in OrientDB that was "no transaction" beyond the auto-commit work of OrientGraph?

Thanks,
Marko.

Marko Rodriguez

May 31, 2012, 3:38:17 PM
to gremli...@googlegroups.com
Hi,

How did OrientGraph 1.x deal with it?

Second, we could have an OrientNonTransactionalGraph that simply implements Graph and KeyIndexableGraph. Or, we can introduce BufferGraph which is the notion of "transaction buffer" which is ultimately what was in Blueprints 1.x.
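
The first option can be sketched in Java as follows. The three interfaces are simplified stand-ins named after the Blueprints interfaces mentioned above, with method sets invented for this sketch; `NonTransactionalGraphSketch` and `ModeProbe` are hypothetical as well. The idea shown is only that a graph which does not implement TransactionalGraph advertises "no transactions" through its type.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified stand-ins; the real Blueprints interfaces have different methods.
interface Graph {
    void addVertex(String id);
    int vertexCount();
}

interface KeyIndexableGraph extends Graph {
    void createKeyIndex(String key);
}

interface TransactionalGraph extends Graph {
    void stopTransaction(boolean success);
}

// Implements Graph and KeyIndexableGraph only, so it exposes no transaction methods.
final class NonTransactionalGraphSketch implements KeyIndexableGraph {
    private final Set<String> vertices = new HashSet<>();
    private final Set<String> indexedKeys = new HashSet<>();

    public void addVertex(String id) { vertices.add(id); } // written immediately
    public int vertexCount() { return vertices.size(); }
    public void createKeyIndex(String key) { indexedKeys.add(key); }
}

final class ModeProbe {
    // Callers can discover the mode from the type alone.
    static String describe(Graph g) {
        return (g instanceof TransactionalGraph) ? "transactional" : "non-transactional";
    }
}
```

The trade-off compared with a configuration flag is that the choice is visible at compile time, at the cost of one more class per vendor.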

Thanks Luca,
Marko.

Luca Garulli

May 31, 2012, 4:01:26 PM
to gremli...@googlegroups.com
On 31 May 2012 21:36, Marko Rodriguez <okram...@gmail.com> wrote:
> Hey,
> > exactly. This is another undocumented change to the behavior of the API.
> We discuss Blueprints 2 transactions here:


OK, the wiki is aligned, but I don't remember a thread on this ML about this change, or I probably missed it.
 
> > In OrientDB you can work without transactions to speed everything up. I don't know about Neo4j or other implementations.
> > Can we agree on a position where 0 or even -1 means "no transaction"? IMHO 0 (zero) is much clearer, because if the user wants one transaction per operation the right value is 1 (one).
>
> So when we had transaction buffer in Blueprints 1.x, you were doing something more "low level" in OrientDB that was "no transaction" beyond the auto-commit work of OrientGraph?

Probably, but the semantics of txBuffer = 0 were not explicit before.

Lvc@

 

Luca Garulli

May 31, 2012, 4:02:34 PM
to gremli...@googlegroups.com
On 31 May 2012 21:38, Marko Rodriguez <okram...@gmail.com> wrote:
> Hi,
>
> How did OrientGraph 1.x deal with it?
>
> Second, we could have an OrientNonTransactionalGraph that simply implements Graph and KeyIndexableGraph.

It's ok for me.
 
> Or, we can introduce BufferGraph which is the notion of "transaction buffer" which is ultimately what was in Blueprints 1.x.

Isn't it similar to BatchGraph?

Lvc@

Marko Rodriguez

May 31, 2012, 4:09:51 PM
to gremli...@googlegroups.com
Hi,

> > Or, we can introduce BufferGraph which is the notion of "transaction buffer" which is ultimately what was in Blueprints 1.x.
>
> Isn't it similar to BatchGraph?

There was a BufferGraph, but it was removed at the last minute. BufferGraph was a "wrapper" that implemented the behavior in Blueprints 1.x. That behavior being:

g.setMaxBufferSize(1000);
while (...) {
    g.addVertex();
}
        
Here, the transaction is committed every 1000 mutations. Moreover, with setMaxBufferSize(1), every mutation is committed as its own transaction. Finally, setMaxBufferSize(0) in Blueprints 1.x is equivalent to what TransactionalGraph is now:

g.startTransaction()
// do lots of stuffs and make sure you don't blow your memory!
g.stopTransaction(SUCCESS)
// woo hoo! you didn't blow memory :)
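
The three buffer settings described above can be modeled in a few lines of Java. This is a sketch under stated assumptions, not the removed BufferGraph: `BufferSketch` is a hypothetical class that only counts commits, and it follows the semantics given in this message (N > 0 commits every N mutations, 0 commits only on an explicit stop).

```java
// Hypothetical model of the Blueprints 1.x buffer semantics described above.
final class BufferSketch {
    private final int maxBufferSize;
    private int buffered = 0;           // mutations not yet committed
    private int commits = 0;            // number of commits performed
    private int committedMutations = 0; // mutations covered by those commits

    BufferSketch(int maxBufferSize) {
        if (maxBufferSize < 0) throw new IllegalArgumentException("buffer size must be >= 0");
        this.maxBufferSize = maxBufferSize;
    }

    void addVertex() {
        buffered++;
        // Size 0 never auto-commits; the caller must stop the transaction.
        if (maxBufferSize > 0 && buffered >= maxBufferSize) commit();
    }

    // Explicit end of transaction: flush whatever is still buffered.
    void stopTransaction() {
        if (buffered > 0) commit();
    }

    private void commit() {
        committedMutations += buffered;
        buffered = 0;
        commits++;
    }

    int commits() { return commits; }
    int buffered() { return buffered; }
    int committedMutations() { return committedMutations; }
}
```

With a buffer size of 1000, 2500 calls to `addVertex()` trigger two commits and leave 500 mutations buffered until `stopTransaction()` is called; with a buffer size of 0, nothing is committed until that explicit stop.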

Is BufferGraph what you are interested in? That is what gives you behavior identical to the Blueprints 1.x TransactionalGraph.

Thoughts?
Marko.

Luca Garulli

May 31, 2012, 5:05:18 PM
to gremli...@googlegroups.com
Yes,
that is OK for me as long as buffer = 0 means NO TX. Is that OK?

Lvc@