OrientDB first step-by-step tutorial


Luca Garulli

Dec 16, 2010, 5:02:16 AM
to orient-database
Hi,
the first two parts of the step-by-step tutorial on using OrientDB are online, in Italian:

Guida all'uso di OrientDB: introduzione al mondo NoSQL (Guide to using OrientDB: an introduction to the NoSQL world) -> http://goo.gl/crVlu

Guida all'uso di OrientDB: primo utilizzo (Guide to using OrientDB: first use) -> http://goo.gl/Me4qC

New parts will be published every 5-10 days.

Any volunteers for the English translation?

Lvc@

Raffaele Guidi

Jan 3, 2011, 5:45:14 PM
to orient-...@googlegroups.com
Here it is. Now you have to write more ;)
OrientDB tutorial - first steps.doc

Luca Garulli

Jan 3, 2011, 6:09:08 PM
to orient-database
Hi Raffaele,
thank you! I think the best thing would be to publish it on DZone, Javalobby or somewhere similar. WDYT?

Lvc@

Raffaele Guidi

Jan 4, 2011, 4:26:51 AM
to orient-...@googlegroups.com
I would suggest publishing it on your own site and then writing a short article for DZone, and also for InfoQ and theserverside.com (my favourite). I would be happy to write it, but I think a more complete tutorial (something like "your first app with OrientDB") with sample code would get more traction. I was looking into a NoSQL back end to plug in as a 3rd-level storage for my own project, DirectMemory (https://github.com/raffaeleguidi/DirectMemory/wiki), when I found your "request for contribution". You give me some advice on integrating it and I help you with the article? Deal? ;)

Raffaele Guidi

Jan 4, 2011, 9:03:25 AM
to orient-...@googlegroups.com
Oh, by the way, I just need a hint on how to write a simple key/value store where the value is a byte array and a couple of queries for eviction and count of items, nothing too engaging.

Ciao,
   R

tomas fufa

Jan 4, 2011, 11:38:06 AM
to orient-...@googlegroups.com
Hello Raffaele,
I read your last mail and I went to check it... I was looking for something like BigMemory before "getting in touch" with OrientDB... and I also thought about the same thing: why don't we create an open source "BigMemory"?... hehehe

Actually I started from this site: http://www.kdgregory.com/index.php?page=java.byteBuffer
And the nice tutorial: http://www.kdgregory.com/programming/java/ByteBuffer_JUG_Presentation.pdf

Just in case it's useful for you!
I would help you with the coding itself, but lately I'm crazy busy trying to learn OrientDB ;) (Thanx for the doc!)

BTW: the first thing I did was looking for "ByteBuffer" occurrences in Luca's code...
* OBase64Utils.java (6 matches)
* OFileClassic.java (19 matches)
* OFileMMap.java (3 matches)
* OMMapBufferEntry.java (3 matches)

...another thing: I'm Italian too, even though I live in Spain: it's nice to see how many of us are putting effort into this NoSQL thing :) (I'm also thinking of Salvatore Sanfilippo with his redis.io)

Raffaele Guidi

Jan 4, 2011, 2:38:33 PM
to orient-...@googlegroups.com
Well, thanks for your interest. I already knew those papers (if you look, they are referenced in the project's wiki). I started from there and took a slightly different approach: allocating memory in large pages (in the GB range) and then "slicing" them into small pieces. This can of course lead to fragmentation, but I think the performance improvement is well worth the risk: I am managing to keep put and get operations (even with 20-30 concurrent threads) in the range of tenths of a millisecond. But, getting back to OrientDB, if you are learning it you can lend a hand writing a small sample application! :) Keep up the good Italian job!!!
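
The page-and-slice idea described above can be sketched in plain Java NIO. This is only a minimal illustration, not DirectMemory's actual code: PageSlicer, allocate and remaining are hypothetical names, the allocator never frees slices (so the fragmentation issue mentioned above is not handled), and the page is kept small for the example.

```java
import java.nio.ByteBuffer;

/** Hypothetical bump allocator: one large off-heap page handed out in slices. */
final class PageSlicer {
    private final ByteBuffer page;

    PageSlicer(int pageSizeBytes) {
        // allocateDirect reserves memory outside the Java heap.
        this.page = ByteBuffer.allocateDirect(pageSizeBytes);
    }

    /** Returns a slice of exactly size bytes backed by the page, or null if it does not fit. */
    synchronized ByteBuffer allocate(int size) {
        if (size < 0 || size > page.remaining()) {
            return null;
        }
        ByteBuffer view = page.duplicate();      // shares memory, independent position/limit
        view.limit(view.position() + size);
        ByteBuffer slice = view.slice();         // covers [position, position + size)
        page.position(page.position() + size);   // bump the allocation pointer
        return slice;
    }

    /** Bytes still available in the page. */
    synchronized int remaining() {
        return page.remaining();
    }
}
```

Note that a single ByteBuffer can hold at most Integer.MAX_VALUE bytes, so a cache spanning many GB would need several such pages.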

Ciao,
   R

Luca Garulli

Jan 4, 2011, 6:21:32 PM
to orient-database
Hi,
are you talking about something like the Terracotta product? In that case it could be feasible to use OrientDB as the underlying big memory, since it already maps POJOs to disk using memory mapping. The format used to write the POJOs is not directly binary, though, but a sort of compressed JSON. This is because of the schema-less feature.

Why use OrientDB as the base technology for a BigMemory-like product instead of implementing an ad-hoc product? Maybe because:
  • OrientDB has a lot of code to manipulate byte buffers efficiently
  • You can execute queries against objects
  • there was something else, but I can't remember :-)
Ciao,
Lvc@

Tomas Espeleta

Jan 4, 2011, 6:37:15 PM
to OrientDB
Hello Luca,
maybe it's a bit off-topic, but I have a question related to this: is
there some kind of asymmetry between how POJOs are stored, and
OGraphVertex are stored? I've just found examples of how to deal with
POJOs in ObjectDatabase... but I have the impression that vertexes
are, on the contrary, just k/v containers?

Can I extend OGraphVertex, register its class, and have its POJO
properties treated as vertex properties?

If not, it would be interesting! :) it could be an object/graph db OR
a document/graph db, and the user could choose between these two
uses....

Tomas Espeleta

Jan 4, 2011, 6:53:18 PM
to OrientDB
> Well, thanks for your interest, I already knew those papers (if you notice
> they are referenced in the project's wiki)
Ooops ;) I see, you're working hard on it!

> But, getting back to
> OrientDB, if you are learning it you can give a hand to write a small sample
> application! :) Keep up with the good italian job!!!

Also a FAQ would be interesting. Things like "how to truncate a db"
that are present in this forum....

I'll try to "be present" and learn fast. Right now I'm just a newbie,
but I believe in this project!! It's a great idea, indeed. If I have
time to go deeper into the code and be something more than a newbie
I'll try also to share and collaborate a bit

Luca Garulli

Jan 4, 2011, 6:55:22 PM
to orient-database
Hi,
in reality every high-level structure, like Object and Graph, is stored as Documents. You can work with POJOs (through the ODatabaseObjectTx class) and then query them as documents, and vice versa. The same goes for Graph elements: they're mapped as Documents as well. So you could create some POJOs that bind the vertexes and edges simply by naming the POJO properties after the graph elements (outEdges, inEdges, in, out).

How can it be so fast if graphs are mapped as documents? Because OrientDB documents treat links as direct connections, like graphs do. This is the reason why OrientDB is categorized as a Document-Graph DBMS.
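
The property-naming convention described above can be sketched with plain Java classes. This is only an illustration under assumptions: PersonVertex, KnowsEdge and linkTo are hypothetical names, no OrientDB API is used or registered here, and the direction convention (out is the source vertex, in is the target) is assumed rather than taken from the thread.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical vertex POJO: its edge collections use the graph property names. */
final class PersonVertex {
    String name;
    Set<KnowsEdge> outEdges = new HashSet<>(); // edges leaving this vertex
    Set<KnowsEdge> inEdges = new HashSet<>();  // edges arriving at this vertex

    PersonVertex(String name) {
        this.name = name;
    }

    /** Creates an edge from this vertex to target and registers it on both ends. */
    KnowsEdge linkTo(PersonVertex target) {
        KnowsEdge edge = new KnowsEdge(this, target);
        outEdges.add(edge);
        target.inEdges.add(edge);
        return edge;
    }
}

/** Hypothetical edge POJO: its endpoints use the graph property names. */
final class KnowsEdge {
    PersonVertex out; // assumed: the vertex the edge starts from
    PersonVertex in;  // assumed: the vertex the edge points to

    KnowsEdge(PersonVertex out, PersonVertex in) {
        this.out = out;
        this.in = in;
    }
}
```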

Lvc@

Tomas Espeleta

Jan 4, 2011, 8:06:32 PM
to OrientDB
Thanks a lot!! Very clear!

> So you could create some POJOs that
> bind the vertexes and edges just calling the POJO properties as the graph
> elements (outEdges, inEdges, in, out).

I had understood the layered architecture, and that both are based on
documents underneath, but this was the brick I missed: it was not
clear to me that there was a compatible mapping object-document and
vertex-document.
...just name the properties in the same way the ODatabaseGraphTx
expects them... that's simple and brilliant!! ;)

Raffaele P. Guidi

Jan 5, 2011, 5:08:27 AM
to orient-...@googlegroups.com
Yes, DirectMemory is (aims to be?) an alternative to Terracotta BigMemory (tm). I know OrientDB is very efficient at managing and serializing objects, and that's why I would like to integrate it into DirectMemory. Although I started by writing the BigMemory stuff, DirectMemory is growing into a kind of cache abstraction layer (implementing cache-specific semantics and logic, which OrientDB doesn't offer) over pluggable storage engines (the BigMemory-like stuff is now one storage engine).

The cache itself manages the first (in-heap) layer, which is of course the fastest and acts as a buffer for the other layers; eviction (also pluggable, of both expired and over-quota items, where quotas can be specified for every layer); performance monitoring; etc. DirectMemory uses pluggable serializers (I have two as of today: one based on standard serialization and one based on protostuff-runtime, which is more efficient and doesn't require objects to implement Serializable), and I have also written a (simple and experimental) disk storage engine. The next steps should be implementing a NoSQL back end (OrientDB would be just perfect, but I was also thinking about Voldemort) and a network distribution aspect (of course OrientDB could also be a solution to this).

In any case: did you manage to test OrientDB with 10-20 GB of RAM? That's the real challenge for BigMemory (tm), and for DirectMemory as well, of course ;)

Luca Garulli

Jan 5, 2011, 1:53:39 PM
to orient-database
You already did a lot of work on it!

Maybe another interesting feature is the fetch plan. Sometimes you don't need to load an entire tree/graph of objects, but only some of them, leaving the others to be loaded lazily. OrientDB offers Fetch Plans for this: http://code.google.com/p/orient/wiki/FetchingStrategies

Lvc@

Raffaele Guidi

Jan 5, 2011, 2:20:51 PM
to orient-...@googlegroups.com
>> You already did a lot of work on it!

I try to keep myself busy ;)

>> another interesting feature is the fetch plan

Uhm, I'm not sure how this could apply to a cache... can you explain further? My only fetching strategy is "keep in the heap what has been most recently added or used and push other stuff down the chain into slower but larger storages (off-heap->disk->network, etc...)", and the only improvement I can think of is using "frequently" instead of "recently".
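
The "keep the most recently used in the heap, push the rest down the chain" strategy described above can be sketched with the JDK's LinkedHashMap in access order. This is a minimal sketch, not DirectMemory's implementation: LruLayer and nextLayer are hypothetical names, the next layer is just another Map standing in for a slower storage, and the class is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-heap layer: evicts the least recently used entry to a slower layer. */
final class LruLayer<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;
    private final Map<K, V> nextLayer;

    LruLayer(int maxEntries, Map<K, V> nextLayer) {
        super(16, 0.75f, true); // true = iterate in access order, oldest access first
        this.maxEntries = maxEntries;
        this.nextLayer = nextLayer;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        if (size() > maxEntries) {
            // Push the evicted entry down the chain before it is dropped from the heap.
            nextLayer.put(eldest.getKey(), eldest.getValue());
            return true;
        }
        return false;
    }
}
```

Switching from "recently" to "frequently" would mean replacing the access-order list with a per-entry hit counter, which this sketch does not attempt.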

Getting back to OrientDB integration - would this code work? And is it the right (more performant) way to store a byte array?
ODocument entry = new ODocument(db, "CacheEntry");
byte [] buffer = //omitted
entry.field( "buffer", buffer );
entry.field( "key", myKey ); //...etc...
entry.save();
Thanks,
    R

Luca Garulli

Jan 7, 2011, 11:06:52 AM
to orient-database
On 5 January 2011 20:20, Raffaele Guidi <raffaele...@gmail.com> wrote:

>> another interesting feature is the fetch plan

> Uhm, I'm not sure how this could apply to a cache... can you explain further? My only fetching strategy is "keep in the heap what has been most recently added or used and push other stuff down the chain into slower but larger storages (off-heap->disk->network, etc...)", and the only improvement I can think of is using "frequently" instead of "recently".

In a remote scenario it could be useful to get all the objects of interest in one shot. Example: you need 100 objects, and to avoid calling the server 100 times to obtain them you could use a fetch plan to say "please give me this one and all the connected ones up to the 5th level" in a single call.
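
The "this one and everything connected up to the Nth level" idea can be sketched as a depth-limited, breadth-first collection over linked records. This is only a conceptual sketch in plain Java, not OrientDB's fetch-plan syntax or API: LinkedRecord and DepthLimitedFetch are hypothetical names, and the whole graph lives in memory.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical record that links directly to other records. */
final class LinkedRecord {
    final String id;
    final List<LinkedRecord> links = new ArrayList<>();

    LinkedRecord(String id) {
        this.id = id;
    }
}

final class DepthLimitedFetch {
    /** Collects root plus every record reachable within maxDepth link hops. */
    static Set<LinkedRecord> fetch(LinkedRecord root, int maxDepth) {
        Set<LinkedRecord> seen = new LinkedHashSet<>();
        seen.add(root);
        List<LinkedRecord> frontier = new ArrayList<>();
        frontier.add(root);
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            List<LinkedRecord> next = new ArrayList<>();
            for (LinkedRecord record : frontier) {
                for (LinkedRecord linked : record.links) {
                    if (seen.add(linked)) { // skip records already collected (handles cycles)
                        next.add(linked);
                    }
                }
            }
            frontier = next;
        }
        return seen;
    }
}
```

A server answering such a request would return the whole collected set in one response instead of one record per round trip.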
 

> Getting back to OrientDB integration - would this code work? And is it the right (more performant) way to store a byte array?
> ODocument entry = new ODocument(db, "CacheEntry");
> byte [] buffer = //omitted
> entry.field( "buffer", buffer );
> entry.field( "key", myKey ); //...etc...
> entry.save();
There is something rawer than ODocument: ORecordFlat and ORecordBytes. I suggest you use the second one, since it takes a byte[] directly, but you can't execute queries against it. It could be useful in conjunction with Documents: Documents keep the queryable metadata and RecordBytes the raw content.
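
The split between queryable metadata and raw content can be illustrated without any OrientDB classes. The sketch below is an assumption-laden stand-in: SplitStore, Meta, count and evictExpired are hypothetical names, two in-memory maps play the roles of the Documents and the byte records, and the expiry field is invented to show a count query and an eviction query running against the metadata only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical store: small queryable metadata kept apart from opaque byte[] payloads. */
final class SplitStore {
    /** Queryable description of one entry (stands in for a document). */
    static final class Meta {
        final String key;
        final int size;
        final long expiresAtMillis;

        Meta(String key, int size, long expiresAtMillis) {
            this.key = key;
            this.size = size;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Meta> metadata = new HashMap<>();
    private final Map<String, byte[]> blobs = new HashMap<>(); // stands in for raw byte records

    void put(String key, byte[] value, long expiresAtMillis) {
        metadata.put(key, new Meta(key, value.length, expiresAtMillis));
        blobs.put(key, value.clone());
    }

    byte[] get(String key) {
        byte[] value = blobs.get(key);
        return value == null ? null : value.clone();
    }

    /** Count query: answered from the metadata alone. */
    int count() {
        return metadata.size();
    }

    /** Eviction query: finds expired entries via metadata, then drops both parts. */
    List<String> evictExpired(long nowMillis) {
        List<String> evicted = new ArrayList<>();
        for (Meta meta : metadata.values()) {
            if (meta.expiresAtMillis <= nowMillis) {
                evicted.add(meta.key);
            }
        }
        for (String key : evicted) {
            metadata.remove(key);
            blobs.remove(key);
        }
        return evicted;
    }
}
```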
 
> Thanks,
>     R

Lvc@

Raffaele P. Guidi

Jan 7, 2011, 12:23:19 PM
to orient-...@googlegroups.com
>> in a remote scenario could be useful to have all the interested objects in one shoot

Oh, I see. I'll take this into account when I implement the network storage, although entries are, at the moment, atomic and not related to one another. It could be a future enhancement.

>> [...] ORecordFlat and ORecordBytes. I suggest you to use the second one since eats directly byte[] but you can't execute query against it

Perfect, that's exactly what I was looking for. I'll let you know if integration progresses.

Meanwhile, feel free to give me a shout if you need translation work ;)

Ciao,
   R

Raffaele Guidi

Jan 9, 2011, 11:36:43 AM
to OrientDB
I'm sorry, but I have some problems with ORecordBytes. The following
snippet:

db = new ODatabaseBinary("local:c:\\temp\\data\\db");
ORecordBytes recBytes = new ORecordBytes(db, buffer);
recBytes.save();

doesn't write the entry to the db and leaves recBytes in a stale state
(with identity isNew and !isValid), and I couldn't find any examples or
documentation about it.

Can you give me a hint?

Thanks,
R