Slideshow on GraphDBs, Tinkerpop and OrientDB

Riccardo Tasso

Nov 9, 2015, 4:45:24 AM
to orient-...@googlegroups.com, gremli...@googlegroups.com
Hi,
  I just want to let you know that last week I gave a presentation on graph databases, the TinkerPop family and OrientDB.

You can find the slides online at: http://raymanrt.github.io/graphdbs4jug
You can also find the sources and (within a few days) the code examples at: https://github.com/raymanrt/graphdbs4jug

Feedback and contributions of any kind are welcome.

Cheers,
   Riccardo

scott molinari

Nov 9, 2015, 6:53:10 AM
to OrientDB, gremli...@googlegroups.com
Hi Riccardo,

Nice stuff. I'm still going through the slides, but I just noticed that the second link on http://raymanrt.github.io/graphdbs4jug/#/2/1 also points to the MongoDB blog post.

Scott  

Riccardo Tasso

Nov 9, 2015, 7:00:50 AM
to orient-...@googlegroups.com
Thanks Scott,
   this was fixed.

Riccardo

--

---
You received this message because you are subscribed to the Google Groups "OrientDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to orient-databa...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

scott molinari

Nov 9, 2015, 7:04:43 AM
to OrientDB, gremli...@googlegroups.com
Got a question. 

Would it also be possible to store the pizza ingredients as properties on the pizza vertices and still keep the querying power shown in your example? Or is it really smarter to make the ingredients their own vertices? How does one decide between the two (properties versus additional vertices)?

Scott

Riccardo Tasso

Nov 9, 2015, 7:13:08 AM
to orient-...@googlegroups.com
You're probably right, Scott; it really depends on what your application has to do with those "ingredients".

They would benefit from being treated as vertices if they have more properties, or if they are linked to other vertices... but hey, it's just there to explain some of the queries!

Cheers,
   Riccardo
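
The trade-off discussed above can be sketched in a few lines of plain Python, with dictionaries and a list of pairs standing in for the graph. This is only an illustration under assumptions: the pizza and ingredient names, the `vegan` property, the `has_ingredient` edge list and all function names are hypothetical, and none of it comes from the slides or from the OrientDB or TinkerPop APIs.

```python
# Toy data only: all names below are hypothetical, not from the slides.

# Option 1: ingredients stored as a list property on each pizza vertex.
pizzas_with_properties = {
    "margherita": {"ingredients": ["tomato", "mozzarella", "basil"]},
    "marinara": {"ingredients": ["tomato", "garlic", "oregano"]},
    "bianca": {"ingredients": ["mozzarella", "rosemary"]},
}


def pizzas_containing_property(ingredient):
    """Scan every pizza and test its ingredients property."""
    return sorted(
        name
        for name, props in pizzas_with_properties.items()
        if ingredient in props["ingredients"]
    )


# Option 2: each ingredient is its own vertex and can carry properties.
ingredient_vertices = {
    "tomato": {"vegan": True},
    "mozzarella": {"vegan": False},
    "basil": {"vegan": True},
    "garlic": {"vegan": True},
    "oregano": {"vegan": True},
    "rosemary": {"vegan": True},
}

# Edges (pizza, ingredient), standing in for a has_ingredient edge class.
has_ingredient = [
    (pizza, ingredient)
    for pizza, props in pizzas_with_properties.items()
    for ingredient in props["ingredients"]
]


def pizzas_containing_vertex(ingredient):
    """Collect the pizzas at the other end of the ingredient's edges."""
    return sorted(p for p, i in has_ingredient if i == ingredient)


def vegan_pizzas():
    """Use a property that is stored once, on the ingredient vertex."""
    pizzas = {p for p, _ in has_ingredient}
    return sorted(
        p
        for p in pizzas
        if all(ingredient_vertices[i]["vegan"] for q, i in has_ingredient if q == p)
    )
```

Both options answer "which pizzas contain tomato?" equally well. The vertex model starts to pay off once an ingredient carries data of its own (here `vegan`) or needs links to other vertices, which matches the rule of thumb given in the reply above.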

scott molinari

Nov 9, 2015, 7:39:47 AM
to orient-...@googlegroups.com
Ok. Just wanted to be sure. 

Also, do you agree with the point Lukas Eder was making in his article "Stop Claiming that you’re Using a Schemaless Database"?

I tend to disagree on the "schema is a good thing that you always want to have" argument. He is ignoring the schema migration and code refactoring issues present with SQL. This blog post series (also from MongoDB) makes the issue very, very clear. https://www.mongodb.com/blog/post/mongodb-vs-sql-day-1-2 

I agree, on the other hand, that there always needs to be a schema. The only difference is whether the code determines the correctness of the schema or the database imposes it. A schema imposed by the database has to be kept up to date with the code, which means "versioning" the schema, which becomes an additional hassle for agile development. The fact that ORMs are built to make objects match RDBMSs just goes to show the real issue. It amazes me how controversial the discussions about using ORMs are. Everyone who argues about the usefulness of ORMs hasn't seen the light of storing and retrieving object state at will and what it means to the simplicity of programming.

Scott   

normanLinux

Nov 16, 2015, 1:15:04 PM
to OrientDB
Schema is important.  But the real killer is the lack of normalisation in databases such as Mongo.  

This is not a problem in a graph database like Orient because the edges can make direct links to normalised fields. In a database like Mongo you are likely to encounter all of the problems that normalisation solves. It is common for people to mistype field contents. When these are significant fields, such as customer name, you can believe that you are correctly accessing your information but end up either missing records or including spurious, unrelated records.

scott molinari

Nov 17, 2015, 2:00:22 AM
to orient-...@googlegroups.com
Like I said, schema always has to be controlled in some way. Mongo forces schema to be in the code, which I think is an advantage, because it puts the storage of state and all its complexities in the background.

Scott
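
As a rough illustration of "schema in the code", here is a minimal Python sketch in which the application, not the database, decides what a valid record looks like. The `Customer` class, its fields and `to_document` are hypothetical names chosen for the example; no MongoDB or OrientDB driver is involved, and the resulting dict is simply what would be handed to a schemaless store.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Customer:
    """Hypothetical record type: the class definition is the schema."""

    name: str
    email: str

    def __post_init__(self):
        # Validation lives in application code, not in the database.
        if not self.name.strip():
            raise ValueError("name must not be empty")
        if "@" not in self.email:
            raise ValueError("email must contain '@'")


def to_document(customer):
    """Turn a validated object into a plain dict ready to be stored."""
    return asdict(customer)
```

An invalid record never reaches the store because constructing it raises `ValueError`; the flip side is that every program writing to the same data has to share this code.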

normanLinux

Nov 17, 2015, 5:28:07 AM
to OrientDB
It can be quite an advantage for relatively trivial projects. Imagine, however, that you have a project with several programmers working on disparate aspects of your application. Team A decides to add new fields for their requirements, Team B adds different fields. Neither is aware of the changes made by the other and each could miss out on potentially useful data as they don't have the full picture - to say nothing of records with one set of extra fields, others with both sets and yet others with neither.
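
The drift described here is easy to reproduce and to detect. The sketch below is an assumption-laden illustration in plain Python rather than any database API: the records and the field names `loyalty_tier` and `referrer` are hypothetical. It groups documents by the set of fields they carry.

```python
from collections import Counter

# Hypothetical documents written by different parts of one application.
records = [
    {"name": "Ada"},                                             # neither team's fields
    {"name": "Bob", "loyalty_tier": "gold"},                     # Team A's field only
    {"name": "Cy", "referrer": "newsletter"},                    # Team B's field only
    {"name": "Di", "loyalty_tier": "silver", "referrer": "ad"},  # both sets
]


def field_shapes(docs):
    """Count how many documents share each distinct set of field names."""
    return Counter(frozenset(doc) for doc in docs)
```

Four records yield four different shapes here, which is exactly the "one set, both sets, neither" mix mentioned above.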

But the biggest counterargument to most NoSQL databases is still the lack of normalisation, which means data getting out of sync.

I started my computing career in 1968 and have always heard - and often seen in practice - the two statements:

Whenever data is duplicated it can get out of sync.
If data can get out of sync it will get out of sync.

Normalisation is not, as some seem to think, about saving disk space.  It is about data integrity.  Without normalisation you cannot ensure data integrity. 

How do you ensure, for example, that you have retrieved all of the sales records for a particular product if people have entered the product name differently at different times? Often in more than one incompatible way? Or, for example, have entered the country name in many ways? E.g. the country Ireland can legitimately be called Ireland, Republic of Ireland, Eire and even (for some rather older people) Southern Ireland.

This is not, of course, a problem with a database such as Orient where you would have a vertex per product (or country) with edges linking to them - perfect normalisation and much faster than foreign key links in the collection of indexed files that are laughingly referred to as a 'relational database'.

(In an earlier incarnation I worked with the GE/Honeywell IDS database, which bore strong similarities to the new generation of graph databases. This was referred to at the time as a relational database. When I then encountered the more common database types my first reaction was "That's not a database! It's a collection of indexed files!")
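
The Ireland example can be sketched in plain Python to show what the vertex-per-country model buys. Everything here is a hypothetical stand-in: the sale ids, the `IE`/`FR` keys and the `sold_in` edge list are invented for the illustration and do not use the OrientDB API.

```python
# Denormalised: every sale record carries its own free-text country name.
sales_denormalised = [
    {"id": 1, "country": "Ireland"},
    {"id": 2, "country": "Republic of Ireland"},
    {"id": 3, "country": "Eire"},
    {"id": 4, "country": "France"},
]


def sales_by_country_text(name):
    """Match on the stored text; other spellings are silently missed."""
    return [s["id"] for s in sales_denormalised if s["country"] == name]


# Normalised: one vertex per country, and each sale links to it by an edge.
country_vertices = {"IE": {"name": "Ireland"}, "FR": {"name": "France"}}
sold_in = [(1, "IE"), (2, "IE"), (3, "IE"), (4, "FR")]  # (sale id, country key)


def sales_by_country_vertex(country_key):
    """Collect every sale whose edge points at the given country vertex."""
    return [sale for sale, key in sold_in if key == country_key]
```

Searching the denormalised records for "Ireland" finds only sale 1, while following the edges to the single `IE` vertex finds sales 1, 2 and 3.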

Ata Annamamedov

Nov 17, 2015, 6:59:49 AM
to OrientDB
@normanLinux +1

scott molinari

Nov 17, 2015, 8:53:27 AM
to OrientDB
> How do you ensure, for example, that you have retrieved all of the sales records for a particular product if people have entered the product name differently at different times? Often in more than one incompatible way? Or, for example, have entered the country name in many ways? E.g. the country Ireland can legitimately be called Ireland, Republic of Ireland, Eire and even (for some rather older people) Southern Ireland

If you ask me, these kinds of problems aren't solved with data normalization and only indirectly with schema. It is more a matter of application/business logic and of specifying and enforcing standardized data entry.
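
One way to picture "enforcing standardized data entry" in application logic is a small canonicalisation step in front of the store. This is a minimal sketch under assumptions: the alias table, the two-letter codes and the function name `canonical_country` are hypothetical, and a real application would maintain its own list.

```python
# Hypothetical alias table mapping accepted spellings to one canonical code.
COUNTRY_ALIASES = {
    "ireland": "IE",
    "republic of ireland": "IE",
    "eire": "IE",
    "southern ireland": "IE",
    "france": "FR",
}


def canonical_country(raw):
    """Map free-text input to a canonical code, or reject it outright."""
    # Collapse runs of whitespace and ignore case before the lookup.
    key = " ".join(raw.split()).casefold()
    try:
        return COUNTRY_ALIASES[key]
    except KeyError:
        raise ValueError(f"unknown country: {raw!r}") from None
```

With this in place, known variants collapse to one value before they are stored, and a misspelling such as "Irland" is rejected at entry time instead of ending up in the data.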

I agree with "if data is copied, it can get out of sync". Making sure data stays in sync is the price to pay for the great advantage of not having an object-relational impedance mismatch.

As for two teams working on the same project: it is very rare that two teams work on the same part of the same project. If they do, they most certainly need to collaborate. And any good dev team today works with continuous integration and will be working in a staging or development branch of the project. So collisions in code should be rare. Thus, if the schema is in the code, there are no database issues. The data storage is pushed into the background. This isn't the case with a schema defined in the database. When there is a schema, there also needs to be schema versioning, which adds another level of difficulty to agile development and functional QA testing.

Scott