Using ArangoDB as data warehouse?

Biolcades

Dec 18, 2015, 11:29:36 AM
to ArangoDB
We are currently investigating the use of graph databases for a data warehouse.

We would like to integrate different sources, such as MySQL, Dynamics CRM, and ERP data, into the graph database.

Typical scenario:
  • a contact of a company is at the same time an employee of an agency
  • the contact has multiple companies to manage
  • after a while the contact moves positions, and we want to track the history of former companies where the contact was employed.
This data is then connected to other information such as external company information, invoices, etc.

One issue is that we need to keep the database concept flexible for new sources to be integrated.

Does anyone have experience with ArangoDB as a data warehouse, and if so, what are your thoughts?

Claudio Huyskens

Dec 22, 2015, 3:39:05 AM
to ArangoDB
We have been using ArangoDB in our business entity graph for 2 years now and are extremely satisfied. Regarding specs, we handle 1m organization entities, 1.6m person profiles and nearly 2m person-to-organization relations.
In terms of data model, we appreciate the ability to express edges as documents, i.e. we add validFrom and validTo attributes to a person's role. From my understanding that is similar to your use case.
Besides a super flexible data model, ArangoDB is great for data ingestion from different sources and platforms as well as for a variety of data access tasks.
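
To illustrate the validFrom/validTo idea, here is a minimal in-memory sketch in Python. It does not talk to ArangoDB at all: the edges are plain dicts shaped like edge documents (`_from` and `_to` follow ArangoDB's edge convention), while the collection names, keys, role values, dates, and the helper names `roles_at` and `history` are hypothetical, chosen only for this example.

```python
from datetime import date

# Hypothetical edge documents. Only _from/_to mirror ArangoDB's edge
# convention; names, roles, and dates are invented for illustration.
EDGES = [
    {"_from": "persons/alice", "_to": "organizations/acme", "role": "contact",
     "validFrom": "2012-01-01", "validTo": "2014-06-30"},
    {"_from": "persons/alice", "_to": "organizations/agency", "role": "employee",
     "validFrom": "2013-03-01", "validTo": None},  # None = still valid
    {"_from": "persons/alice", "_to": "organizations/globex", "role": "contact",
     "validFrom": "2014-07-01", "validTo": None},
]


def roles_at(edges, person, day):
    """Return the person's edges whose validity interval contains `day`."""
    iso = day.isoformat()  # ISO dates compare correctly as strings
    return [
        e for e in edges
        if e["_from"] == person
        and e["validFrom"] <= iso
        and (e["validTo"] is None or iso <= e["validTo"])
    ]


def history(edges, person):
    """Return all of the person's edges, oldest first."""
    return sorted(
        (e for e in edges if e["_from"] == person),
        key=lambda e: e["validFrom"],
    )


if __name__ == "__main__":
    for edge in roles_at(EDGES, "persons/alice", date(2013, 6, 1)):
        print(edge["role"], "at", edge["_to"])
```

Because the validity interval lives on the edge rather than on the person or the organization, the same person can hold several overlapping roles, and former employers stay queryable after a move.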

John De Goes

Dec 27, 2015, 1:14:13 PM
to ArangoDB

This use case would be facilitated by having an ArangoDB connector for SlamData, an open source analytics & reporting tool for NoSQL databases.

I'm interested in finding more contributors for such a project. Please contact me if you think you might be interested.

Regards,

John

Max Neunhöffer

Dec 28, 2015, 3:42:23 AM
to ArangoDB
Is there any documentation available about what would be needed to build an ArangoDB connector for SlamData? I would find it interesting to gauge how much effort would be needed to get such a thing up and running.
  Best regards,
    Max.

Julian May

Dec 28, 2015, 7:43:43 AM
to ArangoDB
Off-topic: +1 for a SlamData connector. We have so much legacy SQL-centric knowledge and integrations that being able to (read-)access the data with T-SQL would be absolutely amazing.

John De Goes

Dec 28, 2015, 11:27:30 AM
to ArangoDB

There's no guide yet, although that's something I mean to do in the new year. There are three steps involved:
  1. Create a data structure that represents the full range of queries possible on ArangoDB. In SlamData terminology, this is the "physical plan". This would be a good fit for someone who knows AQL well and can create a formal model of it.
  2. Create a translator from the logical plan data structure to the ArangoDB physical plan. This is my own area of expertise.
  3. Finally, create a lightweight backend which can read and write data, and configure an ArangoDB backend based on a connection URI ("arangodb://foo.com?setting1=value1...").
You can see what's involved in the Quasar code hosted on GitHub (http://github.com/quasar-analytics/quasar).
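
As a rough illustration of the steps above, the following Python sketch models a tiny "physical plan" as a data structure, renders it to an AQL string, and parses a connection URI of the form quoted in step 3. This is not Quasar's API or design: the names `Scan`, `to_aql`, and `parse_connection_uri` are hypothetical, the plan covers only a single-collection scan with an optional equality filter and limit, and a real connector would need to model far more of AQL.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qs, urlparse


@dataclass
class Scan:
    """Hypothetical physical-plan node: scan one collection,
    optionally filter on a single field, optionally limit."""
    collection: str
    filter_field: Optional[str] = None
    bind_name: str = "value"  # name of the AQL bind parameter
    limit: Optional[int] = None


def to_aql(plan: Scan) -> str:
    """Render the plan as an AQL query string with a bind parameter."""
    lines = [f"FOR doc IN {plan.collection}"]
    if plan.filter_field is not None:
        lines.append(f"  FILTER doc.{plan.filter_field} == @{plan.bind_name}")
    if plan.limit is not None:
        lines.append(f"  LIMIT {plan.limit}")
    lines.append("  RETURN doc")
    return "\n".join(lines)


def parse_connection_uri(uri: str) -> dict:
    """Split an arangodb:// URI into host, port, and settings.
    The set of meaningful settings is left open here."""
    parts = urlparse(uri)
    if parts.scheme != "arangodb":
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    # parse_qs returns a list per key; keep the last value given
    settings = {k: v[-1] for k, v in parse_qs(parts.query).items()}
    return {"host": parts.hostname, "port": parts.port, "settings": settings}


if __name__ == "__main__":
    print(to_aql(Scan("persons", filter_field="city", limit=10)))
    print(parse_connection_uri("arangodb://foo.com?setting1=value1"))
```

Keeping the plan as data rather than building query strings directly is what would let a translator (step 2) target it from the logical plan without knowing AQL syntax.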

As I mentioned before, I am personally quite interested in an ArangoDB connector, and I'm a contributor to SlamData. So if we get enough interest, I am happy to mentor and assist anyone with the development of the connector.

I know I would switch to using ArangoDB for several projects if SlamData ran on top of the database. But without a means to do analytics and reporting, I have only played around with ArangoDB for pet projects.

Regards,

John