Announcement: new OrientDB-ETL project, no more code to import data in OrientDB!

381 views
Skip to first unread message

Luca Garulli

unread,
Jul 3, 2014, 12:06:45 PM7/3/14
to orient-database
Hi guys,
I'm glad to announce a new project under the Orient Technologies umbrella: OrientDB-ETL!


Thanks to OrientDB-ETL project, moving data from/to OrientDB is super fast and require no more code writing: only a JSON file.

So far these are the available components:
  • Extractors:
    • File
  • Transformers:
    • CSV, to convert rows in documents
    • Link, to transform JOIN in links
    • Merge, to merge the parsed document with an existent
    • Vertex, to convert a document in Vertex
    • Edge, to create edges
    • Skip, to skip records based on expressions
    • Code, to execute arbitrary Javascript code
  • Loaders:
    • OrientDB, to save into a OrientDB database
OrientDB-ETL module is in beta status, and will be final with OrientDB v2.0 (September). To use it against OrientDB 2.0-SNAPSHOT follow the "Installation" instructions.

Below an example to import DBPedia csv files (http://wiki.dbpedia.org/DBpediaAsTables):

{
  config: {
    verbose: true,
    fileDirectory: "/temp/databases/dbpedia_csv/",
    fileName: "Person.csv.gz"
  },
  begin: [
   { let: { name: "$filePath",  value: "$fileDirectory.append( $fileName )"} },
   { let: { name: "$className", value: "$fileName.substring( 0, $fileName.indexOf('.') )"} }
  ],
  extractor : {
    line: { path: "$filePath", lock : true }
  },
  transformers : [
   { csv: { separator: ",", nullValue: "NULL", skipFrom: 1, skipTo: 3 } },
   { merge: { joinFieldName:"URI", lookup:"V.URI" } },
   { vertex: { class: "$className"} }
  ],
  loader : {
    orientdb: {
      dbURL: "plocal:/temp/databases/dbpedia",
      dbUser: "admin",
      dbPassword: "admin",
      dbAutoCreate: true,
      tx: false,
      batchCommit: 1000,
      dbType: "graph",
      indexes: [{class:"V", fields:["URI:string"], type:"UNIQUE" }]
    }
  }
}


Lvc@

Olakunle Olaniyi

unread,
Sep 22, 2014, 12:42:36 PM9/22/14
to orient-...@googlegroups.com
Hi Luca,
I'd like to start by saying thanks for the explanation on this so far. I am running orientdb 2.0-M1 and this is not working for me as well. It shows the joinvalue alright and they are correct but the edges are not be created. Have an explanation why this is so? Your response will be much appreciated.

Olakay

P.S : the log could be see below :


OrientDB etl v.2.0-M1 (build @BUILD@) www.orientechnologies.com
BEGIN ETL PROCESSOR
[1:vertex] DEBUG Transformer input: {aid:1,name:Qualitrail - 1}
[1:vertex] DEBUG Transformer output: v(Alien)[#12:10]
[1:edge] DEBUG Transformer input: v(Alien)[#12:10]
[1:edge] DEBUG WARNING: index Person.mid not found. Lookups could be really slow
[1:edge] DEBUG joinValue=1, lookupResult=Person#11:0{mid:1,name:Qualitrail - 1,in_hasLink:[size=1]} v2
[1:edge] DEBUG Transformer output: v(Alien)[#12:10]
[3:vertex] DEBUG Transformer input: {aid:2,name:Qualitrail - 2}
[3:vertex] DEBUG Transformer output: v(Alien)[#12:11]
[3:edge] DEBUG Transformer input: v(Alien)[#12:11]
[3:edge] DEBUG joinValue=2, lookupResult=Person#11:1{mid:2,name:Qualitrail - 2} v1
[3:edge] DEBUG Transformer output: v(Alien)[#12:11]
[5:vertex] DEBUG Transformer input: {aid:3,name:Qualitrail - 3}
[5:vertex] DEBUG Transformer output: v(Alien)[#12:12]
[5:edge] DEBUG Transformer input: v(Alien)[#12:12]
[5:edge] DEBUG joinValue=3, lookupResult=Person#11:2{mid:3,name:Qualitrail - 3} v1
[5:edge] DEBUG Transformer output: v(Alien)[#12:12]
[7:vertex] DEBUG Transformer input: {aid:4,name:Qualitrail - 4}
[7:vertex] DEBUG Transformer output: v(Alien)[#12:13]
[7:edge] DEBUG Transformer input: v(Alien)[#12:13]
[7:edge] DEBUG joinValue=4, lookupResult=Person#11:3{mid:4,name:Qualitrail - 4} v1
[7:edge] DEBUG Transformer output: v(Alien)[#12:13]
[9:vertex] DEBUG Transformer input: {aid:5,name:Qualitrail - 5}
[9:vertex] DEBUG Transformer output: v(Alien)[#12:14]
[9:edge] DEBUG Transformer input: v(Alien)[#12:14]
[9:edge] DEBUG joinValue=5, lookupResult=Person#11:4{mid:5,name:Qualitrail - 5} v1
[9:edge] DEBUG Transformer output: v(Alien)[#12:14]
[11:vertex] DEBUG Transformer input: {aid:6,name:Qualitrail - 6}
[11:vertex] DEBUG Transformer output: v(Alien)[#12:15]
[11:edge] DEBUG Transformer input: v(Alien)[#12:15]
[11:edge] DEBUG joinValue=6, lookupResult=Person#11:5{mid:6,name:Qualitrail - 6} v1
[11:edge] DEBUG Transformer output: v(Alien)[#12:15]
[13:vertex] DEBUG Transformer input: {aid:7,name:Qualitrail - 7}
[13:vertex] DEBUG Transformer output: v(Alien)[#12:16]
[13:edge] DEBUG Transformer input: v(Alien)[#12:16]
[13:edge] DEBUG joinValue=7, lookupResult=Person#11:6{mid:7,name:Qualitrail - 7} v1
[13:edge] DEBUG Transformer output: v(Alien)[#12:16]
[15:vertex] DEBUG Transformer input: {aid:8,name:Qualitrail - 8}
[15:vertex] DEBUG Transformer output: v(Alien)[#12:17]
[15:edge] DEBUG Transformer input: v(Alien)[#12:17]
[15:edge] DEBUG joinValue=8, lookupResult=Person#11:7{mid:8,name:Qualitrail - 8} v1
[15:edge] DEBUG Transformer output: v(Alien)[#12:17]
[17:vertex] DEBUG Transformer input: {aid:9,name:Qualitrail - 9}
[17:vertex] DEBUG Transformer output: v(Alien)[#12:18]
[17:edge] DEBUG Transformer input: v(Alien)[#12:18]
[17:edge] DEBUG joinValue=9, lookupResult=Person#11:8{mid:9,name:Qualitrail - 9} v1
[17:edge] DEBUG Transformer output: v(Alien)[#12:18]
[19:vertex] DEBUG Transformer input: {aid:10,name:Qualitrail - 10}
[19:vertex] DEBUG Transformer output: v(Alien)[#12:19]
[19:edge] DEBUG Transformer input: v(Alien)[#12:19]
[19:edge] DEBUG joinValue=10, lookupResult=Person#11:9{mid:10,name:Qualitrail - 10} v1
[19:edge] DEBUG Transformer output: v(Alien)[#12:19]
END ETL PROCESSOR

Luca Garulli

unread,
Sep 22, 2014, 1:14:26 PM9/22/14
to orient-database
Hi,
Seems there is no Loader configured. Could you post also your JSON file?

Lvc@

--

---
You received this message because you are subscribed to the Google Groups "OrientDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email to orient-databa...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Olakunle Olaniyi

unread,
Sep 22, 2014, 5:48:18 PM9/22/14
to orient-...@googlegroups.com
Hello ,
I have 2 json files (person.json and alien.json) and they are as below :

=======person.json================
{
  "config": {
    "log": "debug"
  },
  "extractor" : {
    "jdbc": { "driver": "org.postgresql.Driver",
              "url": "jdbc:postgresql://localhost/testdb",
              "userName": "postgres",
              "userPassword": "password1",
              "query": "select * from person" }
  },
  "transformers" : [
    { "vertex": { "class": "Person"} }
  ],
   "loader" : {
    "orientdb": {
      "dbURL": "plocal:/home/orientdb/db/databases/db_test",
      dbUser: "admin",
      dbPassword: "admin",
      dbAutoDropIfExists: false,
      dbAutoCreate: true,
      tx: false,
      wal: false,
      batchCommit: 1000,
      dbType: "graph",
      indexes: [{class:"Person", fields:["pid:string"], type:"UNIQUE_HASH_INDEX" }]
    }
  }
}





======alien.json=====================
{
  "config": {
    "log": "debug"
  },
  "extractor" : {
    "jdbc": { "driver": "org.postgresql.Driver",
              "url": "jdbc:postgresql://localhost/testdb",
              "userName": "postgres",
              "userPassword": "password1",
              "query": "select * from alien" }
  },
  "transformers" : [
    { "vertex": { "class": "Alien"} },
    { "edge": { "class": "hasLink", "direction" : "in",
            "joinFieldName": "aid",
            "lookup":"Person.pid", "unresolvedLinkAction":"CREATE"} }
  ],
  "loader" : {
    "orientdb": {
      "dbURL": "plocal:/home/orientdb/db/databases/db_test",
      dbUser: "admin",
      dbPassword: "admin",
      dbAutoDropIfExists: false,
      dbAutoCreate: true,
      tx: false,
      wal: false,
      batchCommit: 1000,
      dbType: "graph",
      indexes: [{class:"Alien", fields:["aid:string"], type:"UNIQUE_HASH_INDEX" }]
    }
  }
}

--

---
You received this message because you are subscribed to a topic in the Google Groups "OrientDB" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/orient-database/Xp38b55AbpU/unsubscribe.
To unsubscribe from this group and all its topics, send an email to orient-databa...@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.



--
Olakunle

Curtis Mosters

unread,
Sep 23, 2014, 12:00:15 AM9/23/14
to orient-...@googlegroups.com
I tested ETL 2 months ago and there I had the same problem, that the ETL did not create any edges.

Will test again when 2.0 is officially out.

Olakunle Olaniyi

unread,
Sep 23, 2014, 5:48:19 AM9/23/14
to orient-...@googlegroups.com
ok, I tried it again, i think it only creates an edge from alien to person where there was no match and it had to create that vertex.  A sample result set is as below.
Any ideas what might be wrong ?


orientdb {db=test}> select from alien 

----+------+----+----------+--------
#   |@RID  |aid |aname     |in_hates
----+------+----+----------+--------
0   |#12:0 |1   |Alien - 1 |null    
1   |#12:1 |2   |Alien - 2 |null    
2   |#12:2 |3   |Alien - 3 |null    
3   |#12:3 |4   |Alien - 4 |null    
4   |#12:4 |5   |Alien - 5 |null    
5   |#12:5 |6   |Alien - 6 |null    
6   |#12:6 |7   |Alien - 7 |null    
7   |#12:7 |8   |Alien - 8 |null    
8   |#12:8 |9   |Alien - 9 |null    
9   |#12:9 |10  |Alien - 10|null    
10  |#12:10|11  |Alien - 11|null    
11  |#12:11|12  |Alien - 12|null    
12  |#12:12|13  |Alien - 13|null    
13  |#12:13|14  |Alien - 14|null    
14  |#12:14|15  |Alien - 15|null    
15  |#12:15|16  |Alien - 16|null    
16  |#12:16|17  |Alien - 17|null    
17  |#12:17|18  |Alien - 18|null    
18  |#12:18|19  |Alien - 19|null    
19  |#12:19|20  |Alien - 20|null    
20  |#12:20|1   |Alien - 1 |null    
21  |#12:21|2   |Alien - 2 |null    
22  |#12:22|3   |Alien - 3 |null    
23  |#12:23|4   |Alien - 4 |null    
24  |#12:24|5   |Alien - 5 |null    
25  |#12:25|6   |Alien - 6 |null    
26  |#12:26|7   |Alien - 7 |null    
27  |#12:27|8   |Alien - 8 |null    
28  |#12:28|9   |Alien - 9 |null    
29  |#12:29|10  |Alien - 10|null    
30  |#12:30|11  |Alien - 11|null    
31  |#12:31|12  |Alien - 12|null    
32  |#12:32|13  |Alien - 13|null    
33  |#12:33|14  |Alien - 14|null    
34  |#12:34|15  |Alien - 15|null    
35  |#12:35|16  |Alien - 16|null    
36  |#12:36|17  |Alien - 17|null    
37  |#12:37|18  |Alien - 18|null    
38  |#12:38|19  |Alien - 19|null    
39  |#12:39|20  |Alien - 20|null    
40  |#12:40|22  |Alien - 2 |[size=1]
41  |#12:41|23  |Alien - 3 |[size=1]
42  |#12:42|24  |Alien - 4 |[size=1]
43  |#12:43|25  |Alien - 5 |[size=1]
44  |#12:44|26  |Alien - 6 |[size=1]
45  |#12:45|27  |Alien - 7 |[size=1]
46  |#12:46|28  |Alien - 8 |[size=1]
47  |#12:47|29  |Alien - 9 |[size=1]
48  |#12:48|30  |Alien - 10|[size=1]
49  |#12:49|31  |Alien - 11|[size=1]
50  |#12:50|32  |Alien - 12|[size=1]
51  |#12:51|33  |Alien - 13|[size=1]
52  |#12:52|34  |Alien - 14|[size=1]
53  |#12:53|35  |Alien - 15|[size=1]
54  |#12:54|36  |Alien - 16|[size=1]
55  |#12:55|37  |Alien - 17|[size=1]
56  |#12:56|38  |Alien - 18|[size=1]
57  |#12:57|39  |Alien - 19|[size=1]
58  |#12:58|40  |Alien - 20|[size=1]
----+------+----+----------+--------




orientdb {db=test}> select from person


----+------+----+-----------+---------
#   |@RID  |pid |pname      |out_hates
----+------+----+-----------+---------
0   |#11:0 |1   |Person - 1 |null     
1   |#11:1 |2   |Person - 2 |null     
2   |#11:2 |3   |Person - 3 |null     
3   |#11:3 |4   |Person - 4 |null     
4   |#11:4 |5   |Person - 5 |null     
5   |#11:5 |6   |Person - 6 |null     
6   |#11:6 |7   |Person - 7 |null     
7   |#11:7 |8   |Person - 8 |null     
8   |#11:8 |9   |Person - 9 |null     
9   |#11:9 |10  |Person - 10|null     
10  |#11:10|11  |Person - 11|null     
11  |#11:11|12  |Person - 12|null     
12  |#11:12|13  |Person - 13|null     
13  |#11:13|14  |Person - 14|null     
14  |#11:14|15  |Person - 15|null     
15  |#11:15|16  |Person - 16|null     
16  |#11:16|17  |Person - 17|null     
17  |#11:17|18  |Person - 18|null     
18  |#11:18|19  |Person - 19|null     
19  |#11:19|20  |Person - 20|null     
20  |#11:20|22  |null       |[size=1] 
21  |#11:21|23  |null       |[size=1] 
22  |#11:22|24  |null       |[size=1] 
23  |#11:23|25  |null       |[size=1] 
24  |#11:24|26  |null       |[size=1] 
25  |#11:25|27  |null       |[size=1] 
26  |#11:26|28  |null       |[size=1] 
27  |#11:27|29  |null       |[size=1] 
28  |#11:28|30  |null       |[size=1] 
29  |#11:29|31  |null       |[size=1] 
30  |#11:30|32  |null       |[size=1] 
31  |#11:31|33  |null       |[size=1] 
32  |#11:32|34  |null       |[size=1] 
33  |#11:33|35  |null       |[size=1] 
34  |#11:34|36  |null       |[size=1] 
35  |#11:35|37  |null       |[size=1] 
36  |#11:36|38  |null       |[size=1] 
37  |#11:37|39  |null       |[size=1] 
38  |#11:38|40  |null       |[size=1] 
----+------+----+-----------+---------

Luca Garulli

unread,
Sep 25, 2014, 8:31:02 PM9/25/14
to orient-database
Hi Guys,
I've just fixed an issue on edge creation. Can you pull the project and rebuild it? Current version of ETL module is 1.1-SNAPSHOT.

Lvc@

Curtis Mosters

unread,
Sep 26, 2014, 3:07:33 AM9/26/14
to orient-...@googlegroups.com
Hi Luca, it seem to work now.

But just tested with leightweight edges.

http://i.imgur.com/K18qTWm.png

Olakunle Olaniyi

unread,
Sep 26, 2014, 7:27:29 AM9/26/14
to orient-...@googlegroups.com
Hello,
Thanks, I shall try it out as soon as I can scramble to a nearby console. 
I have yet another interesting question. How do I migrate an array data type from a relational database to OrientDB?

Thanks
Reply all
Reply to author
Forward
0 new messages