Tree Structure import problems with ETL

kova...@gmail.com

Apr 15, 2015, 12:45:54 PM
to orient-...@googlegroups.com
Hello!
I'm new to OrientDB and am currently writing my thesis about graph databases.

I have some data which I would like to import into OrientDB.
I have info about Users:
ID, LAST_YEAR_INCOME, DATE_OF_BIRTH, STATE
0, 10000, 1990.08.11, Arizona
1, 12234, 1976.11.07, Missouri
2, 21322, 1978.01.01, Minnesota
3, 33333, 1960.05.05, Iowa

And data about relationships between them:
ID, PARENT_ID
0, 0
1, 0
2, 0
3, 1

I would like to create a tree structure from it.
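To make the intended shape concrete, here is a small Python sketch (plain Python, not OrientDB code) that builds a parent-to-children mapping from the relationship rows above. The function name `build_children` is only illustrative, and the sketch assumes the row `0, 0` marks user 0 as the root rather than a real self-relationship.

```python
from collections import defaultdict

# (ID, PARENT_ID) rows copied from the relationship data above
rows = [(0, 0), (1, 0), (2, 0), (3, 1)]

def build_children(rows):
    """Map each parent ID to its sorted child IDs.

    Assumption: a row whose ID equals its PARENT_ID marks the root,
    so it is skipped instead of being stored as a self-loop.
    """
    children = defaultdict(list)
    for user_id, parent_id in rows:
        if user_id != parent_id:
            children[parent_id].append(user_id)
    return {parent: sorted(kids) for parent, kids in children.items()}

tree = build_children(rows)
print(tree)  # {0: [1, 2], 1: [3]}
```

So user 0 should end up as the parent of users 1 and 2, and user 1 as the parent of user 3.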
I have written the following ETL json files:

users_etl.json
{
  "source": { "file": { "path": "users.csv" } },
  "extractor": { "row": {} },
  "transformers": [
    { "csv": {} },
    { "vertex": { "class": "User" } }
  ],
  "loader": {
    "orientdb": {
       "dbURL": "plocal:/home/user/orientdb/databases/thesis",
       "dbType": "graph",
       "classes": [
         {"name": "User", "extends": "V"},
         {"name": "ParentOf", "extends": "E"}
       ],
       "indexes": [
         {"class":"User", "fields":["ID:Long"], "type":"UNIQUE" }
       ]
    }
  }
}

And rel_etl.json:
{
  "source": { "file": { "path": "rel.csv" } },
  "extractor": { "row": {} },
  "transformers": [
    { "csv": {} },
    { "edge": { "class": "ParentOf",
                "joinFieldName": "PARENT_ID",
                "lookup": "User.ID",
                "direction": "out"
              }
    }
  ],
  "loader": {
    "orientdb": {
       "dbURL": "plocal:/home/user/orientdb/databases/thesis",
       "dbType": "graph",
       "classes": [
         {"name": "ParentOf", "extends": "E"}
       ]
    }
  }
}

I would like to have User vertices (extending V) connected by ParentOf edges (extending E),
linked through the ID --> PARENT_ID relationship.

The User objects are imported successfully, but I can't get the edges to work.
I've found this thread, but unfortunately I can't get the command working either.


Could you please help me get this import done?
What am I doing wrong? The CSV import documentation on the site is rather sparse, and I can't really get it working.

Thank you!

kova...@gmail.com

Apr 18, 2015, 4:50:22 AM
to orient-...@googlegroups.com
Hello again!

I finally managed to solve the issue. As I have no other resource about importing, I will keep using this method.
In the end I needed a single CSV file with the ID, PARENT_ID, etc. fields.
The ETL config file below worked for me:
{
  "source": { "file": { "path": "users.csv" } },
  "extractor": { "row": {} },
  "transformers": [
    { "csv": {} },
    { "vertex": { "class": "User" } },
    { "edge": {
        "class": "ParentOf",
        "joinFieldName": "PARENT_ID",
        "direction": "in",
        "lookup": "User.ID"
      }
    }
  ],
  "loader": {
    "orientdb": {
       "dbURL": "plocal:/path/to/db",

       "dbType": "graph",
       "classes": [
         {"name": "User", "extends": "V"},
         {"name": "ParentOf", "extends": "E"}
       ],
       "indexes": [
         {"class":"User", "fields":["ID:Long"], "type":"UNIQUE" },
         {"class":"User", "fields":["PARENT_ID:Long"], "type":"NOTUNIQUE" }
       ]
    }
  }
}
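For anyone starting from the two separate files in the first post, the single CSV could be produced with a short script like the one below. This is only a sketch, not part of OrientDB: the function name `merge_users_and_relations` is hypothetical, the sample data is copied from the first post, and it assumes every ID appears at most once in the relationship file.

```python
import csv
import io

USERS_CSV = """ID, LAST_YEAR_INCOME, DATE_OF_BIRTH, STATE
0, 10000, 1990.08.11, Arizona
1, 12234, 1976.11.07, Missouri
2, 21322, 1978.01.01, Minnesota
3, 33333, 1960.05.05, Iowa
"""

REL_CSV = """ID, PARENT_ID
0, 0
1, 0
2, 0
3, 1
"""

def merge_users_and_relations(users_csv, rel_csv):
    """Join the two CSV texts on ID; return one CSV text with a PARENT_ID column."""
    # skipinitialspace drops the blank that follows each comma in the sample data
    parent_of = {}
    for row in csv.DictReader(io.StringIO(rel_csv), skipinitialspace=True):
        parent_of[row["ID"]] = row["PARENT_ID"]

    users = csv.DictReader(io.StringIO(users_csv), skipinitialspace=True)
    fields = list(users.fieldnames) + ["PARENT_ID"]

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for row in users:
        # Users without a relationship row get an empty PARENT_ID
        row["PARENT_ID"] = parent_of.get(row["ID"], "")
        writer.writerow(row)
    return out.getvalue()

merged = merge_users_and_relations(USERS_CSV, REL_CSV)
print(merged)
```

Writing `merged` to `users.csv` would give a file with the ID and PARENT_ID columns that the config above expects.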

I hope it will help others in the future!

Luca Garulli

Apr 18, 2015, 11:08:57 AM
to orient-...@googlegroups.com
Hi,
Thanks for posting it. I think this is a recurring use case, so I published it in the documentation:


I'm sure other developers will find it useful ;-)

Lvc@


--
Best Regards,

Luca Garulli
CEO at Orient Technologies LTD
the Company behind OrientDB
