ETL: using gremlin commands and a few documentation issues

Kyle

Apr 16, 2015, 8:40:40 PM
to orient-...@googlegroups.com
I used OrientDB 2.0.7 for everything below.

First, some documentation issues, then questions about using Gremlin in ETL scripts.

I have attached an ETL JSON config and a test input file. The JSON assumes the input file is at /tmp/test_nodes.tsv and creates a plocal database in /tmp.

The JSON has several command blocks that cause errors; you can reproduce the different errors I discuss below by reordering the blocks.

Overall, what I want is a working example of using a Gremlin command in ETL, ideally one that also uses variables bound by SQL commands.

---------
1) DOCUMENTATION ISSUES
-----

i) loader param useLightweightEdges is not documented

this issue documents:
Supported this new parameter in loader settings:
useLightweightEdges: false

This is not documented at:

I actually ended up doing:

"begin": [
  {
    "console": {
      "commands": [
        "connect plocal:/tmp/test_db;",
        "alter database custom useLightweightEdges=true;",
        "disconnect;"
      ]
    }
  }
],

as a workaround, before discovering the loader parameter in the closed issues!
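
For reference, a minimal sketch of where that parameter would presumably go, based only on the wording of the closed issue ("loader settings"). The dbURL matches the plocal database used in the workaround above; the dbType value is an assumption, and the fragment is untested:

```json
"loader": {
  "orientdb": {
    "dbURL": "plocal:/tmp/test_db",
    "dbType": "graph",
    "useLightweightEdges": true
  }
}
```

If the loader honors this setting, the "begin" console block above should no longer be necessary.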

-----
ii) I can't find any explicit mention of the ability to specify a custom output for transformers

This issue, the code, and various examples show that you can specify a custom "output" to bind variables.

I can't find it explicitly stated in the documentation that this feature exists; it would be nice if the documentation mentioned this at the top of the page.
I apologize if I missed where it explicitly states that this feature exists.
-----

iii) Specifying the language as "groovy" is giving me results for simple expressions.

{
  "command": {
    "language": "groovy",
    "command": "1 +1 ",
    "output": "foo"
  }
},

[0:command] DEBUG executed command=groovy.1 +1, result=2

This is not documented.

--------------------------------

2) QUESTIONS

OK, now the actual ETL help questions:

i) I can't get any Gremlin command to work, even ones that work with "groovy".

Having:
{
  "command": {
    "language": "gremlin",
    "command": "1 +1 ",
    "output": "foo"
  }
},

results in this error:

Error in Pipeline execution: com.orientechnologies.orient.core.command.OCommandExecutorNotFoundException: Cannot find a command executor for the command request: gremlin.1 +1
com.orientechnologies.orient.core.command.OCommandExecutorNotFoundException: Cannot find a command executor for the command request: gremlin.1 +1
        at com.orientechnologies.orient.core.command.OCommandManager.getExecutor(OCommandManager.java:102)
        at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.command(OAbstractPaginatedStorage.java:1170)
        at com.orientechnologies.orient.core.command.OCommandRequestTextAbstract.execute(OCommandRequestTextAbstract.java:63)
        at com.orientechnologies.orient.etl.transformer.OCommandTransformer.executeTransform(OCommandTransformer.java:70)
        at com.orientechnologies.orient.etl.transformer.OAbstractTransformer.transform(OAbstractTransformer.java:37)
        at com.orientechnologies.orient.etl.OETLPipeline.execute(OETLPipeline.java:108)
        at com.orientechnologies.orient.etl.OETLProcessor.executeSequentially(OETLProcessor.java:480)
        at com.orientechnologies.orient.etl.OETLProcessor.execute(OETLProcessor.java:288)
        at com.orientechnologies.orient.etl.OETLProcessor.main(OETLProcessor.java:160)
ETL process halted: com.orientechnologies.orient.etl.OETLProcessHaltedException: com.orientechnologies.orient.core.command.OCommandExecutorNotFoundException: Cannot find a command executor for the command request: gremlin.1 +1
 
----
ii) How to get a reference to the graph?

The documentation shows orient.getGraph() being used, but that has been returning errors for me, even if I specify "groovy" as the command language.

Ideally, the variable "g" would be auto-bound to the graph instance for Gremlin commands during ETL.
----

iii) Is it possible to use the output from a SQL command inside a Gremlin command?

If I do 
{
  "command": {
    "language": "groovy",
    "command": "${v1}.iterator()[0]",
    "output": "foo"
  }
},

I get the error:

Error in Pipeline execution: com.orientechnologies.orient.core.command.script.OCommandScriptException: Error on evaluation of the script library. Error: org.codehaus.groovy.control.MultipleCompilationErrorsException: startup failed:
Script1.groovy: 1: unexpected token: @ @ line 1, column 55.
   ient.core.sql.q...@136.itera

I think the SQL result is turned into its toString() representation when the variables are expanded, but I'm not sure about this.

-------


I apologize if I have missed anything in the documentation. If you need further examples or elaboration, I will be happy to provide them.
Sorry for the wall of text. : )


Thanks, 
kyle

q_for_group.json
test_nodes.tsv