[Neo4j] Feedback after evaluation


Dmytrii Nagirniak

Dec 8, 2011, 3:48:59 AM
to Neo4j user discussions
Hi Guys,

Just want to quickly give some feedback on neo4j after some evaluation.

Overall, I like neo4j a lot, but have decided not to use it.

Probably the primary reason is the tooling around Ruby. Don't get me wrong. neo4j.rb is just amazing. It really is.
I even contributed a couple of pull requests (and those were accepted).

The problem is in neo4j's Java roots. The only option for me was to use JRuby (I will say a word on REST later).
But unfortunately choosing JRuby is just too troublesome and gives much more headache compared to "normal" C/MRI Ruby.

Everything is so much harder (even speed is x times slower).

On "normal" (MRI 1.9.3) Ruby I run all the specs immediately, after saving a file. Immediate feedback.
With JRuby I'd have to wait for almost half a minute. TDD is gone. Not good enough. But this is just the first of the "issues" that I faced.

A lot of other libraries just don't work with it.
There are always small walls in my way that I have to break through, which would never happen with normal Ruby.
It just gives me a lot of pain.

Unfortunately I couldn't see a lot of value in the REST API either.
The core operations that are taken for granted with native bindings (traversals using pure Ruby constructs) would each require executing an HTTP request (the equivalent of SELECT N+1 in the SQL world).
Otherwise I would have to wrap all the logic in traversal queries, which would significantly overcomplicate the system with HTTP handling logic.
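
To make the round-trip cost concrete, here is a minimal Ruby sketch. Everything in it is a hypothetical stand-in: `FakeRestClient`, its `friends_of` and `friends_of_friends` methods, and the tiny in-memory `GRAPH` are made up for illustration and are not neography or the real Neo4j REST API. It only counts how many HTTP-like calls each style would make.

```ruby
# Hypothetical stand-in for a REST client: every public call counts as one HTTP round trip.
class FakeRestClient
  attr_reader :requests

  def initialize(graph)
    @graph = graph      # { node_id => [ids of directly connected nodes] }
    @requests = 0
  end

  # One request per node: returns the ids of its direct friends.
  def friends_of(node_id)
    @requests += 1
    @graph.fetch(node_id, [])
  end

  # One request for the whole traversal, as if it were evaluated on the server.
  def friends_of_friends(node_id)
    @requests += 1
    @graph.fetch(node_id, []).flat_map { |f| @graph.fetch(f, []) }.uniq - [node_id]
  end
end

GRAPH = { 1 => [2, 3, 4], 2 => [1, 5], 3 => [1, 6], 4 => [1, 5] }.freeze

# Walk the graph from Ruby: one request for the start node plus one per friend (N+1).
def naive_friends_of_friends(client, start)
  client.friends_of(start)
        .flat_map { |friend| client.friends_of(friend) }
        .uniq - [start]
end

naive_client = FakeRestClient.new(GRAPH)
naive_result = naive_friends_of_friends(naive_client, 1)

single_client = FakeRestClient.new(GRAPH)
single_result = single_client.friends_of_friends(1)

puts "naive:  #{naive_result.sort.inspect} in #{naive_client.requests} requests"
puts "single: #{single_result.sort.inspect} in #{single_client.requests} request"
```

Both styles find the same nodes, but the Ruby-side walk needs four round trips for a start node with three friends, while the single traversal needs one. The price of the second style is that the traversal logic has to live in a query sent to the server.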

Also there are no decent RESTful HTTP clients. The only one is neography, which works pretty well but doesn't give me any abstraction similar to neo4j.rb.
There is also neology. I declare it dead: I couldn't even run the tests because a dependent gem was removed from the author's own GitHub repository.
So I even had no way to fix any issues there.

The last one is architect4r. There is a good idea behind it, but the abstractions are leaky. You can't make the system more or less performant without resorting to HTTP.
I also made a minor contribution to it (an accepted PR). But since then the author has never replied to my tweets or emails.
Maybe he's just sick or something else, but that's what we have.
And there were failing specs all over the place.

So I decided to write another REST library (http://github.com/dnagir/morpheus), but then gave up, realising that you just cannot have a proper abstraction over HTTP.
(I'll probably kill off that repo).


So all in all, to summarise: I am giving up on neo4j because it forces me into Java world to leverage its full power.
I could have accepted that if I were a Java dev. But there is nothing in this world that can convince me to choose Java instead of Ruby (maybe other langs in the future).

But what I DO have to mention is the dedication of the people around neo4j. Everybody tried their best to help.
And it feels like everybody within Neo Technology has a common vision and is really passionate and keen to help.
I can't remember any other company that would be so dedicated.

I would really love to use neo4j, but unfortunately I can't do that until it is available as a native binding for C Ruby.

And as a last note, I want to say THANKS to the neo4j community for the great and amazing support you guys all give.

Cheers,
Dmytrii
http://www.ApproachE.com



Andreas Ronge

Dec 8, 2011, 3:30:32 PM
to Neo4j user discussions
Hi

Thanks for your great feedback !
The tooling support for JRuby has worked well for me, and I would
probably still use JRuby even if there was a good native MRI neo4j
wrapper.
But I have a Java background and might not be spoiled by that
instant feedback loop of doing behaviour driven development using MRI
Ruby. The JVM never has time to warm up when running the RSpec tests.
I guess it will always be slower than MRI for running tests (but not
otherwise). Neo4j.rb has 1400 RSpecs (with very few mocks). JRuby will
run them in about 1-2 minutes.
I guess the other problems like having a rails console with write
access to the database is solvable.
Thanks again for your feedback and I do agree with your criticism but
for me it's not that problematic.

Cheers
Andreas

Michael Hunger

Dec 8, 2011, 3:43:35 PM
to Neo4j user discussions
Isn't there something like the background jvm thingy that exists for groovy, scala and other languages? A server that the current process connects to, sends code over and runs it in the JVM?

What happens if you run the 1400 RSpec tests several times?

Perhaps we should ask Charles Nutter for his feedback on these issues?

Which older gems were problematic for you?

Michael

Andreas Ronge

Dec 8, 2011, 4:15:49 PM
to Neo4j user discussions
Yes there is - nailgun.
Charles has already been involved in many of the problems that Dmytrii
had with JRuby.
For a good summary of Dmytrii's problems, check his Twitter feed.
For RSpec JRuby performance and Charles's response, see
https://gist.github.com/1423288

/Andreas

Max De Marzi Jr.

Dec 8, 2011, 5:22:37 PM
to ne...@googlegroups.com, Neo4j user discussions
Is HTTP as a protocol the problem? Maybe, it does have some advantages
though.

I think we all agree the REST API is not finished yet.
We talked last week about Batch operations as a poor-man's replacement for
Transactions and the concerns that brought up.

I think the REST API will get there eventually... or (since it had a short
brush with death two months ago) be replaced with a full Cypher language
with data operations (INSERT, UPDATE, DELETE SQL equivalents)

In the meantime, my solution has been Polyglot Persistence using
ActiveRecord callbacks.

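# After a record is created, create a matching Neo4j node and remember its id.
after_create :create_node

def create_node
  if self.valid?
    # The node's "self" URL ends with its id; keep only that last path segment.
    self.node = $neo.create_node("identity_id" => self.id)["self"].split("/").last
    self.save
  end
end

# After a record is saved, connect the grantor's node to the requester's node.
after_save :create_relationship

def create_relationship
  from = self.grantor.node
  to = self.requester.node
  $neo.create_relationship("vouched", from, to)
end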

You don't have to limit yourself to one database.
Chances are you'll need Redis anyway, so you might as well think in terms of multiple storage units from the beginning.

Dmytrii Nagirniak

Dec 8, 2011, 6:30:25 PM
to ne...@googlegroups.com, Neo4j user discussions

On 09/12/2011, at 8:15 AM, Andreas Ronge wrote:

> Yes there is - nailgun.
> Charles has already been involved in many of the problems that Dmytrii
> had with JRuby.
> For a good summary of Dmytrii's problems, check his Twitter feed.
> For RSpec JRuby performance and Charles's response, see
> https://gist.github.com/1423288


Yes, that's true.
The summary is that JRuby will never be on par with MRI in terms of startup time (even with nailgun).
It also doesn't support Spork because JRuby can't fork processes.

And this is extremely valuable and important.

Dmytrii Nagirniak

Dec 8, 2011, 6:34:34 PM
to ne...@googlegroups.com, Neo4j user discussions

On 09/12/2011, at 7:43 AM, Michael Hunger wrote:
> What happens if you run the 1400 RSpec tests several times?

I don't get the point of running them several times. I need to run one or a couple of specs as soon as possible and see the feedback.


> Which older gems were problematic for you?

It's not about OLDER gems. We are talking about existing, well-maintained ones.
Here is a quick recap of just migrating over to JRuby using the most standard stack of technologies.
http://blog.approache.com/2011/11/issues-switching-to-jruby-from-mri-19.html

(Although I made it work at the end of the day; I should write a follow-up post.)


And here you can read the discussion of more people having issues with JRuby.
https://groups.google.com/group/rails-oceania/browse_thread/thread/1e382d367c1c55f7?hl=en

The bottom line is that it is a pain, and way too much time is spent dealing with issues that you just don't have at all in MRI Ruby.

Dmytrii Nagirniak

Dec 8, 2011, 6:41:35 PM
to ne...@googlegroups.com, Neo4j user discussions
On 09/12/2011, at 9:22 AM, Max De Marzi Jr. wrote:

> Is HTTP as a protocol the problem? Maybe, it does have some advantages though.

Yes. Definitely. There are always pros/cons for everything.


> I think we all agree the REST API is not finished yet.
> We talked last week about Batch operations as a poor-man's replacement for Transactions and the concerns that brought up.
>
> I think the REST API will get there eventually... or (since it had a short brush with death two months ago) be replaced with a full Cypher language with data operations (INSERT, UPDATE, DELETE SQL equivalents)
>
> In the meantime, my solution has been Polyglot Persistence using ActiveRecord callbacks.

Yes, that is something I may consider. That's probably the most pragmatic approach.
But currently a simple WITH RECURSIVE (http://www.postgresql.org/docs/8.4/static/queries-with.html) SQL query may do the job for me.

> after_create :create_node
>
> def create_node
>   if self.valid?
>     self.node = $neo.create_node("identity_id" => self.id)["self"].split("/").last
>     self.save
>   end
> end
>
> after_save :create_relationship
>
> def create_relationship
>   from = self.grantor.node
>   to = self.requester.node
>   $neo.create_relationship("vouched", from, to)
> end

I thought about something like this and it definitely may work. But I will try to use a single DB as long as possible.

> You don't have to limit yourself to one database.
> Chances are you'll need Redis anyway, so you might as well think in terms of multiple storage units from the beginning.

You definitely have a good point here.

Vivek Prahlad

Dec 8, 2011, 10:44:58 PM
to Neo4j user discussions
Hi Dmytrii,

I would like to point out that spork does actually work with JRuby. From
what I can see, guard and guard-spork are now supported with JRuby as well.
Please take a look at the supported platform list here:
https://github.com/guard/guard-spork and here:
https://github.com/guard/guard

Here's my Gemfile entry: gem 'spork', '~> 0.9.0.rc'

I'm using spork, but not guard, BTW.

Cheers,
Vivek

Dmytrii Nagirniak

Dec 9, 2011, 1:12:11 AM
to ne...@googlegroups.com, Neo4j user discussions
On 09/12/2011, at 2:44 PM, Vivek Prahlad wrote:

> I would like to point out that spork does actually work with JRuby. From
> what I can see, guard and guard-spork are now supported with JRuby as well.

Obviously, it doesn't. I don't even get to guard. And of course not with neo4j.

This video shows everything (sped up 3x so you don't waste time watching JRuby start and me type).
I would love to see the same video where it actually works.

And this is the problem. I am wasting my time fixing weird issues with JRuby. All of that "just works" on MRI.

Anyway, I hope someday I'll be able to make it all work.

Vivek Prahlad

Dec 9, 2011, 1:45:52 AM
to ne...@googlegroups.com
This discussion is probably now off-topic for this list, but I think the issue is with your JRuby options. I'm using JRuby in 1.8 mode. You'll also need to turn on ObjectSpace support, as the warning in your console mentions. My JRUBY_OPTS is '-X+O'.

Please take a look at this discussion, for example: http://youtrack.aws.intellij.net/issue/RUBY-9129?query=assigned+to%3A+%7Bno+user%7D

I'm just pointing out that Spork does indeed work for me (have been using it for the past 3 months or so). If it's not working for you, then it's quite obvious that there must be differences in our settings? :-)

Please feel free to contact me off-list in case you need further help,

Thanks,
Vivek

Dmytrii Nagirniak

Dec 9, 2011, 2:43:07 AM
to ne...@googlegroups.com

On 09/12/2011, at 5:45 PM, Vivek Prahlad wrote:

> This discussion is probably now offtopic for this list, but I think the issue is with your JRuby options.

Yeah. Indeed. I'll join the JRuby mailing list for that.

But thanks for the help anyway.

Cheers.

Peter Neubauer

Dec 9, 2011, 3:55:23 AM
to ne...@googlegroups.com
Cool Dmytrii,
I think JRuby should really not be the point that drives you away from
Java based projects.

Cheers,

/peter neubauer

GTalk:      neubauer.peter
Skype       peter.neubauer
Phone       +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter      http://twitter.com/peterneubauer

brew install neo4j && neo4j start
heroku addons:add neo4j


espeed

Dec 9, 2011, 5:16:44 AM
to us...@lists.neo4j.org
On Thursday, December 8, 2011 2:48:59 AM UTC-6, Dmytrii Nagirniak wrote:

> Unfortunately I couldn't see a lot of value in the REST API either.
> The core operations that are taken for granted with native bindings
> (traversals using pure Ruby constructs) would each require executing an HTTP
> request (the equivalent of SELECT N+1 in the SQL world).
> Otherwise I would have to wrap all the logic in traversal queries, which
> would significantly overcomplicate the system with HTTP handling logic.


Hi Dmytrii -

Neo4j Server has a built-in Gremlin scripting engine that enables REST
clients to execute transactions in a single HTTP request.

Gremlin is a domain-specific language for graphs written in Groovy. If you
were using a relational database, you would use its domain-specific
language, which is SQL. Same idea.

For the last few weeks, I have been working on Bulbs 0.3, which is a Python
REST client for Neo4j Server, and it has a library of Gremlin templates in a
YAML file.

The Python methods do named variable substitution on the Gremlin templates
and then execute them via the Neo4j Server Gremlin extension.

Here's an example:

gremlin.yaml
https://gist.github.com/1450859

element.py
https://gist.github.com/1450871

You can see the create_indexed_vertex Gremlin script has JSON args. Python
lists and dicts are converted to JSON, and then on the server side, the
Gremlin script converts them into Groovy maps and lists.

Marko is working on adding the JSONSlurper library import to Gremlin so you
won't have to do the import each time
(https://github.com/tinkerpop/gremlin/issues/259).


> So I decided to write another REST library
> (http://github.com/dnagir/morpheus),
> but then gave up realising that you just cannot have a proper abstraction
> over HTTP.
> (I'll probably kill off that repo).

Groovy is pretty simple. Consider reviving Morpheus and using Gremlin for
scripting -- you'll get all the power of native Ruby and Neo4j without the
Java.

When Bulbs 0.3 is released, I'll post the full gremlin.yaml, and you should
be able to use it in Ruby without any mods since it's just YAML.

- James

--
View this message in context: http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-Feedback-after-evaluation-tp3569774p3572548.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.

Peter Neubauer

Dec 9, 2011, 5:30:06 AM
to ne...@googlegroups.com, Neo4j user discussions
Yes,
I agree. Basically, with Cypher for declarative and optimizable
queries, and Gremlin/Groovy for the power user or fine-tuned
traversals, the REST API could possibly be very minimalistic. Just my
2c.

Cheers,

/peter neubauer


Michael Hunger

Dec 9, 2011, 5:35:18 AM
to Neo4j user discussions
You should use native Gremlin params where they can be used. Otherwise you'll blow up the script engine in the plugin and lose lots of performance.

M

mobile mail please excuse brevity and typos

espeed

Dec 9, 2011, 5:44:41 AM
to us...@lists.neo4j.org

Michael Hunger wrote

>
> You should use native Gremlin params where they can be used. Otherwise
> you'll blow up the script engine in the plugin and lose lots of performance.
>

Hi Michael -

What do you mean exactly? After the JSON param converts to a map, everything
is a native Groovy param.

- James


Michael Hunger

Dec 9, 2011, 10:00:27 AM
to Neo4j user discussions
I understood you were just templating the params in there (string replacement) which would result in different groovy strings for every set of different parameters.

Is this correct?

Michael

espeed

Dec 9, 2011, 2:35:50 PM
to us...@lists.neo4j.org

Michael Hunger wrote

>
> I understood you were just templating the params in there (string
> replacement) which would result in different groovy strings for every set
> of different parameters. Is this correct?
>

The string replacement is done on the client side. When the script is
presented to the extension, it looks like a normal Gremlin script.

For example, the update_indexed_vertex YAML script has three params: nodeId,
properties, indexName. But all the string replacement is done by the client.

By the time the Gremlin extension gets it, the nodeId is an integer, which
is used to look up the node object:

node = g.getRawGraph().getNodeById(5)

The properties are a JSON string, which is immediately converted to a
Groovy map so it can be iterated over:

Map<String, Object> properties = slurper.parseText('{"age":35,"name":"James Thornton"}')

The indexName is a string, which is used to look up the actual index:

index = manager.forNodes('people')

Here is what the Gremlin extension actually sees:
https://gist.github.com/1452942

Do you see an issue with that?

- James


Michael Hunger

Dec 9, 2011, 5:03:42 PM
to Neo4j user discussions
Ouch :)

The Gremlin plugin (like the Cypher plugin) takes a map (a JSON map) as "params"; you can then refer to each entry anywhere in the Gremlin script (key == variable name, value == value).

So no need for JSONSlurper.

The real problem here lies in the fact that each of your statements (with different parameters) will cause the Groovy scripting engine to generate a new Groovy class, which will in turn
* cause PermGen OutOfMemory errors, as those classes can't be garbage collected as long as the script engine is around
* take about 100-1000 times longer to execute (parsing, class generation, loading and such).

For the first issue we have a work-around in the plugin which recreates the script-engine every 500 requests (should probably be configurable) but this is less than optimal.

The second problem will hit you all the time.

Michael

See: http://docs.neo4j.org/chunked/milestone/gremlin-plugin.html#rest-api-send-a-gremlin-script-with-variables-in-a-json-map
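
The difference between pasting values into the script and passing them as "params" can be sketched in Ruby. This is only an illustration under assumptions: the "script" and "params" keys follow the description above and the linked docs, the helper names `templated_request` and `parameterized_request` are hypothetical, and nothing is actually sent to a server.

```ruby
require "json"

# Templated on the client: the id is pasted into the text, so every id yields a new script.
def templated_request(node_id)
  { "script" => "g.getRawGraph().getNodeById(#{node_id})" }
end

# Parameterized: the script text never changes and the id travels in "params".
PARAM_SCRIPT = "g.getRawGraph().getNodeById(nodeId)".freeze

def parameterized_request(node_id)
  { "script" => PARAM_SCRIPT, "params" => { "nodeId" => node_id } }
end

ids = [5, 6, 7]

# Count how many distinct script texts the server would have to compile.
templated_scripts     = ids.map { |id| templated_request(id)["script"] }.uniq
parameterized_scripts = ids.map { |id| parameterized_request(id)["script"] }.uniq

puts "distinct templated scripts:     #{templated_scripts.size}"
puts "distinct parameterized scripts: #{parameterized_scripts.size}"

# The JSON body a client would send for the parameterized form.
puts JSON.generate(parameterized_request(5))
```

With three ids the templated form produces three different script texts, while the parameterized form always sends the same one, which is what lets the script engine reuse an already compiled script.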


espeed

Dec 9, 2011, 6:11:04 PM
to us...@lists.neo4j.org

Michael Hunger wrote

>
> The Gremlin plugin (like the Cypher plugin) takes a map (a JSON map) as
> "params"; you can then refer to each entry anywhere in the
> Gremlin script (key == variable name, value == value). So no need for
> JSONSlurper.
>

Ahh, that's different than Rexster's Gremlin extension
(https://github.com/tinkerpop/rexster/wiki/Gremlin-Extension). That makes
things easy.


Michael Hunger wrote


>
> The real problem here lies in the fact that each of your statements (with
> different parameters) will cause the Groovy scripting engine to generate a
> new Groovy class, which will in turn
> * cause PermGen OutOfMemory errors, as those classes can't be garbage collected as
> long as the script engine is around
> * take about 100-1000 times longer to execute (parsing, class
> generation, loading and such).
>
> For the first issue we have a work-around in the plugin which recreates
> the script-engine every 500 requests (should probably be configurable) but
> this is less than optimal. The second problem will hit you all the time.
>

What do you recommend for the second problem? And are you saying it will take
100-1000 times longer than normal Groovy, or 100-1000 times longer than Java?

Last week when I emailed you, you were looking into a way to store a custom,
server-side Gremlin library. Would that help?

- James


espeed

Dec 9, 2011, 7:26:56 PM
to us...@lists.neo4j.org
Michael -

If I understand you correctly, then this modified Gremlin script and request
format should solve "problem 2":

Gremlin Script:
https://gist.github.com/1453964

Script Engine REST Request that uses "params":
https://gist.github.com/1453966

- James


espeed

Dec 9, 2011, 9:13:13 PM
to us...@lists.neo4j.org
Michael -

What if each Gremlin script was scoped inside a Groovy function?

Example: https://gist.github.com/1454298

Would that help keep things clean?

- James


Dmytrii Nagirniak

Dec 9, 2011, 5:53:31 PM
to Neo4j user discussions

On 10/12/2011, at 9:03, Michael Hunger <michael...@neotechnology.com> wrote:
> For the first issue we have a work-around in the plugin which recreates the script-engine every 500 requests (should probably be configurable) but this is less than optimal.

That is just a ridiculous workaround to me! You drop everything that the engine carefully tried to compile and optimize! This defeats the point of the optimization.
Why not use a priority queue/list/cache and drop only the least-used entries?
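
A rough Ruby sketch of that idea, using a least-recently-used policy. The `LruCache` class and the `compile` lambda are hypothetical and only illustrate the eviction policy; whether the Groovy script engine allows dropping individual compiled scripts is a separate question.

```ruby
# Minimal LRU cache: Ruby hashes keep insertion order, so the first key is the oldest.
class LruCache
  def initialize(capacity)
    raise ArgumentError, "capacity must be positive" unless capacity.positive?

    @capacity = capacity
    @entries = {}
  end

  # Returns the cached value for key, computing it with the block on a miss.
  def fetch(key)
    if @entries.key?(key)
      value = @entries.delete(key)                   # removed here, re-inserted below as newest
    else
      value = yield(key)
      @entries.shift if @entries.size >= @capacity   # evict the least recently used entry
    end
    @entries[key] = value
  end

  def keys
    @entries.keys
  end
end

compiled = 0
cache = LruCache.new(2)

# Stand-in for an expensive compilation step; counts how often it runs.
compile = lambda do |script|
  compiled += 1
  "compiled(#{script})"
end

%w[a b a c b].each { |script| cache.fetch(script, &compile) }
puts "compilations: #{compiled}, cached: #{cache.keys.inspect}"
```

Only the entry that has gone unused the longest is dropped when the cache is full, so frequently used scripts keep their compiled form instead of being thrown away with everything else.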

Michael Hunger

Dec 9, 2011, 7:37:30 PM
to Neo4j user discussions
Exactly, looks good.

Thanks James

I looked again into the source code of the GroovyScriptEngine and so far there is no public way of removing older scripts. They also use a HashMap and not a LinkedHashMap with LRU enabled for storing them.

One solution I could think of is to have two script engines: one for one-off shots, which will be thrown away regularly, and another which will contain the scripts that have been used at least twice (or x times) and will stay around forever (or probably throw a PermGen OOM on usage).

The only other option would be to duplicate the GroovyScriptEngine functionality and handle everything ourselves. I don't think that's a suitable way to go.

I like neither, and I have no resources right now to update the GremlinPlugin. Feel free to fork, update, and issue a pull request, or at least open a GitHub issue.

Cheers

Michael

Michael Hunger

Dec 10, 2011, 10:52:13 AM
to Neo4j user discussions
Hmm I think that works,

but it won't help with the OOM, as groovy compares the script contents to check if it is the same script

Michael

Michael Hunger

Dec 9, 2011, 6:12:48 PM
to Neo4j user discussions
Because they are internal to the groovy script engine?

And they can't be garbage collected, as the engine is still around and holds handles to those classes.
Believe me, I would love it if there was another way.

Cheers

Michael

Dmytrii Nagirniak

Dec 9, 2011, 7:02:19 PM
to Neo4j user discussions
On 10/12/2011, at 10:12 AM, Michael Hunger wrote:

> And can't be garbage collected as it is still around and holds handles to those classes.
> Believe me I would love if there was another way.

It should really be addressed properly.
It just sounds like "it's too hard, so we'll just restart everything". Not the way to go.

I'm sure there are lots of issues, but dropping all the optimisations is just not good enough IMO.
