Re: Gremlin Script Engine Workarounds

110 views
Skip to first unread message

Michael Hunger

unread,
Dec 12, 2011, 7:30:20 AM12/12/11
to James Thornton, ne...@googlegroups.com
Yep, both are correct, and the map is not only growing unbounded (which is not an issue) but holds onto the classes generated by the script engine so that they cannot be gc'ed. And also as long as there is just one ClassLoader used for all the classes, none of them can be collected anyway.

See here: http://zeroturnaround.com/blog/rjc201/

So a solution would be to use a size-limited LRU cache with weak references and a class-loader per generated class inside of the GroovyScriptEngine.

Probably put up an issue on the github repo but I'm not sure how likely it is that those are fixed, also there is no way of "removing" a script from that cache (regardless of the gc-classloader leak issues).

Cheers

Michael

Am 12.12.2011 um 12:36 schrieb James Thornton:

> Hey Michael -
>
> Thanks for the info. I looked at the GSE source to try and understand
> the problem better. Please let me know if these are the problem areas
> you are referring to...
>
>
>> But it doesn't really help us as new scripts always result in new classes,
>> even when just a single character is changed.
>
> Is this what you mean when you say, "new scripts always result in new classes"?
>
> https://github.com/russel/groovy/blob/master/src/main/org/codehaus/groovy/jsr223/GroovyScriptEngineImpl.java#L109
>
>
>> And he didn't look into the GSE implementation - where this scriptCache resides :(
>
> And is this the scriptCache (classMap) that's causing the problem by
> growing unbounded?:
>
> https://github.com/russel/groovy/blob/master/src/main/org/codehaus/groovy/jsr223/GroovyScriptEngineImpl.java#L338
>
> Thanks.
>
> - James

James Thornton

unread,
Dec 12, 2011, 5:44:03 PM12/12/11
to ne...@googlegroups.com, James Thornton
I created a ticket for it here:


- James

Michael Hunger

unread,
Dec 12, 2011, 5:54:37 PM12/12/11
to ne...@googlegroups.com, James Thornton
Thanks !

Michael

blackdrag

unread,
Dec 13, 2011, 12:00:43 PM12/13/11
to Neo4j
Hi,

I think instead of having James as communication bridge in between you
and me (on the Groovy side) and potentially others, I thought I come
here to see if we cannot find a suitable solution - also because I
think James cannot answer all questions I have.

First the disclaimer, that I know a bit about neo4j, but never looked
into it in more detail and never worked with it. So for example I
don't know if you actually have a special reason to use the JSR223
engine here. If you could describe your requirements for the script
engine a little, I could think about what is the best solution.

How about it Michael?

bye blackdrag

On 12 Dez., 23:54, Michael Hunger <michael.hun...@neotechnology.com>
wrote:

Michael Hunger

unread,
Dec 13, 2011, 1:47:12 PM12/13/11
to ne...@googlegroups.com
Hi blackdragon,

sure, would love that.

Here is the scenario.

The tinkerpop stack offers a lower level graph traversal implementation called Pipes and higher level DSL's in different languages

Gremlin in Groovy was first.
We're including a server extension in neo4j-server that accepts gremlin strings as post data and executes them with the GroovyScriptEngine
(actually a GremlinScriptEngine but that is just a simple subclass).

The problem arises if users generate a lot of different scripts (> 500 is enough for medium sized heaps - 1G) which are all stored and hold onto by the script
engine. Our workaround so far are
* killing the script-engine every 500 requests. Disadvantage is that you loose all precompiled classes that are executed more often than once
* encouraging people to use groovy/gremlin variables which will reduce the # of different scripts instead of inlining them into the script code

I see 3 Problems there:

* cache is just a hashmap, not a weak-hashmap so it holds on to the scripts and script-classes
* cache is not a bound LRU cache so that older entries are not evicted
* there is a probably single classloader used for the classes so due to the classloader<->class bidirectional reference as long as the classloader is still used it can't be collected and also its classes

In the future we'd like to provide dynamic language server extension that are probably JSR223 based (but probably that comes with too many disadvantages).

Hope that explains our use-case.

Cheers

Michael

Jochen Theodorou

unread,
Dec 13, 2011, 7:37:35 PM12/13/11
to ne...@googlegroups.com
Am 13.12.2011 19:47, schrieb Michael Hunger:
> Hi blackdragon,
>
> sure, would love that.
>
> Here is the scenario.
>
> The tinkerpop stack offers a lower level graph traversal
> implementation called Pipes and higher level DSL's in different
> languages
>
> Gremlin in Groovy was first. We're including a server extension in
> neo4j-server that accepts gremlin strings as post data and executes
> them with the GroovyScriptEngine (actually a GremlinScriptEngine but
> that is just a simple subclass).

If we - in Groovy - talk about GroovyScriptEngine we mean the one in
groovy.util, you mean the jsr 223 engine. And I guess you use that one
because you wan to support multiple languages through the jsr 223
interface. The problem with that interface is, that it does bind your
hands quite a lot in terms of configuration.

> The problem arises if users generate a lot of different scripts (>
> 500 is enough for medium sized heaps - 1G) which are all stored and
> hold onto by the script engine. Our workaround so far are * killing
> the script-engine every 500 requests. Disadvantage is that you loose
> all precompiled classes that are executed more often than once *
> encouraging people to use groovy/gremlin variables which will reduce
> the # of different scripts instead of inlining them into the script
> code

so on one hand you want to hold on to scripts, on the other you don't
want too many.

> I see 3 Problems there:
>
> * cache is just a hashmap, not a weak-hashmap so it holds on to the
> scripts and script-classes
> * cache is not a bound LRU cache so that older entries are not evicted
> * there is a probably single classloader used for the classes so due
> to the classloader<->class bidirectional reference as long as the
> classloader is still used it can't be collected and also its classes

WeakHashMap is not the right structure, in a WeakHashMap the keys are
weak, the classes are stored as values. But we have a map for that,
since we use something like that to store meta classes. As for the
loader, GroovyClassLoader uses a new sub loader for each compilation, so
that part is no problem. But that doesn't help as long as something
holds onto the classes of course.

> In the future we'd like to provide dynamic language server extension
> that are probably JSR223 based (but probably that comes with too many
> disadvantages).
>
> Hope that explains our use-case.

I see. So you don't insist on jsr223. That is good ;) If I give you a
program using GroovyClassLoader or something like that, combined with a
LRU cache, would that be then ok for you?

bye blackdrag

--
Jochen "blackdrag" Theodorou - Groovy Project Tech Lead
blog: http://blackdragsview.blogspot.com/
german groovy discussion newsgroup: de.comp.lang.misc
For Groovy programming sources visit http://groovy-lang.org

Michael Hunger

unread,
Dec 13, 2011, 7:48:42 PM12/13/11
to ne...@googlegroups.com
Hi Jochen,

Thanks a lot,

Sounds good, the current code is here:

https://github.com/neo4j/cypher-plugin

the gremlin-script-engine comes from the tinkerpop-stack:

https://github.com/tinkerpop/gremlin

Cheers

Michael

James Thornton

unread,
Dec 13, 2011, 7:51:51 PM12/13/11
to ne...@googlegroups.com


On Tuesday, December 13, 2011 6:48:42 PM UTC-6, Michael Hunger wrote:

Sounds good, the current code is here: https://github.com/neo4j/cypher-plugin

Thanks Jochen. 

Michael, you mean the current code is here?: https://github.com/neo4j/gremlin-plugin

- James
 

Michael Hunger

unread,
Dec 13, 2011, 8:20:35 PM12/13/11
to ne...@googlegroups.com
The code that uses the script engine, yes.

Ouch, right sorry, wrong completion in browser tab bar :)

Cool that you figured it out.

Michael
Reply all
Reply to author
Forward
0 new messages