Gracefully handling null values with properties()

Tim Schultz

Apr 21, 2021, 12:05:33 AM
to Gremlin-users
Hi all,

I'm having an issue with vertices that contain NULL values in cases where my data loader logic didn't handle them properly. Calls like valueMap(), elementMap(), etc. all fail with a NullPointerException.

I'm at the point where I have millions of vertices and it's not feasible to find and fix each instance in the graph where a NULL value is present.

In particular, a query that is heavily used on my system is also failing at the properties() step:

g.V().has(..., ...).outE().has(T.label, within(..., ..., ...)).inV().dedup().local(properties().group().by(key()).by(value())).collect().groupBy{it.type}

Before I fix these loading issues, blast away this huge graph, and reload it, is there any way I can gracefully handle null values in properties()?

The coalesce/constant pattern works well when using project, but I need to use properties() in this case to handle different types of nodes with different k/v pairs.

Thanks for the help in advance!

Regards,

Tim

Bassem Naguib

Apr 21, 2021, 12:38:53 AM
to Gremlin-users
Hi Tim,

Gremlin does not allow null property values. How did you manage to load graph data with null property values? And which database are you using?

Stephen Mallette

Apr 21, 2021, 5:48:11 AM
to gremli...@googlegroups.com
Gremlin will have support for null traversers in 3.5.0, which should be released shortly. Note that the intention of this upcoming feature is more about being able to reference null within a traversal than about requiring graph databases to store null values internally. Whether graphs store null is entirely up to them.

Tim Schultz

Apr 21, 2021, 9:16:57 AM
to Gremlin-users
Thanks for the reply!  

I should have mentioned - I am using Goblin (https://goblin.readthedocs.io/en/latest/) to load my data, as the ETL scripts are all in Python.

Every (String) property defaults to an empty string "". My guess is that for the vertex in the console output below, parsing out "journal" for this specific entry returned None. Based on your reply, I'm guessing this should never happen.
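
On the loader side, one way to rule out None values is to normalize the parsed properties before they are handed to the graph. A minimal sketch in plain Python, assuming the parsed properties are available as a dict. The helper name clean_properties and the sample record are hypothetical and not part of Goblin:

```python
from typing import Any, Dict, Optional


def clean_properties(raw: Dict[str, Any], default: Optional[str] = "") -> Dict[str, Any]:
    """Return a copy of raw with None values replaced by default.

    If default is None, keys whose value is None are dropped instead,
    so that property is simply never written to the vertex.
    """
    cleaned = {}
    for key, value in raw.items():
        if value is None:
            if default is None:
                continue  # skip the key entirely
            value = default
        cleaned[key] = value
    return cleaned


# Hypothetical parsed record in which 'journal' came back as None.
record = {"uri": "PMID:19546251", "journal": None}
print(clean_properties(record))                # {'uri': 'PMID:19546251', 'journal': ''}
print(clean_properties(record, default=None))  # {'uri': 'PMID:19546251'}
```

Dropping the key rather than writing a placeholder keeps the graph free of sentinel values, at the cost of having to handle missing keys at query time.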

There is metadata attached to the node, however.  (This is not a "ghost node").

gremlin> g.V(268480736).project('x').by('uri')
==>[x:PMID:19546251]
gremlin> g.V(268480736).project('x').by('journal')
The property does not exist as the key has no associated value for the provided element: v[268480736]:journal
gremlin> g.V(268480736).elementMap()
java.lang.NullPointerException

Is there a way to iterate through every node and either 1.) remove the problematic ones along with their associated edges or 2.) "fix" these values by replacing them with a constant?

Thanks for the help!

Tim Schultz

Apr 21, 2021, 9:22:41 AM
to Gremlin-users
Any thoughts on how to iterate through all nodes to identify which of them are causing these issues?  I'm guessing they would need to be removed completely along with their associated edges.
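
One possible approach to identifying the affected nodes is to probe each vertex id separately and record the ones whose probe throws. A rough sketch in plain Python, assuming the ids can be listed and that a probe function runs a failing call such as valueMap() for a single id. The names find_failing_ids and fake_probe are hypothetical, and the probe here is a stand-in rather than a real graph call:

```python
from typing import Callable, Dict, Iterable


def find_failing_ids(ids: Iterable[int], probe: Callable[[int], object]) -> Dict[int, str]:
    """Run probe(id) for each id and collect the ids whose probe raises."""
    failures = {}
    for vertex_id in ids:
        try:
            probe(vertex_id)
        except Exception as exc:  # deliberately broad: any failure marks the vertex as suspect
            failures[vertex_id] = f"{type(exc).__name__}: {exc}"
    return failures


# Stand-in probe for demonstration only; a real one would query the graph
# for the given id and let the server-side error propagate.
def fake_probe(vertex_id: int) -> dict:
    if vertex_id == 268480736:
        raise RuntimeError("NullPointerException")
    return {"uri": "ok"}


print(find_failing_ids([1, 268480736, 2], fake_probe))
# {268480736: 'RuntimeError: NullPointerException'}
```

Probing one id at a time will be slow on a graph with millions of vertices, so in practice the ids would probably need to be processed in batches or in parallel.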


Thanks for the help!

Tim Schultz

Apr 21, 2021, 9:47:33 AM
to Gremlin-users
Ok - so I think I figured out what's causing this (although how to fix it is not clear).

In the previous example, the 'journal' field is one I never explicitly added to an index in my schema (I guess I overlooked that field). The value of the field is NOT null (my ETL script is in fact checking for nulls before committing to the graph); it's likely an empty string.

SOME nodes have a value in the 'journal' field and work without issue. The ones that do not have a value fail with a NullPointerException (although their values should be empty strings).

My understanding is that if you do not explicitly specify how a field should be indexed, the graph database (JanusGraph in this case) will determine the value type and default to a standard index, but perhaps this is platform-specific.

I can replicate this with one other field for another node type. It's the same issue: the field was not explicitly added to an index in my schema.

I'm wondering what an optimal fix is for the nodes that have already been added to the graph.

Thanks!

Stephen Mallette

Apr 21, 2021, 10:22:30 AM
to gremli...@googlegroups.com
I don't understand why you get a NullPointerException doing an elementMap() - what's the rest of that stack trace?

Tim Schultz

Apr 21, 2021, 10:48:06 AM
to Gremlin-users
gremlin> g.V(2543768).valueMap()
java.lang.NullPointerException
Type ':help' or ':h' for help.
Display stack trace? [yN]y
java.lang.NullPointerException
        at org.janusgraph.graphdb.database.EdgeSerializer.parseRelation(EdgeSerializer.java:127)
        at org.janusgraph.graphdb.database.EdgeSerializer.readRelation(EdgeSerializer.java:73)
        at org.janusgraph.graphdb.transaction.RelationConstructor.readRelation(RelationConstructor.java:70)
        at org.janusgraph.graphdb.transaction.RelationConstructor$1.next(RelationConstructor.java:57)
        at org.janusgraph.graphdb.transaction.RelationConstructor$1.next(RelationConstructor.java:45)
        at org.apache.tinkerpop.gremlin.process.traversal.step.map.PropertyMapStep.map(PropertyMapStep.java:100)
        at org.apache.tinkerpop.gremlin.process.traversal.step.map.PropertyMapStep.map(PropertyMapStep.java:52)
        at org.apache.tinkerpop.gremlin.process.traversal.step.map.MapStep.processNextStart(MapStep.java:37)
        at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
        at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:197)
        at org.apache.tinkerpop.gremlin.console.Console$_closure3.doCall(Console.groovy:255)
        at sun.reflect.GeneratedMethodAccessor66.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:101)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
        at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:263)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1041)
        at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:37)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:127)
        at org.codehaus.groovy.tools.shell.Groovysh.setLastResult(Groovysh.groovy:463)
        at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.runtime.callsite.PlainObjectMetaMethodSite.doInvoke(PlainObjectMetaMethodSite.java:43)
        at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce.invoke(PogoMetaMethodSite.java:190)
        at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.callCurrent(PogoMetaMethodSite.java:58)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:168)
        at org.codehaus.groovy.tools.shell.Groovysh.execute(Groovysh.groovy:201)
        at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.super$3$execute(GremlinGroovysh.groovy)
        at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:101)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1217)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:144)
        at org.apache.tinkerpop.gremlin.console.GremlinGroovysh.execute(GremlinGroovysh.groovy:83)
        at org.codehaus.groovy.tools.shell.Shell.leftShift(Shell.groovy:120)
        at org.codehaus.groovy.tools.shell.Shell$leftShift$1.call(Unknown Source)
        at org.codehaus.groovy.tools.shell.ShellRunner.work(ShellRunner.groovy:93)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$work(InteractiveShellRunner.groovy)
        at sun.reflect.GeneratedMethodAccessor62.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:101)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1217)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:144)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:164)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.work(InteractiveShellRunner.groovy:138)
        at sun.reflect.GeneratedMethodAccessor61.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.runtime.callsite.PlainObjectMetaMethodSite.doInvoke(PlainObjectMetaMethodSite.java:43)
        at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce.invoke(PogoMetaMethodSite.java:190)
        at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.callCurrent(PogoMetaMethodSite.java:58)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:160)
        at org.codehaus.groovy.tools.shell.ShellRunner.run(ShellRunner.groovy:57)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.super$2$run(InteractiveShellRunner.groovy)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:101)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1217)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuperN(ScriptBytecodeAdapter.java:144)
        at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.invokeMethodOnSuper0(ScriptBytecodeAdapter.java:164)
        at org.codehaus.groovy.tools.shell.InteractiveShellRunner.run(InteractiveShellRunner.groovy:97)
        at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:234)
        at org.apache.tinkerpop.gremlin.console.Console.<init>(Console.groovy:168)
        at org.codehaus.groovy.vmplugin.v7.IndyInterface.selectMethod(IndyInterface.java:234)
        at org.apache.tinkerpop.gremlin.console.Console.main(Console.groovy:502)

Tim Schultz

Apr 21, 2021, 12:03:14 PM
to Gremlin-users
I'm wondering whether, since I didn't explicitly add the key to an index, it is being set as .unique() by default, so that all subsequent empty string values break that constraint?

So there may be one node with journal = '' and the rest are effectively "broken"?

Stephen Mallette

Apr 22, 2021, 11:16:11 AM
to gremli...@googlegroups.com
This looks like more of a JanusGraph issue given the stack trace. I'm not sure offhand how best to find and fix these issues. Perhaps you should take this question to the JanusGraph mailing lists if you don't find much traction here.

Rajkumar Sangule

Apr 2, 2023, 2:58:58 AM
to Gremlin-users
g.V(268480736).project('x').by(coalesce(values('journal'),constant('null')))
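
When several keys need the same coalesce/constant treatment, the step text can be generated rather than typed by hand. A small sketch in Python, since the ETL scripts in this thread are Python. The helper name project_with_defaults is hypothetical, it only builds a string for pasting into the Gremlin console, and it assumes the keys are trusted, simple identifiers (no quoting or escaping is done). Whether this pattern avoids the NullPointerException on the affected JanusGraph vertices is not confirmed in this thread:

```python
from typing import Iterable


def project_with_defaults(keys: Iterable[str], default: str = "null") -> str:
    """Build the text of a project() step with one coalesce()/constant() modulator per key."""
    keys = list(keys)
    if not keys:
        raise ValueError("at least one property key is required")
    # Keys are inserted verbatim, so they must not contain quote characters.
    quoted = ", ".join(f"'{k}'" for k in keys)
    by_steps = "".join(
        f".by(coalesce(values('{k}'), constant('{default}')))" for k in keys
    )
    return f"project({quoted}){by_steps}"


print("g.V(268480736)." + project_with_defaults(["uri", "journal"]))
# g.V(268480736).project('uri', 'journal').by(coalesce(values('uri'), constant('null'))).by(coalesce(values('journal'), constant('null')))
```

This only helps when the set of keys is known up front. It does not replace properties() for node types whose keys differ.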