startsWith and endsWith


Guy Ellis

Feb 15, 2017, 1:44:45 PM
to Gremlin-users
I'm trying to work out how to get the following to work:

g.V().has(<prop-name>, startsWith/endsWith(<some-text>))

So if I have:

graph.addVertex('myname', 'The quick brown fox JUMPS oVeR the lazy dog');

I would expect these to match:

g.V().has('TEXT_key_prop', textContainsRegex('^The quick'))
g.V().has('TEXT_key_prop', textContainsRegex('^The'))

I'm using Titan with Elasticsearch.

Because there are so many combinations of index types and query predicates, I wrote a script
that generates a bunch of Groovy (in this gist) to test as many permutations of the query
as possible, and I can't find any that behave like startsWith or endsWith.

Daniel Kuppitz

Feb 15, 2017, 3:10:34 PM
to gremli...@googlegroups.com
Hi Guy,

I believe ^ and $ are always added by Titan. This should work for you:

g.V().has('TEXT_key_prop', textContainsRegex('The.*')) // startsWith
g.V().has('TEXT_key_prop', textContainsRegex('.*dog')) // endsWith

Cheers,
Daniel
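
Daniel's point about implicit anchors can be checked outside the graph. The sketch below is plain Java, not Titan code. It assumes the predicate behaves like a whole-input match (the way `String.matches` does), which is an assumption about Titan rather than something confirmed in this thread; the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

class AnchoredMatch {
    // Whole-input match: behaves as if the pattern were wrapped in ^...$.
    public static boolean fullMatch(String regex, String text) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    // Substring search: succeeds if the pattern matches anywhere in the text.
    public static boolean findAnywhere(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "The quick brown fox JUMPS oVeR the lazy dog";
        System.out.println(fullMatch("^The quick", text));    // false: the rest of the text is unmatched
        System.out.println(findAnywhere("^The quick", text)); // true: a prefix search is enough here
        System.out.println(fullMatch("The.*", text));         // true: startsWith
        System.out.println(fullMatch(".*dog", text));         // true: endsWith
    }
}
```

Under whole-input matching, a pattern such as `^The quick` fails because nothing consumes the remainder of the value, which is why the trailing or leading `.*` is needed.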


--
You received this message because you are subscribed to the Google Groups "Gremlin-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gremlin-users+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/gremlin-users/38e91396-e711-4196-a408-053e9dc49ace%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Guy Ellis

Feb 16, 2017, 9:21:19 AM
to Gremlin-users
Hi Daniel - thank you! That gets me a lot closer.

What I'm looking for is a case insensitive match on any number of characters. If we isolate the problem to Starts With and have the phrase "The quick brown fox ..." in the graph and the terms that we want to check against are:

1. The.*
2. The quick.*
3. the.*
4. the quick.*
5. brown.*

Terms 1 through 4 should all match because the match should be case insensitive and spaces should have no impact. Term 5 should fail to match because the phrase does not start with "brown".
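
The expectation above can be pinned down as a small reference check. The Java sketch below only states the desired semantics (a case-insensitive prefix test on the whole phrase); it is not a Titan or Elasticsearch query, and the class and method names are hypothetical.

```java
class StartsWithExpectation {
    // Case-insensitive prefix test on the whole phrase.
    public static boolean startsWithIgnoreCase(String text, String prefix) {
        return text.regionMatches(true, 0, prefix, 0, prefix.length());
    }

    public static void main(String[] args) {
        String phrase = "The quick brown fox JUMPS oVeR the lazy dog";
        // The five terms from the list above, without the trailing ".*".
        String[] prefixes = {"The", "The quick", "the", "the quick", "brown"};
        for (int i = 0; i < prefixes.length; i++) {
            // Terms 1 to 4 print true, term 5 prints false.
            System.out.println((i + 1) + ". " + prefixes[i] + " -> "
                    + startsWithIgnoreCase(phrase, prefixes[i]));
        }
    }
}
```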

The match results I am seeing for the 3 types of index with textContainsRegex are:

TEXT: 1,3
STRING: 1
TEXTSTRING: 1,3

and with textRegex:

TEXT: 3,4
STRING: 3,4
TEXTSTRING: 3,4

So textRegex handles a term that contains a space but is not case insensitive,
while textContainsRegex is case insensitive but will not match a term that contains a space.

Is there a solution to this?
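
One possible reading of the space behaviour, assuming textContainsRegex tests the regex against each word of the text separately while textRegex tests it against the whole value: a term containing a space can never match a single word. The Java sketch below models only that distinction. Splitting on whitespace is a stand-in for whatever analyzer the index really uses, case handling is deliberately left out, and the class and method names are hypothetical. It does not attempt to reproduce the full result table above.

```java
import java.util.regex.Pattern;

class WordVersusWholeMatch {
    // Models a word-level predicate: true if at least one
    // whitespace-separated word fully matches the regex.
    public static boolean anyWordMatches(String regex, String text) {
        Pattern p = Pattern.compile(regex);
        for (String word : text.trim().split("\\s+")) {
            if (p.matcher(word).matches()) {
                return true;
            }
        }
        return false;
    }

    // Models a whole-value predicate: the regex must match the entire text.
    public static boolean wholeTextMatches(String regex, String text) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    public static void main(String[] args) {
        String text = "The quick brown fox JUMPS oVeR the lazy dog";
        System.out.println(anyWordMatches("The.*", text));         // true: the word "The" matches
        System.out.println(anyWordMatches("The quick.*", text));   // false: no single word contains a space
        System.out.println(wholeTextMatches("The quick.*", text)); // true: the whole value is matched
    }
}
```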


Daniel Kuppitz

Feb 16, 2017, 9:52:37 AM
to gremli...@googlegroups.com
Hmm, perhaps try textRegex("(?i)the.*").

Cheers,
Daniel
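
For reference, `(?i)` is the embedded case-insensitive flag in `java.util.regex`. The sketch below shows its effect in plain Java with whole-input matching; whether a given index backend honours the flag depends on that backend's regex dialect, which is not established here. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class CaseInsensitiveFlag {
    // Whole-input match, the same way String.matches treats a pattern.
    public static boolean fullMatch(String regex, String text) {
        return Pattern.matches(regex, text);
    }

    public static void main(String[] args) {
        String text = "The quick brown fox JUMPS oVeR the lazy dog";
        System.out.println(fullMatch("the quick.*", text));     // false: case differs on "T"
        System.out.println(fullMatch("(?i)the quick.*", text)); // true: startsWith, ignoring case
        System.out.println(fullMatch("(?i).*DOG", text));       // true: endsWith, ignoring case
        System.out.println(fullMatch("(?i)brown.*", text));     // false: the text does not start with "brown"
    }
}
```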



HadoopMarc

Feb 16, 2017, 10:33:18 AM
to Gremlin-users
Hi Guy,

Doesn't Elasticsearch keep a case-insensitive index, that is, a lowercased one? That would mean that only terms 3 and 4 are expected to give a result.

Cheers,    Marc


Guy Ellis

Feb 16, 2017, 10:40:15 AM
to Gremlin-users
Thanks Daniel - Yes that works with the TEXT index only. I think that I can work with that for now. Thank you!

Guy Ellis

Feb 16, 2017, 10:44:55 AM
to Gremlin-users
Marc - I thought that ES was case insensitive by default, including for the search term you provide, i.e. it shouldn't matter what case you pass in. I could be wrong and will research this further. Daniel's suggestion of putting the case insensitivity into the regex worked.