Lucene query with many OR clauses throws a stack overflow exception


Li Li

May 31, 2012, 10:27:45 PM
to ne...@googlegroups.com
Hi,
I have this query: START n=node:districtIndex(\"+(id:3 id:4 id:5 id:6 ..........) +type1:district.
It must match nodes whose type is district and whose id is in a given set. Expressed in SQL: type1='district' and id in(3,4,5,.....)
If there are many ids (I use 100), it throws this exception:
2012-5-31 19:38:46 com.sun.jersey.spi.container.ContainerResponse mapMappableContainerException
Severe: The exception contained within MappableContainerException could not be mapped to a response, re-throwing to the HTTP container
java.lang.StackOverflowError
    at java.util.regex.Pattern$8.isSatisfiedBy(Pattern.java:4783)
    at java.util.regex.Pattern$CharProperty.match(Pattern.java:3345)
    at java.util.regex.Pattern$Branch.match(Pattern.java:4114)
    at java.util.regex.Pattern$GroupHead.match(Pattern.java:4168)
    at java.util.regex.Pattern$Loop.match(Pattern.java:4295)
    at java.util.regex.Pattern$GroupTail.match(Pattern.java:4227)
    at java.util.regex.Pattern$BranchConn.match(Pattern.java:4078)
    at java.util.regex.Pattern$CharProperty.match(Pattern.java:3345)
    at java.util.regex.Pattern$Branch.match(Pattern.java:4114)
    at java.util.regex.Pattern$GroupHead.match(Pattern.java:4168)
    ...............................

From what I know of Lucene, it is not designed for searching on many terms. It is designed for queries with a few words, where each word occurs in many documents. For ranking purposes, it records how many terms are matched in a disjunction query (OR query), but for neo4j that is just overhead.
One solution is to use filters, e.g. group all these ids into a keyword filter. That would be much faster.
But I don't know how to express a filter in neo4j. Could anyone help?

Michael Hunger

Jun 1, 2012, 2:52:29 AM
to ne...@googlegroups.com
There was a discussion about that with Aseem Kishore on GitHub; I don't know if it helps you.

https://github.com/neo4j/community/issues/494

Michael

Li Li

Jun 1, 2012, 3:06:02 AM
to ne...@googlegroups.com
Thanks. Using a parameter solves this exception.
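A minimal sketch of the parameter approach, assuming the query is sent to the REST Cypher endpoint as a JSON body with a `query` and a `params` field. The parameter name `q` and the helpers `build_lucene_query` and `build_payload` are made up for illustration; the index and field names come from the query above. The long Lucene string travels as a parameter value, so the Cypher text itself stays short and fixed:

```python
import json

def build_lucene_query(ids, type_value="district"):
    """Build the Lucene query string: the ids OR-ed together, AND-ed with the type."""
    id_clauses = " ".join("id:%d" % i for i in ids)
    return "+(%s) +type1:%s" % (id_clauses, type_value)

def build_payload(ids):
    """Build the request body: short Cypher text plus the long query as a parameter."""
    return {
        "query": "START n=node:districtIndex({q}) RETURN n",
        "params": {"q": build_lucene_query(ids)},
    }

payload = build_payload(range(3, 103))  # 100 ids, as in the failing case
body = json.dumps(payload)              # JSON text to POST to the Cypher endpoint
```

Only the value of `q` grows with the number of ids; the Cypher statement that the server has to parse stays the same.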
But I think Lucene is not good at this kind of thing, i.e. a disjunction query over many terms.
I have read the code of Lucene's boolean query. Its usual use case is a few terms (usually fewer than 10), each with a very long posting list. Using a filter would make it a bit better, but I don't think it's a perfect solution. We don't need a score here; we just want to know which docs are matched.
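To illustrate the difference, here is a toy model. It is not Lucene's implementation: the small in-memory `postings` dictionary and both function names are invented for this sketch. A scored disjunction has to track, per document, how many terms matched, while a filter only needs the set of matching documents:

```python
from collections import Counter

# Toy inverted index: term -> list of ids of the documents containing it.
postings = {
    "id:3": [0],
    "id:4": [1],
    "id:5": [2],
    "type1:district": [0, 1, 3],
}

def scored_disjunction(terms):
    """OR query with scoring: count how many terms matched each document."""
    matches = Counter()
    for term in terms:
        for doc in postings.get(term, []):
            matches[doc] += 1
    return matches

def filter_docs(terms):
    """Filter: only record which documents matched, with no per-document score."""
    docs = set()
    for term in terms:
        docs.update(postings.get(term, []))
    return docs

id_terms = ["id:3", "id:4", "id:5"]
# Documents that have one of the ids AND the type district, without any scoring.
result = filter_docs(id_terms) & filter_docs(["type1:district"])
```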

Michael Hunger

Jun 1, 2012, 3:14:37 AM
to ne...@googlegroups.com
What is the use case of looking up that many users at once?

Perhaps a different model or query would help to avoid the Lucene query in the first place.

Michael

Li Li

Jun 1, 2012, 4:27:46 AM
to ne...@googlegroups.com
I use neo4j to integrate many data sources. I use an id property to record the original database id.
When a source database changes, it notifies neo4j, so I need to find these nodes in neo4j and update something.
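One way to keep each lookup small in this scenario is to split the changed ids into batches and run one index query per batch. This is only a sketch: the batch size of 20 and the helper names `chunks` and `batch_queries` are arbitrary choices for illustration, not something neo4j prescribes.

```python
def chunks(ids, size):
    """Yield consecutive slices of ids with at most `size` elements each."""
    ids = list(ids)
    for start in range(0, len(ids), size):
        yield ids[start:start + size]

def batch_queries(ids, size=20, type_value="district"):
    """Build one Lucene query string per batch of ids."""
    return [
        "+(%s) +type1:%s" % (" ".join("id:%d" % i for i in batch), type_value)
        for batch in chunks(ids, size)
    ]

queries = batch_queries(range(3, 103))  # 100 changed ids become 5 queries of 20 ids
```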
