Slow Performance?

Yissachar Radcliffe

Feb 27, 2013, 4:48:48 PM
to lucene-a...@googlegroups.com
I've been playing around with this library to see if it is suitable for my project and one big issue I am running up against is the performance.

I have around 9 million documents that I need to index, and it takes me 2.5 hours to index everything on the dev appserver. For comparison, using regular Lucene to index the same data on my filesystem takes 3 minutes. I will only need to reindex this data periodically and push to AppEngine so this is not a huge issue, but it would be nice to have better speed.
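To put the indexing numbers above side by side, here is a small back-of-the-envelope calculation. It only uses the figures quoted in the post (9 million documents, 2.5 hours, 3 minutes). The variable and function names are made up for illustration, and the results are rough since the quoted figures are themselves approximate.

```python
# Figures quoted in the post above (all approximate).
DOCS = 9_000_000
LAE_SECONDS = 2.5 * 60 * 60   # 2.5 hours on the dev appserver
LUCENE_SECONDS = 3 * 60       # 3 minutes with regular Lucene on the filesystem


def docs_per_second(docs, seconds):
    """Average indexing throughput in documents per second."""
    return docs / seconds


lae_rate = docs_per_second(DOCS, LAE_SECONDS)
lucene_rate = docs_per_second(DOCS, LUCENE_SECONDS)
slowdown = LAE_SECONDS / LUCENE_SECONDS

print(f"LAE (dev appserver): {lae_rate:.0f} docs/s")
print(f"Lucene (filesystem): {lucene_rate:.0f} docs/s")
print(f"Slowdown factor:     {slowdown:.0f}x")
```

With those figures, that works out to roughly 1,000 docs/s versus 50,000 docs/s, a factor of about 50.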

The more important performance issue I am worried about is the searching speed. A search on one field (multiple words + wildcard at the end) usually takes 300-500 ms to complete if it is a simple search (1 word + wildcard) and can sometimes take 1+ seconds if using multiple words as the search string. For comparison, Lucene on my filesystem completes the same search in 50-100 ms. Normally I would consider a 500 ms search on 9 million documents very good, but this is for autocomplete, so processing time really needs to be less than 200 ms in order to be useful (as there will already be some extra time lost to latency).
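One way to check a search call against the 200 ms autocomplete budget mentioned above is to time it over several runs and compare the median to the budget. The following is only a minimal sketch: `fake_prefix_search` and `TITLES` are hypothetical stand-ins for the real search call (LAE or plain Lucene), and `measure_ms` and `within_budget` are illustrative helper names, not part of any library.

```python
import statistics
import time

BUDGET_MS = 200.0  # autocomplete budget quoted in the post


def measure_ms(search, query, runs=20):
    """Return the median wall-clock time of search(query) in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        search(query)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


def within_budget(elapsed_ms, budget_ms=BUDGET_MS):
    """True if the measured time is strictly under the budget."""
    return elapsed_ms < budget_ms


# Hypothetical stand-in for the real search call; replace with the actual query.
TITLES = ["lucene in action", "lucene appengine", "app engine docs"]


def fake_prefix_search(query):
    """Toy prefix match over an in-memory list (assumption, not LAE code)."""
    return [t for t in TITLES if t.startswith(query)]


median_ms = measure_ms(fake_prefix_search, "lucene")
print(f"median {median_ms:.3f} ms, within budget: {within_budget(median_ms)}")
```

Using the median rather than a single run keeps one slow request from skewing the comparison. By this check, the 300-500 ms figures reported above fall outside the budget, while the 50-100 ms filesystem figures fall inside it.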

Is there anything that can be done to improve the performance?

Fabio Grucci

Mar 6, 2013, 9:39:19 AM
to lucene-a...@googlegroups.com

Hello Yissachar, I'm aware of the performance issues and I'm constantly working on them, but it's not so simple. I'm considering adding Memcached support to this project, but it doesn't bring much of a performance advantage, and in some scenarios everything becomes slightly slower than without a cache. Unfortunately, working with App Engine is not quick, and every little step has to be tested again and again; this is the price of a highly scalable platform like App Engine. Regrettably, I must say that LAE can never reach the performance that Lucene reaches on a file system, because App Engine has some limitations, for instance on the use of threads. However, there is still scope for improvement in LAE, and that's what I'm working on.

Right now I'm also working on improving LAE performance, in particular for every index search operation.

Thank you for using Lucene AppEngine!

Fabio Grucci

Mar 8, 2013, 6:10:20 AM
to lucene-a...@googlegroups.com
Hello Yissachar,

I'm close to releasing a new version with some performance improvements (it should be available in the Maven repository in less than two weeks, under a more permissive license). See the attachment if you want to try a preview...
lucene-appengine-4.1.0-SNAPSHOT.jar

Yissachar Radcliffe

Mar 18, 2013, 11:18:47 AM
to lucene-a...@googlegroups.com
Sorry for the late reply - it looks like I turned off my email notifications by mistake.

For the time being I've moved to a Lucene index hosted outside of App Engine, but I will re-test with the LAE version when I have a chance.

I understand the general AppEngine issues and can empathize - I've run up against some of my own. Due to the performance issues, LAE might not work for my current project, but there are definitely tons of cases where the convenience of having Lucene directly in AppEngine will win over a slight performance degradation.

Thanks for the update and for changing the license to be more permissive!