Search options


Ivan Sagalaev

May 11, 2007, 11:38:20 AM
to django...@googlegroups.com
Hello!

I'd like to ask everyone's opinion on implementing search
functionality in an app. The app is a forum that aims to be simple and
pluggable. Now I'm on a quest to pick the right solution for searching
and have gotten stuck.

My current thoughts and decision:

- Searching using "like" db queries is too simplistic and tends to get
slower as the data grows.
- Database-specific solutions (MySQL search, Postgres TSearch2) kill
portability.
- PyLucene is too large to work in-process (20 MB in memory). Also it
doesn't work with Python's threading (segfaulting the whole process on
import). A solution would be a dedicated PyLucene process.
- Xapian looks good but I didn't actually try it yet. I've heard though
that it doesn't implement locking of index database and this should be
done manually. Not rocket science, but it complicates the solution a bit.
I've also seen a recommendation to run it in a dedicated process.
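
To make the first point concrete, here is a minimal sketch of what a "like" search amounts to, using Python's built-in sqlite3 module. The `post` table, its columns, and the `like_search` helper are hypothetical names made up for illustration, not part of the forum app. A pattern with a leading wildcard generally cannot use an ordinary index, so the database scans every row, which is one likely reason such queries slow down as the table grows.

```python
import sqlite3


def like_search(conn, term):
    """Return ids of posts whose body contains `term` as a substring."""
    # Escape LIKE wildcards so user input is matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT id FROM post WHERE body LIKE ? ESCAPE '\\' ORDER BY id",
        ("%" + escaped + "%",),
    )
    return [row[0] for row in rows]


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO post (id, body) VALUES (?, ?)",
    [
        (1, "Xapian locking question"),
        (2, "PyLucene and threads"),
        (3, "100% portable search"),
    ],
)

print(like_search(conn, "lucene"))
```

Note that this is plain substring matching: no stemming, no ranking, no notion of word boundaries, which is the "too simplistic" part.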

So my questions are:

- Am I doomed to have a separate server? This complicates things a lot
and I'm very much inclined to use something in-process.
- Are there any solutions on a scale between simplistic "likes" and
sophisticated indexers like Lucene?

Jeremy Dunck

May 11, 2007, 11:52:54 AM
to django...@googlegroups.com
On 5/11/07, Ivan Sagalaev <Man...@softwaremaniacs.org> wrote:
>
> - Am I doomed to have a separate server? This complicates things a lot
> and I'm very much inclined to use something in-process.

Probably. :)

> - Are there any solutions on a scale between simplistic "likes" and
> sophisticated indexers like Lucene?

http://www.osreviews.net/reviews/misc/hyperestraier
http://cheeseshop.python.org/pypi/estraiernative/0.2

http://swish-e.org/
http://cheeseshop.python.org/pypi/Swish-E/0.5

Joseph Heck

May 11, 2007, 1:57:08 PM
to django...@googlegroups.com
We determined that Postgres was portable enough to any platform we'd
host on, went with TSearch2, and have been pretty happy. Having
done hard-core search work in a previous life (www.singingfish.com), I
know it isn't everything you can get in the search world, but it was
sufficient for our needs. I personally feel that TSearch2 falls very
nicely between the simplistic "like"+wildcard SQL statements and a
sophisticated indexing engine like Lucene.

If you're willing to go "search server", you might even consider Solr
(a Lucene-based search server with a web API). It becomes especially
appealing if you scale out your front ends (the Django app servers)
horizontally in a large environment. How many front ends you have
actually becomes something to seriously consider, because the likes of
PyLucene, Xapian, and others all have search-related indices that then
need to be kept up to date and available to the searcher processes.
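
As a rough illustration of the "web API" point, the sketch below only builds the URL that an app server would request from a Solr instance; it does not contact any server. The host name and the `build_search_url` helper are hypothetical, and the `/select` path with the `q`, `rows`, and `wt` parameters is assumed to match a default Solr setup, so check it against your own installation.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, rows=10):
    """Build a Solr-style select URL for `query` (no request is sent)."""
    # q = query string, rows = max results, wt = response format.
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return base_url.rstrip("/") + "/select?" + params


# Hypothetical search host shared by all front ends.
url = build_search_url("http://search.example.com:8983/solr", "xapian locking")
print(url)
```

Because every front end talks to the same HTTP endpoint, none of them needs a local copy of the index, which is what makes this setup attractive when there are many app servers.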

ol...@survex.com

May 11, 2007, 2:46:50 PM
to Django users
On May 11, 4:38 pm, Ivan Sagalaev wrote:
> - Xapian looks good but I didn't actually try it yet. I've heard though
> that it doesn't implement locking of index database and this should be
> done manually.

You've heard incorrect information then, since Xapian most definitely
does implement database locking.

Cheers,
Olly
