My database right now is MySQL so I can't implement the Postgres
tsearch2 either.
Does anyone have any suggestions / implementations that they have used
and liked?
Thanks!
Why didn't you like pylucene? I only worked with (Java) Lucene but this was a
very pleasant experience. Since that project I really prefer using an external
search engine instead of getting a much stronger dependency on a specific database.
fs
IIRC pylucene has its own threading implementation that does not work
well with built-in threads used by CherryPy. Google would tell you more
about this and they may have solved that issue since then...
- Sylvain
Has no one used full-text search with Turbogears yet?
I would have used MySQL's full-text searching but I would really like
the stemming feature, which I don't think it has. I prefer external
search engines as well, because I agree that dependence on a specific
database is a bit too rigid for my tastes.
So far it's worked out nicely and without all the bother of using
PyLucene (which was my first choice) which I tried to get working with
TG a long while ago and failed due to the aforementioned threading
issues.
Lee
--
Lee McFadden
blog: http://www.splee.co.uk
work: http://fireflisystems.com
skype: fireflisystems
I've had good luck with (hype)estraier. I'd recommend doing any
writes to the index (when updating it) in a separate process because
of blocking issues (or implement something yourself using something
like [1])
Alberto
-Jeff
If you can run Postgres on another machine, you could still make use of
Fozzy:
http://microapps.sourceforge.net/fozzy/
Which is basically a simple REST wrapper on tsearch2.
(One caveat is that I haven't gotten around to porting it to TG
1.0. It shouldn't be hard to do, but it does need to be done if you
don't want to go through the trouble of getting TG 0.8.9 running).
--
anders pearson : http://www.columbia.edu/~anders/
C C N M T L : http://www.ccnmtl.columbia.edu/
weblog : http://thraxil.org/
I had a good experience with Xapian. Since it is not thread safe, I
wrote an XML-RPC server with Twisted Python that handles the search and
returns the results. The TurboGears application can call the XML-RPC
server using the standard xmlrpclib.
I can post the code if interested.
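
For anyone who wants the general shape of that arrangement before the real code is posted, here is a minimal sketch. It is not the poster's implementation: it assumes Python 3's standard library xmlrpc.server in place of Twisted (xmlrpclib is xmlrpc.client there), and the DOCUMENTS dict and the toy search() matcher are hypothetical stand-ins for a real Xapian database and query.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

# Hypothetical stand-in for the real index; a Xapian-backed service
# would open its database here instead.
DOCUMENTS = {
    1: "TurboGears full-text search with Xapian",
    2: "CherryPy threading notes",
}


def search(query):
    """Return ids of documents containing every query term (toy matcher)."""
    terms = query.lower().split()
    return sorted(
        doc_id
        for doc_id, text in DOCUMENTS.items()
        if all(term in text.lower() for term in terms)
    )


def start_search_server(host="127.0.0.1", port=0):
    """Start the XML-RPC server in a background thread; return (server, url)."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(search, "search")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    actual_host, actual_port = server.server_address
    return server, "http://%s:%d/" % (actual_host, actual_port)


def remote_search(url, query):
    """What a controller in the web application would call."""
    with ServerProxy(url) as proxy:
        return proxy.search(query)
```

SimpleXMLRPCServer handles one request at a time, so the search library is only ever touched from a single thread, which is the point of putting it behind a separate service. In practice the server would run as its own process rather than a thread inside the web application.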
On Jan 24, 7:21 am, anders pearson <and...@columbia.edu> wrote:
> On 2007-01-23 02:17:36 -0000, chiangf wrote:
>
> > My database right now is MySQL so I can't implement the Postgres
> > tsearch2 either.
>
> If you can run Postgres on another machine, you could still make use of
> Fozzy:
>
> http://microapps.sourceforge.net/fozzy/
>
> Which is basically a simple REST wrapper on tsearch2.
>
> (One caveat is that I haven't gotten around to porting it to TG
> 1.0. It shouldn't be hard to do, but it does need to be done if you
> don't want to go through the trouble of getting TG 0.8.9 running).
Frank
There was a basic text search in Docudo...

(looks at code)
It looks like a homegrown solution, as it doesn't seem to import
anything but sqlobject, the model.py file for Docudo, and the time
module.
Looks fairly simple, but I had tested it out a few times and it seemed
to work well. About 7 functions and a list of stop words in under 200
lines of code. Very nice. Of course this was for a specific
application where we knew everything worth indexing would be in the
database, and how it would be stored, but it seems it's not a huge
task to "roll your own" (depending on your application).
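
To give a sense of how little code a "roll your own" index needs, here is a minimal sketch. It is not the Docudo code (which is unavailable); the stop-word list, the function names, and the dict-of-sets index are all assumptions made for illustration, and a real version would read its documents from the database.

```python
import re
from collections import defaultdict

# Illustrative stop-word list only; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}


def tokenize(text):
    """Lowercase the text, split on non-alphanumerics, drop stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every non-stop word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

There is no stemming or ranking here; it only does exact word matching with AND semantics, which may be enough when you control what gets indexed.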
Since the Docudo SVN server is MIA, I can't run "svn blame", but it'd
be a good guess that Ronald Jaramillo wrote it (he wrote almost all
the more involved bits of Docudo).
Kevin Horn
I just want to say that I have a working implementation of PyLucene
that works with TG and CherryPy. I am using it on my own blog, but
other than that it is not widely tested yet. It also currently only
supports English stemming, etc.
I am planning on releasing it soon as TurboLucene, just as soon as I
extract it from my project and generalize it.
Basically, the way it sorts out the whole threading issue (without
monkey-patching CherryPy) is to create a separate indexer thread and
search-thread-factory thread at initialization time and then just send
them messages telling them what you want. It actually works quite
well and is wrapped up in a simple and clean interface.
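
The message-passing idea can be sketched roughly as follows. This is not TurboLucene's code: it assumes a single worker thread handling both indexing and searching (rather than a separate indexer and search-thread factory), and the IndexWorker class and its dict "index" are hypothetical stand-ins for the real search library.

```python
import queue
import threading


class IndexWorker:
    """Owns the (hypothetical) index; all access happens on one thread."""

    def __init__(self):
        self._requests = queue.Queue()
        self._index = {}  # stand-in for a real index: doc_id -> text
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Only this thread ever touches self._index.
        while True:
            action, payload, reply = self._requests.get()
            if action == "stop":
                reply.put(None)
                return
            if action == "add":
                doc_id, text = payload
                self._index[doc_id] = text.lower()
                reply.put(None)
            elif action == "search":
                term = payload.lower()
                hits = sorted(d for d, t in self._index.items() if term in t)
                reply.put(hits)

    def _call(self, action, payload=None):
        # Send a message and block until the worker answers.
        reply = queue.Queue(maxsize=1)
        self._requests.put((action, payload, reply))
        return reply.get()

    def add(self, doc_id, text):
        return self._call("add", (doc_id, text))

    def search(self, term):
        return self._call("search", term)

    def stop(self):
        # The worker must not be used again after this returns.
        return self._call("stop")
```

Request-handling threads only call add() and search(), so the web framework's threads never touch the index directly.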
Anyway, I expect to release it in the next couple weeks, but if anyone
wants to see the code sooner, feel free to e-mail me.
Hope this helps,
Krys
Really awesome! Looking forward to using it :-)
fs
> hype
Couldn't make it compile (I think)
> pylucene
Barfs on CP threads
> merquery
Again wouldn't compile I think
> , etc.), but it seemed that either no one could get
> it working (hype), didn't like it (pylucene), or it hasn't actually
> started yet (merquery).
I suggest you bite the bullet and use MySQL FULLTEXT search with
handwritten SQL. I stupidly set my app up with InnoDB tables, which don't
support FTS; in the end I used PyLucene and a simple XML-RPC server to
access it. Ridiculous.
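
For reference, handwritten FULLTEXT SQL through a DB-API cursor might look roughly like this. It is a sketch under assumptions: the article table, its id, title and body columns, and the search_articles helper are hypothetical, a FULLTEXT index on (title, body) must already exist, and the %s placeholders assume a driver with the "format" parameter style such as MySQLdb. (FULLTEXT indexes were MyISAM-only at the time; InnoDB gained them in MySQL 5.6.)

```python
# Hypothetical schema: table "article" with a FULLTEXT index on (title, body).
SEARCH_SQL = (
    "SELECT id, title, "
    "MATCH (title, body) AGAINST (%s) AS score "
    "FROM article "
    "WHERE MATCH (title, body) AGAINST (%s) "
    "ORDER BY score DESC LIMIT %s"
)


def search_articles(cursor, terms, limit=20):
    """Run a natural-language FULLTEXT search and return the matching rows."""
    # The search terms appear twice: once for the score, once for the filter.
    cursor.execute(SEARCH_SQL, (terms, terms, limit))
    return cursor.fetchall()
```

Passing the terms as parameters rather than formatting them into the string leaves the quoting to the driver.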
-Rob
You can just easy_install TurboLucene to get it.
The website is http://dev.krys.ca/turbolucene/.
It still needs a lot of work, but it is functional. Docs are still to
come, but the source is well commented and there really is not a lot
to it.
This is my first Open Source project, so I would really appreciate
feedback and suggestions. (Go easy on me!) :-D
Anyway, I hope someone finds it useful.
Enjoy!
Krys
Anay
http://www.thesamet.com/blog/2007/02/04/pumping-up-your-applications-with-xapian-full-text-search/
On Jan 26, 6:48 am, "Kevin Horn" <kevin.h...@gmail.com> wrote:
> There was a basic text search in Docudo...