Is Whoosh dead? Can anyone help? Is there a live fork?


Geraldo Xexeo

Nov 17, 2022, 1:12:48 PM
to Whoosh

It seems that Whoosh has stalled at 2.7.4, and 3.0 never appeared, as far as I can tell from the messages here.

The Bitbucket repository is dead.

What can be done? Any volunteers to bring 2.7.4 at least to a maintained state?

Jerry

dles...@gmail.com

Nov 17, 2022, 6:28:55 PM
to Whoosh
Because Whoosh development was already stalled in 2018, and its original developer (Matt Chaput) was not responding to inquiries, a few developers cloned it to whoosh-community/whoosh on GitHub, and even migrated and triaged the issues from BitBucket. The developers were quite active for a while. Matt Chaput then showed up in #544 and expressed his intent to work in the new repository as well. I think this was great news to everyone. But apparently he had trouble merging a new development branch he had been working on. He later created a parallel mchaput/whoosh repository on GitHub, perhaps as a temporary measure when the BitBucket repo became unavailable, but AFAIK no explanation was communicated. It is also unclear (at least to me) what Matt Chaput's plans are with the new branch, and where any contributions to the current branch would stand with respect to that new branch. From there, things seemed to be left in a state of ambiguity and lack of communication. At this moment, people appear to be posting issues to either repository, possibly not knowing which is the main one, and in neither repository are issues really being attended to.

I am under the impression that no one wants to commit efforts to the whoosh-community/whoosh repo because of the current uncertainties. Matt Chaput did amazing work with Whoosh, and it looks like no one would want to do anything that would interfere with his plans whatever they are. It seems to me that just a tiny bit of direction and communication from Matt Chaput might motivate a community of developers to pick up where things were left in 2020.

I hope I'm not misrepresenting any facts or hurting anyone's sensibilities. I'm a Whoosh user and enthusiast, and this is my understanding of the situation solely from reading this forum and GitHub issues!

David.

Michael Avrukin

Nov 20, 2022, 1:00:13 PM
to Whoosh
This is the sense that I'm getting as well from reading the lists.

I'd like to propose that the community coalesce behind one branch. If Matt isn't being responsive, then as a community we need to take a step forward; this is an important library that has value to organizations. We at Verifas (I am the co-founder) would like to rely on this library for our core tech. We look forward to a library that has at least some level of community support, and I think we'd be happy to step in and sponsor that if necessary, but there has to be commitment around one branch.

Michael

enidegayalew

Nov 20, 2022, 1:08:30 PM
to who...@googlegroups.com
OK, what is your reason for using this library?


--
You received this message because you are subscribed to the Google Groups "Whoosh" group.
To unsubscribe from this group and stop receiving emails from it, send an email to whoosh+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/whoosh/85dfe112-793c-41a8-a56f-4c0cb49b988dn%40googlegroups.com.

Michael Avrukin

Nov 20, 2022, 2:05:52 PM
to who...@googlegroups.com
Our stack is entirely Python, and we are indexing lots of short text snippets for searching. We could bring up or purchase an Elastic instance, but it winds up being more overhead than we are interested in dealing with at the moment (we tried the Elastic route). Whoosh gives us the simplicity of a tight Python integration, and our dataset at the moment isn't too big and fits entirely in memory.

Michael


Roger Binns

Nov 21, 2022, 11:02:31 AM
to who...@googlegroups.com
On Sun, 20 Nov 2022, at 11:05, Michael Avrukin wrote:
> Our stack is entirely python, we are indexing lots of short text
> snippets for searching.

Have you looked into the full text search functionality in SQLite?

https://www.sqlite.org/fts5.html

I've used whoosh in the past because I needed to control the scoring when doing a query. BM25 is not very good when the text is a word or two, and I needed to include other factors.

Roger
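
For readers who want to try the SQLite suggestion, here is a minimal sketch using Python's built-in sqlite3 module. It assumes the bundled SQLite was compiled with FTS5 (common, but not guaranteed). The table name `snippets`, the column `body`, the sample texts, and the `search` helper are made up for illustration. It shows phrase and term queries ranked by FTS5's built-in `bm25()` function, which is the scoring Roger refers to above.

```python
import sqlite3

# In-memory database; assumes this SQLite build includes the FTS5 extension.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE snippets USING fts5(body)")

# Hypothetical sample documents.
docs = [
    "Whoosh is a pure Python search library",
    "SQLite ships with a full text search extension",
    "Python search tools range from pure libraries to search servers",
]
conn.executemany("INSERT INTO snippets(body) VALUES (?)", [(d,) for d in docs])


def search(query):
    """Return (rowid, body, score) rows for an FTS5 query, best match first.

    bm25() yields lower values for better matches, so ascending order
    puts the most relevant row first.
    """
    cur = conn.execute(
        "SELECT rowid, body, bm25(snippets) FROM snippets "
        "WHERE snippets MATCH ? ORDER BY bm25(snippets)",
        (query,),
    )
    return cur.fetchall()


# Double quotes inside the query string request an exact phrase.
phrase_hits = search('"pure python"')

# Bare terms are combined with an implicit AND.
term_hits = search("python search")

for rowid, body, score in term_hits:
    print(rowid, round(score, 3), body)
```

`bm25()` accepts optional per-column weights, but any ranking that mixes in factors from outside the index has to be computed by the caller after fetching the rows.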

Michael Avrukin

Nov 21, 2022, 11:30:02 AM
to who...@googlegroups.com
That's right, the simple SQLite matching / searching won't work for us. The text is more like 2-5 paragraphs (maybe), and we do need to be able to do phrase matching, not just single terms, along with a pluggable scoring model for additional data we know about the documents.

My other approach is to, well, invest in bringing up an Elastic instance or a dedicated Java Lucene server on the side; neither is currently appealing.

Michael


Geraldo Xexeo

Nov 21, 2022, 1:51:57 PM
to Whoosh
As it is, Whoosh is known to have problems beyond a certain index size, and it is also reported to become too slow.
There are other alternatives, such as Sphinx, but it seems that all of them will require the same effort, or more, as setting up a server with a REST service built on Lucene, through either Elasticsearch or Solr.
Also, a "handmade" server application in Java would work. Lucene is easy to use in Java.
I have never been able to install PyLucene on Windows, but maybe it is easier on Linux.
That's why we need Whoosh back!

Jerry


enidegayalew

Nov 22, 2022, 7:59:10 AM
to who...@googlegroups.com
OK, good. I have to use Whoosh for question answering, to index my text file.


David Lowry-Duda

Nov 22, 2022, 9:59:25 AM
to who...@googlegroups.com
On Tue, Nov 22, 2022 at 03:59:00PM +0300, enidegayalew wrote:
>ok good, i have to use whoosh for question answering , index my text
>file

Whoosh allows a pretty simple setup and is written in pure python. It
seems very stable (I've been using it without much change for many
years). As a fast-enough, good-enough search system, whoosh is great.

- DLD