index.version_in doesn't work for me


Helmut Jarausch

Sep 16, 2012, 5:51:50 AM
to who...@googlegroups.com
Hi,

As far as I understand, whoosh.index.version_in(<Directory>) returns the
Whoosh release version and the format version of the index in <Directory>.

Unfortunately this fails for me (or am I missing something?)


#!/usr/bin/python3

import re
import whoosh.index

IndexDir = '/home/jarausch/Bib_NDev/Wh_Index'

if whoosh.index.exists_in(IndexDir):
    # Expected result: (release version, index format version)
    (release_version_in, format_version_in) = whoosh.index.version_in(IndexDir)
    print("{0}\t\t{1}".format(release_version_in, format_version_in))
else:
    print("No (valid) index in {}".format(IndexDir))

"""
Traceback (most recent call last):
  File "Index_Version.py", line 9, in <module>
    (release_version_in,format_version_in)= whoosh.index.version_in(IndexDir)
  File "/usr/lib64/python3.2/site-packages/whoosh/index.py", line 181, in version_in
    return version(storage, indexname=indexname)
  File "/usr/lib64/python3.2/site-packages/whoosh/index.py", line 209, in version
    return (ix.release, ix.version)
AttributeError: 'FileIndex' object has no attribute 'release'
"""



This is with
changeset: 1320:d9d9aa561d1f
tag: tip
user: Matt Chaput <ma...@whoosh.ca>
date: Fri Sep 14 15:05:05 2012 -0400


Many thanks for a hint,
Helmut.

Matt Chaput

Sep 19, 2012, 3:54:33 PM
to who...@googlegroups.com
On 16/09/2012 5:51 AM, Helmut Jarausch wrote:
> Hi,
>
> As far as I understand, whoosh.index.version_in(<Directory>) returns the
> Whoosh release version and the format version of the index in <Directory>.
>
> Unfortunately this fails for me (or what am I missing?)

Sorry, fixed now :)


Helmut Jarausch

Sep 20, 2012, 3:33:49 AM
to who...@googlegroups.com
On 09/19/2012 09:54:33 PM, Matt Chaput wrote:
> Sorry, fixed now :)

Hi Matt,

the current documentation still says:

    whoosh.index._index_version
    The version number of the index format which this version of Whoosh writes

I can no longer find this in today's sources.

How can I find out which index version Whoosh would create?

Furthermore, is there any API to get some statistics about an index
like the number of records being indexed?

Thanks for this wonderful Whoosh,
Helmut.

Matt Chaput

Sep 24, 2012, 1:43:13 PM
to who...@googlegroups.com
Hi, sorry, I lost track of this email.

> How can I find out which index version Whoosh would create?

Can you tell me why you want to know? I'd prefer to keep this as an
implementation detail subject to change. I think including it in the
result of index.version() was a mistake on my part.

> Furthermore, is there any API to get some statistics about an index
> like the number of records being indexed?

The Searcher and IndexReader APIs have lots of methods to get collection
and term statistics (most of the methods on Searcher are just copied
from an underlying IndexReader).

# Number of documents in the index, deleted or undeleted
searcher.doc_count_all()

# Number of undeleted documents
searcher.doc_count()

See the docs here:

http://whoosh.readthedocs.org/en/latest/api/reading.html
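
For readers following along, here is a minimal sketch of collecting those two counts into a small summary. The helper names doc_stats and index_doc_stats are made up for this illustration and are not part of Whoosh; the sketch only relies on searcher.doc_count(), searcher.doc_count_all(), whoosh.index.open_dir() and Index.searcher().

```python
def doc_stats(searcher):
    """Summarize document counts from a Searcher-like object.

    Only doc_count_all() and doc_count() are called, so any object
    providing those two methods will work.
    """
    total = searcher.doc_count_all()  # deleted or undeleted
    live = searcher.doc_count()       # undeleted only
    return {"total": total, "live": live, "deleted": total - live}


def index_doc_stats(indexdir):
    """Open the index in `indexdir` and return its document counts."""
    # Imported lazily so doc_stats() can be used without Whoosh installed.
    import whoosh.index

    ix = whoosh.index.open_dir(indexdir)
    try:
        with ix.searcher() as searcher:
            return doc_stats(searcher)
    finally:
        ix.close()
```

Calling index_doc_stats() with the path of an existing index directory should return a dict with the keys "total", "live" and "deleted".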

Cheers,

Matt

Helmut Jarausch

Sep 24, 2012, 2:13:13 PM
to who...@googlegroups.com
On 09/24/2012 07:43:13 PM, Matt Chaput wrote:
> Hi, sorry, I lost track of this email.
>
>> How can I find out which index version Whoosh would create?
>
> Can you tell me why you want to know? I'd prefer to keep this as
> an implementation detail subject to change. I think including it in
> the result of index.version() was a mistake on my part.

I'd like to compare the version the existing index was built with
(obtained via whoosh.index.version_in) against the version the
installed Whoosh would write, so that I can recommend rebuilding
the index at the next opportunity.



>> Furthermore, is there any API to get some statistics about an index
>> like the number of records being indexed?
>
> The Searcher and IndexReader APIs have lots of methods to get
> collection and term statistics (most of the methods on Searcher are
> just copied from an underlying IndexReader).
>
> # Number of documents in the index, deleted or undeleted
> searcher.doc_count_all()
>
> # Number of undeleted documents
> searcher.doc_count()
>
> See the docs here:
>
> http://whoosh.readthedocs.org/en/latest/api/reading.html
>

Thanks for that,

cheers,
Helmut.

Matt Chaput

Sep 24, 2012, 3:28:12 PM
to who...@googlegroups.com
> I'd like to compare the version that the current index was built
> with (using whoosh.index.version_in) with the version the current
> Whoosh would use for recommending rebuilding of the index at the
> next opportunity.

For special cases like the Whoosh 2 -> Whoosh 3 transition, where even
though W3 should be backwards-compatible you might want to recommend
reindexing to take advantage of improvements, I would recommend using
the actual Whoosh version number:

    from whoosh import index

    release, _ = index.version_in(indexdir)
    if release[0] < 3:
        # Recommend re-indexing
        print("Consider re-indexing this index")
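
The same check could be wrapped in a small helper, roughly like the sketch below. The names recommend_reindex and check_directory are hypothetical, and treating a missing release number as "re-index" is an assumption of this sketch, not documented Whoosh behaviour.

```python
def recommend_reindex(release, min_major=3):
    """Return True if the index was written by a release older than min_major.

    `release` is the first element of the tuple returned by
    whoosh.index.version_in(), e.g. (2, 4, 1).
    """
    if release is None:
        # Assumption: an unknown release is treated as "too old".
        return True
    return release[0] < min_major


def check_directory(indexdir):
    """Return True/False for an existing index, or None if there is none."""
    # Imported lazily so recommend_reindex() works without Whoosh installed.
    import whoosh.index

    if not whoosh.index.exists_in(indexdir):
        return None
    release, _ = whoosh.index.version_in(indexdir)
    return recommend_reindex(release)
```

Keeping the comparison in its own function means the threshold can be changed in one place when a later major release comes along.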

If you're just worried about whether the current version will work with
the index at all, however, you should just try to open the index and
catch the IndexVersionError exception. I just checked in a change to the
documentation for version() and version_in() to say:

"""
You should avoid attaching significance to the second number (the index
version). This is simply a version number for the TOC file and probably
should not have been exposed in a public interface. The best way to
check if the current version of Whoosh can open an index is to actually
try to open it and see if it raises a ``whoosh.index.IndexVersionError``
exception.
"""

Sorry about the confusion!

Cheers,

Matt


Helmut Jarausch

Sep 25, 2012, 3:23:02 AM
to who...@googlegroups.com