IOError in writer.commit()


emil

Mar 13, 2012, 4:15:34 AM
to Whoosh
Hi, I am pretty new to Whoosh. I am trying to index some documents in
a directory. When I call commit() I get an IOError. I have included some
extra output below to make it easier to follow.

Enter the index directory name:: myindex
enter the source of documents:: /home/emil/workspace/python/project/index/documents
indexing....
ALLAboutSpyware.txt
ConvertingMoviesToPspFormat.txt
BacktrackingEMAILMessages.txt
BitTorrentTutorials.txt
ABasicGuidetotheInternet.txt
Traceback (most recent call last):
  File "index.py", line 31, in <module>
    writer.commit()
  File "/usr/local/lib/python2.7/dist-packages/whoosh/filedb/filewriting.py", line 534, in commit
    self.generation, self.segment_number, new_segments)
  File "/usr/local/lib/python2.7/dist-packages/whoosh/filedb/fileindex.py", line 99, in _write_toc
    stream = storage.create_file(tempfilename)
  File "/usr/local/lib/python2.7/dist-packages/whoosh/filedb/filestore.py", line 79, in create_file
    fileobj = open(path, mode)
IOError: [Errno 2] No such file or directory: 'myindex/_MAIN_1.toc.1331625897.72'

This is my code.

from whoosh.fields import Schema, TEXT, KEYWORD, ID, STORED
from whoosh.analysis import StemmingAnalyzer
from whoosh import index
import os,os.path
import codecs

schema = Schema(path=ID(unique=True,stored=True),content=TEXT)
dir = raw_input('Enter the index directory name:: ')
#print dir
if not os.path.exists(dir):
    print 'creating dir', dir, '...'
    os.mkdir(dir)

myindex = index.create_in(dir,schema)
writer = myindex.writer()
doc_source_path = str(raw_input('enter the source of documents:: '))
#os.chdir("/home/emil/workspace/python/project/index/documents")
os.chdir(doc_source_path)
print 'indexing....'
for file in os.listdir("."):
    #print file
    filename = "/home/emil/workspace/python/project/index/documents/" + str(file)
    fileobj=open(filename,'rb')
    text=fileobj.read()
    #f = codecs.open(filename, 'r', encoding='utf-8')
    #body = f.read()
    #print body
    print unicode(file)
    writer.add_document(path=unicode(filename),content=unicode(text))

writer.commit()

Can you please tell me what the reason for this is?


Chris Wilson

Mar 13, 2012, 10:17:00 AM
to Whoosh
Hi Emil,

On Tue, 13 Mar 2012, emil wrote:

> Hi, I am pretty new to whoosh. I was trying to index some documents in a
> directory. When I do commit() I get an IOError. I am displaying some
> extra outputs too to make it readable.
>
> Enter the index directory name:: myindex
> enter the source of documents:: /home/emil/workspace/python/project/index/documents

You supply a relative path to the index directory; and then you chdir(),
which makes that path invalid:

> os.chdir(doc_source_path)

If you have to allow entering a relative path on the command line, use
os.path.join with os.getcwd() to convert it to an absolute path before
handing it to Whoosh.
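
A minimal sketch of the problem described above, using only the standard library rather than Whoosh itself. The temporary directories and the function name are made up for illustration; "myindex" stands in for the index directory typed at the prompt. It assumes the relative name is joined onto os.getcwd() before the chdir() happens.

```python
import os
import shutil
import tempfile


def demo_relative_path_after_chdir():
    """Return (relative_still_valid, absolute_still_valid) after a chdir()."""
    start_dir = os.getcwd()
    launch_dir = tempfile.mkdtemp()  # stands in for where the script is started
    docs_dir = tempfile.mkdtemp()    # stands in for the documents directory
    try:
        os.chdir(launch_dir)
        rel_index = "myindex"
        os.mkdir(rel_index)
        # Build the absolute path while the cwd is still the launch directory.
        abs_index = os.path.join(os.getcwd(), rel_index)

        os.chdir(docs_dir)
        # "myindex" is now looked up inside docs_dir, where it does not exist.
        rel_ok = os.path.isdir(rel_index)
        abs_ok = os.path.isdir(abs_index)
        return rel_ok, abs_ok
    finally:
        os.chdir(start_dir)
        shutil.rmtree(launch_dir)
        shutil.rmtree(docs_dir)


rel_ok, abs_ok = demo_relative_path_after_chdir()
print("relative path still valid: %s" % rel_ok)
print("absolute path still valid: %s" % abs_ok)
```

In the script from the question, the same idea would mean computing the absolute index path before index.create_in() is called, so that the later commit() no longer depends on the current directory.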

Cheers, Chris.
--
Aptivate | http://www.aptivate.org | Phone: +44 1223 967 838
Future Business, Cam City FC, Milton Rd, Cambridge, CB4 1UY, UK

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.

emil joswin

Mar 13, 2012, 1:38:29 PM
to who...@googlegroups.com

Thanks a lot Chris. :)

clach04

Mar 13, 2012, 6:38:17 PM
to Whoosh

On Mar 13, 6:17 am, Chris Wilson <ch...@aptivate.org> wrote:
> If you have to allow entering a relative path on the command line, use
> os.path.join to convert it to an absolute path before handing it to
> Whoosh.

The other option is to simply call os.path.abspath() on the input path
to get an absolute path. Obviously, do this before calling
chdir ;-).
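
A small sketch of why the ordering matters, again with only the standard library. os.path.abspath() resolves a relative name against whatever the current directory is at the moment of the call, so the same name gives a different result after chdir(). The helper name and the temporary directories are hypothetical.

```python
import os
import shutil
import tempfile


def abspath_before_and_after_chdir(rel_name):
    """Resolve rel_name with os.path.abspath() before and after a chdir()."""
    start_dir = os.getcwd()
    # realpath() so the comparison is not confused by symlinked temp dirs.
    launch_dir = os.path.realpath(tempfile.mkdtemp())
    docs_dir = os.path.realpath(tempfile.mkdtemp())
    try:
        os.chdir(launch_dir)
        before = os.path.abspath(rel_name)  # resolved against launch_dir
        os.chdir(docs_dir)
        after = os.path.abspath(rel_name)   # resolved against docs_dir
        return launch_dir, docs_dir, before, after
    finally:
        os.chdir(start_dir)
        shutil.rmtree(launch_dir)
        shutil.rmtree(docs_dir)


launch_dir, docs_dir, before, after = abspath_before_and_after_chdir("myindex")
print("same directory: %s" % (before == after))
```

Only the value computed before the chdir() points at the directory the user meant, which is why the call belongs right after the index directory name is read.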

I have a (very) small demo app at http://code.google.com/p/pyopensearch/
which may be worth checking out as it covers indexing and searching as
a complete working example (don't be put off by the JSON stuff, you
can ignore it). I try and keep it small so that it is easy to grok. It
doesn't attempt to cover updating the index.

Chris