Hi
I get the following error when trying to run the LSA model:
2011-07-31 10:25:08,801 : INFO : loaded corpus index from c:\gensim\wiki_tfidf.mm.index
2011-07-31 10:25:08,801 : INFO : initializing corpus reader from c:\gensim\wiki_tfidf.mm
2011-07-31 10:25:08,928 : INFO : accepted corpus with 3501556 documents, 100000 features, 542748074 non-zero entries
MmCorpus(3501556 documents, 100000 features, 542748074 non-zero entries)
2011-07-31 10:25:08,937 : INFO : using serial LSI version on this node
2011-07-31 10:25:08,938 : INFO : updating model with new documents
2011-07-31 10:25:08,979 : INFO : preparing a new chunk of documents
2011-07-31 10:25:56,012 : INFO : using 100 extra samples and 2 power iterations
2011-07-31 10:25:56,884 : INFO : 1st phase: constructing (100000, 300) action matrix
2011-07-31 10:26:27,049 : INFO : orthonormalizing (100000, 300) action matrix
Traceback (most recent call last):
  File "lsa.py", line 13, in <module>
    lsi = gensim.models.lsimodel.LsiModel(corpus=mm, id2word=id2word, num_topics=200)
  File "c:\python27\lib\site-packages\gensim-0.8.0-py2.7.egg\gensim\models\lsimodel.py", line 310, in __init__
    self.add_documents(corpus)
  File "c:\python27\lib\site-packages\gensim-0.8.0-py2.7.egg\gensim\models\lsimodel.py", line 366, in add_documents
    update = Projection(self.num_terms, self.num_topics, job)
  File "c:\python27\lib\site-packages\gensim-0.8.0-py2.7.egg\gensim\models\lsimodel.py", line 117, in __init__
    power_iters=P2_EXTRA_ITERS, extra_dims=P2_EXTRA_DIMS)
  File "c:\python27\lib\site-packages\gensim-0.8.0-py2.7.egg\gensim\models\lsimodel.py", line 642, in stochastic_svd
    q, r = matutils.qr_destroy(y) # orthonormalize the range
  File "c:\python27\lib\site-packages\gensim-0.8.0-py2.7.egg\gensim\matutils.py", line 284, in qr_destroy
    a = numpy.asfortranarray(la[0])
  File "c:\python27\lib\site-packages\numpy\core\numeric.py", line 408, in asfortranarray
    return array(a, dtype, copy=False, order='F', ndmin=1)
MemoryError
I am using the following code:
import logging, gensim
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
id2word = gensim.corpora.Dictionary.load_from_text('c:\gensim\_wordids.txt')
mm = gensim.corpora.MmCorpus('c:\gensim\_tfidf.mm')
lsi = gensim.models.lsimodel.LsiModel(corpus=mm, id2word=id2word, num_topics=200)
lsi.save('c:\gensim\wikilsamodel.lsa')
lsi.print_topics(10)
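In case it helps with diagnosis, here is a rough sketch of how much memory the (100000, 300) action matrix from the log would need. It assumes 8-byte float64 entries, which the log does not state, and dense_matrix_bytes is a hypothetical helper written for this post, not part of gensim. The traceback ends in numpy.asfortranarray, which may need a second full copy if the array is not already in Fortran order. The last line reports whether the Python build is 32-bit or 64-bit, since a 32-bit process has much less address space to work with.

```python
import struct


def dense_matrix_bytes(rows, cols, bytes_per_entry=8):
    """Bytes for a dense rows x cols matrix.

    bytes_per_entry=8 assumes float64 entries (an assumption, not from the log).
    """
    return rows * cols * bytes_per_entry


# Shape taken from the log: "orthonormalizing (100000, 300) action matrix".
action_matrix_mb = dense_matrix_bytes(100000, 300) / (1024.0 * 1024.0)

# Pointer size in bits: 32 on a 32-bit Python build, 64 on a 64-bit build.
python_bits = struct.calcsize("P") * 8

print("action matrix: %.1f MB per copy" % action_matrix_mb)
print("Python build: %d-bit" % python_bits)
```

This only estimates a single dense matrix; it says nothing about what else the process holds in memory at that point.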
Any help would be much appreciated.
Aneesha