Exception when running LSAMain


Qi Song

Jul 16, 2015, 8:20:51 AM
to s-spac...@googlegroups.com

Hi,

I'm trying to run LSAMain on a file that contains 2.3 million rows, each containing several words. I have installed svdlibc. When I run LSAMain on this file, I get the errors below. However, if I reduce the file to 2 million rows, LSAMain produces output. I'm not sure of the reason; could it be a space limit?


Jul 15, 2015 3:44:16 PM edu.ucla.sspace.common.GenericTermDocumentVectorSpace processSpace
INFO: performing log-entropy transform
Jul 15, 2015 3:44:16 PM edu.ucla.sspace.matrix.LogEntropyTransform$LogEntropyGlobalTransform <init>
INFO: Computing the total row counts
Jul 15, 2015 3:46:15 PM edu.ucla.sspace.matrix.LogEntropyTransform$LogEntropyGlobalTransform <init>
INFO: Computing the entropy of each row
Jul 15, 2015 3:46:19 PM edu.ucla.sspace.matrix.LogEntropyTransform$LogEntropyGlobalTransform <init>
INFO: Scaling the entropy of the rows
Jul 15, 2015 3:46:21 PM edu.ucla.sspace.lsa.LatentSemanticAnalysis processSpace
INFO: reducing to 300 dimensions
Jul 15, 2015 4:04:30 PM edu.ucla.sspace.matrix.factorization.SingularValueDecompositionLibC factorize
WARNING: svdlibc exited with error status.  stderr:

Exception in thread "main" java.lang.NullPointerException
    at edu.ucla.sspace.matrix.factorization.AbstractSvd.dataClasses(AbstractSvd.java:95)
    at edu.ucla.sspace.lsa.LatentSemanticAnalysis.processSpace(LatentSemanticAnalysis.java:465)
    at edu.ucla.sspace.mains.GenericMain.processDocumentsAndSpace(GenericMain.java:514)
    at edu.ucla.sspace.mains.GenericMain.run(GenericMain.java:443)
    at edu.ucla.sspace.mains.LSAMain.main(LSAMain.java:167)



I wonder if someone can help me with this problem.


Bests~

Qi Song 

EECS, Washington State University

David Jurgens

Jul 16, 2015, 9:03:33 AM
to s-spac...@googlegroups.com
Hi Qi,

  It looks like you might be triggering a case where SVDLIBC isn't generating the output matrices.  It's not obvious what is going on just yet, but I think we can figure it out. Could you answer the following questions:
  • Do you see any message like "svdlibc exited with error status" in your logs?  If SVDLIBC returns a non-zero exit status, no exception is thrown (though maybe one should be...), but this message is printed.
  • How much memory do you have on your system?  Also, what operating system are you running?
  • While the SVD is running, can you suspend the program and check whether the temporary matrix files are being generated?  If you're on a Unix system, they will be in /tmp and should have names like "svdlibc...". It would also be good to know how much space they are taking up.
  Thanks,
  David
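
Editor's note: the temporary-file check David describes can be sketched in Java as below. This is only an illustration under stated assumptions: the class name SvdTempFileReport and the method totalBytes are hypothetical, the "svdlibc" file-name prefix is taken from David's description rather than verified against the S-Space source, and the directory comes from the JVM's java.io.tmpdir property, which is typically /tmp on Unix.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of S-Space.
final class SvdTempFileReport {

    /** Prints and sums the sizes of regular files in dir whose names start with prefix. */
    static long totalBytes(Path dir, String prefix) throws IOException {
        long total = 0;
        // The glob "prefix*" matches file names that begin with the prefix.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, prefix + "*")) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    long size = Files.size(p);
                    System.out.printf("%12d bytes  %s%n", size, p.getFileName());
                    total += size;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the temporary matrices live in the JVM temp directory.
        Path tmp = Paths.get(System.getProperty("java.io.tmpdir"));
        long total = totalBytes(tmp, "svdlibc");
        long free = Files.getFileStore(tmp).getUsableSpace();
        System.out.printf("svdlibc temp files: %d bytes; usable space in %s: %d bytes%n",
                total, tmp, free);
    }
}
```

Running it while the SVD step is in progress shows whether the files exist, how large they have grown, and how much room is left on that file system.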


Qi Song

Jul 16, 2015, 3:46:48 PM
to s-spac...@googlegroups.com
Hi David, 
Your questions pointed me in the right direction: I think the cause is insufficient memory. I am running LSAMain on Ubuntu 12.04 inside a VMware virtual machine. The VM originally had 8 GB of memory; after I increased it to 16 GB today, LSAMain works and can now handle the whole dataset.
Thank you very much!

Bests~
Qi Song
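
Editor's note: in this thread the external svdlibc process exited with an error status, only a warning was logged, and the run later failed with a NullPointerException. A caller that wraps an external tool can fail earlier and more clearly by checking the exit status itself. The sketch below shows that general pattern with the standard ProcessBuilder API. It is not S-Space's actual code; the class name ExternalToolRunner and its methods are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper, not part of S-Space.
final class ExternalToolRunner {

    /** Throws if the exit code is non-zero, including the tool's output in the message. */
    static void requireSuccess(int exitCode, String output) {
        if (exitCode != 0) {
            throw new IllegalStateException(
                    "external tool exited with status " + exitCode + "; output: " + output);
        }
    }

    /** Runs the command, captures stdout and stderr together, and fails on a non-zero exit. */
    static String run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        Process process = builder.start();

        // Drain the output before waiting so the child cannot block on a full pipe.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (InputStream in = process.getInputStream()) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        }
        String output = new String(buffer.toByteArray(), StandardCharsets.UTF_8);

        requireSuccess(process.waitFor(), output);
        return output;
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java ExternalToolRunner <command> [args...]");
            return;
        }
        System.out.print(run(Arrays.asList(args)));
    }
}
```

With a check like this, a failed external step stops the run at the point of failure instead of surfacing later as a NullPointerException on missing output.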