Big Excel (xlsx) file parsing (docjure, incanter...)


Stanislav Sobolev

Sep 6, 2013, 9:22:27 PM
to clo...@googlegroups.com
Hello guys. I have an Excel file with huge columns in it.
When I use a library (docjure, or anything else) it fails with CompilerException java.lang.OutOfMemoryError: Java heap space, compiling
How can I handle a file that big without converting it to CSV?
The primary problem is the xlsx structure: unlike CSV, it doesn't have lines or anything similar.

Vijay Kiran

Sep 7, 2013, 2:30:43 AM
to clo...@googlegroups.com
Hi,

I don't think docjure supports streaming yet, but you can try the Apache POI streaming API (http://poi.apache.org/spreadsheet/index.html). That should help with reading large files.

HTH,
@vijaykiran
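
A minimal sketch of what low-memory reading with POI could look like from Clojure, assuming the POI "event" API (XSSFReader plus a SAX handler) is on the classpath via the poi-ooxml artifact. The namespace and the count-rows function are hypothetical names chosen for illustration; the sketch only counts rows in the first sheet and does not resolve cell values (strings live in a separate shared-strings table).

```clojure
(ns xlsx-stream.read
  (:import (java.io InputStream)
           (javax.xml.parsers SAXParserFactory)
           (org.apache.poi.openxml4j.opc OPCPackage PackageAccess)
           (org.apache.poi.xssf.eventusermodel XSSFReader)
           (org.xml.sax InputSource)
           (org.xml.sax.helpers DefaultHandler)))

(defn count-rows
  "Hypothetical helper: counts <row> elements in the first sheet of the
  xlsx file at `path` by streaming the sheet XML through a SAX parser,
  so the whole workbook is never held in memory."
  [^String path]
  (with-open [pkg (OPCPackage/open path PackageAccess/READ)]
    (let [reader (XSSFReader. pkg)
          rows   (atom 0)
          ;; The handler sees one XML element at a time.
          ^DefaultHandler handler
          (proxy [DefaultHandler] []
            (startElement [uri local-name q-name attrs]
              (when (= "row" local-name)
                (swap! rows inc))))
          parser (.newSAXParser
                   (doto (SAXParserFactory/newInstance)
                     (.setNamespaceAware true)))]
      ;; getSheetsData yields one InputStream per sheet; take the first.
      (with-open [^InputStream sheet (.next (.getSheetsData reader))]
        (.parse parser (InputSource. sheet) handler))
      @rows)))
```

Calling (count-rows "big.xlsx") with a hypothetical file name would walk the sheet row by row. To do real work, the same handler would also react to the cell elements inside each row.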

Stanislav Sobolev

Sep 7, 2013, 10:28:57 AM
to clo...@googlegroups.com
All right, but if I use SXSSF, how can I stream the Excel file into it?
Something like this example: https://github.com/ktsujister/clj-tsv2xls/blob/master/src/tsv2xls/core.clj
Can anybody provide a code example of using that streaming, please?
Starting from with-open [out-stream (io/output-stream outfile), what comes next?
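
A minimal sketch of the with-open part, with one caveat: SXSSF is POI's streaming API for writing xlsx files, so it does not help with reading a large file (that is what the event API is for). The namespace, the write-rows! function, the sheet name "data" and the window size of 100 are all assumptions made for illustration.

```clojure
(ns xlsx-stream.write
  (:require [clojure.java.io :as io])
  (:import (org.apache.poi.xssf.streaming SXSSFWorkbook)))

(defn write-rows!
  "Hypothetical helper: writes `rows` (a seq of seqs) to `outfile` as xlsx.
  Every value is written as a string cell to keep the sketch short."
  [outfile rows]
  ;; Keep at most 100 rows in memory; older rows are flushed to temp files.
  (let [wb (SXSSFWorkbook. (int 100))]
    (try
      (let [sheet (.createSheet wb "data")]
        (doseq [[r row] (map-indexed vector rows)]
          (let [xrow (.createRow sheet (int r))]
            (doseq [[c v] (map-indexed vector row)]
              (.setCellValue (.createCell xrow (int c)) (str v)))))
        (with-open [out-stream (io/output-stream outfile)]
          (.write wb out-stream)))
      (finally
        ;; Removes the temporary files SXSSF created on disk.
        (.dispose wb)))))
```

Because rows is consumed with doseq, it can be a lazy sequence, so the input does not have to fit in memory either.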


On Saturday, September 7, 2013, 12:30:43 UTC+6, Vijay Kiran wrote:

scott tudd

Dec 10, 2013, 1:06:01 PM
to clo...@googlegroups.com
Orthogonal, but perhaps helpful:

I wrote a Clojure (wrapper) library that "streams" data in and out of Excel quite easily (and other applications, mostly in finance, that use DDE).
It is especially useful if you need to monitor or update changes to cells.


feedback welcome.

Ragnar Dahlén

Dec 11, 2013, 6:54:46 AM
to clo...@googlegroups.com
Have you tried increasing the heap size of your JVM using the -Xmx and -Xms options? 

We're loading some spreadsheets that are 80 MB in size and have to run with a ~1.5 GB heap to handle this...
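
For illustration, here is how those options might be set if the project is run through Leiningen (an assumption; the thread does not say which build tool is in use). The project name, version numbers and heap sizes below are placeholders, not recommendations.

```clojure
;; project.clj (hypothetical project name and versions)
(defproject xlsx-import "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.5.1"]]
  ;; -Xms sets the initial heap size, -Xmx the maximum heap size.
  :jvm-opts ["-Xms512m" "-Xmx1536m"])
```

When launching a jar directly, the same flags go on the java command line, for example: java -Xms512m -Xmx1536m -jar app.jar (app.jar being a placeholder).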