Apache POI wrapper????

Sean Devlin

Jul 1, 2009, 10:36:03 AM
to Clojure
Hey,
Has anyone out there written an Apache POI wrapper yet?

Sean

Richard Newman

Jul 1, 2009, 1:56:54 PM
to clo...@googlegroups.com
> Has anyone out there written an Apache POI wrapper yet?

I started to (for Excel processing), only to abandon it in disgust.
POI is just too incomplete: I have to choose between loading
everything into memory (impossible), or essentially parsing XLSX
myself (so what's the point of using POI?) in a different fashion to
XLS. There's no streaming API for XLSX.

I switched to JavaCSV, simply requiring that the input be pre-converted
to CSV. Fine for my situation.
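
To illustrate the row-at-a-time CSV route described above, here is a minimal sketch. It does not use JavaCSV; it assumes plain Clojure with clojure.java.io and clojure.string (newer than the clojure.contrib libraries used elsewhere in this thread), and the names parse-line and process-csv are hypothetical. The split is naive and does not handle quoted fields or embedded commas, which a real CSV library would.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

(defn parse-line
  "Splits one line on commas, keeping trailing empty fields.
  Naive: no support for quoted fields or embedded commas."
  [line]
  (str/split line #"," -1))

(defn process-csv
  "Calls f on each parsed row, reading one line at a time so the whole
  file never has to fit in memory. source is anything io/reader accepts
  (a file path, a File, a Reader). Returns the number of rows seen."
  [source f]
  (with-open [rdr (io/reader source)]
    (reduce (fn [n line]
              (f (parse-line line))
              (inc n))
            0
            (line-seq rdr))))
```

Because the rows are consumed inside with-open and never retained, memory use should stay roughly constant regardless of row count, which is the property the XLSX path was missing.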

For writing Excel, it's probably worth the time investment. For small
worksheets it's fine. For non-trivial data, particularly in XLSX -- I
was dealing with a 230,000-row table -- it's more trouble than it's
worth.

Sorry to be a downer :/

Some scratch code from me -- sorry it's neither tidy nor complete, but
I don't have the time right now. Hope it helps.

;; Lots of redundancy here from experimentation.
(ns com.foo.xls
  (:refer-clojure)
  (:use clojure.contrib.duck-streams)
  (:use clojure.contrib.pprint)
  (:use clojure.contrib.seq-utils)

  (:require [clojure.zip :as zip])
  (:require [clojure.xml :as xml])
  (:require [clojure.contrib.lazy-xml :as lazy-xml])
  (:require [clojure.contrib.zip-filter :as zf])
  (:require [clojure.contrib.zip-filter.xml :as zfx])
  (:import (com.csvreader CsvReader))
  (:import
    (org.apache.poi.poifs.filesystem POIFSFileSystem)
    (org.apache.poi.hssf.usermodel HSSFWorkbook)
    ;; Record is needed for the type hint in do-xls-rows below.
    (org.apache.poi.hssf.record Record)
    (org.apache.poi.hssf.eventusermodel HSSFRequest
                                        HSSFListener
                                        HSSFEventFactory)
    ;; Sheet comes from ss.usermodel (spreadsheets); the hslf package
    ;; is POI's PowerPoint model.
    (org.apache.poi.ss.usermodel Workbook
                                 WorkbookFactory
                                 Sheet
                                 Row
                                 Cell)
    (java.io FileInputStream
             FileOutputStream
             InputStream
             IOException)))

(defn sheets
  "Returns a lazy sequence of sheets in the workbook."
  ([#^Workbook wb]
     (sheets wb (.getNumberOfSheets wb) 0))
  ([#^Workbook wb c i]
     (lazy-seq
       (when (< i c)
         (cons (.getSheetAt wb i)
               (sheets wb c (inc i)))))))

(defmacro with-workbook [[wb path] & body]
  `(with-open [in# (new FileInputStream ~path)]
     (let [~wb (WorkbookFactory/create in#)]
       ~@body)))

;; Another approach:
(defmacro do-xls-rows [[path] & process-record]
  `(let [pr#
         (proxy [HSSFListener] []
           (processRecord [#^Record ~'record]
             ~@process-record))]
     (with-open [in# (new FileInputStream ~path)]
       (let [poifs# (new POIFSFileSystem in#)]
         (with-open [din# (.createDocumentInputStream poifs#
                                                      "Workbook")]
           (let [req# (new HSSFRequest)
                 factory# (new HSSFEventFactory)]
             (.addListenerForAllRecords req# pr#)
             (.processEvents factory# req# din#)))))))

Sean Devlin

Jul 1, 2009, 2:32:30 PM
to Clojure
Hmmm... good to know POI still needs work. I guess I'll just stick
with CSV & tab delimited for now. Thanks!