On Tue, Apr 28, 2015 at 4:14 PM, Adrian Klaver <adrian...@aklaver.com> wrote:
> On 04/28/2015 12:48 PM, Grant Kumataka wrote:
>> Is there a major difference between openpyxl and xlwt/xlrd?
>
> See below for general overview:
>
> http://www.python-excel.org/
I certainly don't expect that overview to get any complaints that it's
too *specific*. :P
For whoever is maintaining that page (Chris Withers?), one thing which
has been bugging me is that .xlsx was introduced with (and is the
default format for) Excel 2007. So, while it's true that openpyxl and
xlsxwriter work with "Excel 2010" files, the verbiage for the other
packages says "older Excel files", which might be interpreted as "older
than Excel 2010". Yes, there is the note that you mean .xls, but if
the extension is what matters, then why not just say that? Why leave
that for a parenthetical comment?
> One major difference is openpyxl only works with .xlsx files while xlrd/xlwt
> work with *.xls files. Another is openpyxl is one program to read and write
> files, while xlrd/xlwt are two programs to do the same thing.
I would not put it *quite* that simply, now that we're responding on a
mailing list and not trying to be as brief as possible (which seems to
have been a goal for the python-excel.org page). xlrd and xlwt read
and write, respectively, but I wouldn't say they "do the same thing"
as openpyxl, because openpyxl provides a unified representation of a
workbook which you can read from and write to. xlrd and xlwt don't
have that, which is one of the reasons why xlutils exists.
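To make the "unified representation" point concrete, here is a minimal sketch of reading, modifying, and re-saving the same openpyxl workbook object. The sheet name, column layout, and helper names (build_workbook, bump_units) are made up for illustration, and it uses in-memory buffers instead of real files so it can be run as-is.

```python
import io

from openpyxl import Workbook, load_workbook


def build_workbook():
    """Create a small workbook in memory (hypothetical example data)."""
    wb = Workbook()
    ws = wb.active
    ws.title = "Sales"
    ws.append(["region", "units"])
    ws.append(["North", 120])
    return wb


def bump_units(wb, amount):
    """Modify cells in place on the same object that was loaded."""
    ws = wb["Sales"]
    for row in ws.iter_rows(min_row=2, min_col=2, max_col=2):
        for cell in row:
            cell.value += amount


# Write a workbook, then load it back: one library, one object model.
buf = io.BytesIO()
build_workbook().save(buf)
buf.seek(0)

wb2 = load_workbook(buf)
bump_units(wb2, 5)

# The loaded workbook can be saved directly after editing.
out = io.BytesIO()
wb2.save(out)
```

With xlrd and xlwt, the loaded book and the written book are separate objects from separate packages, so the edit step in the middle has to copy data from one to the other.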
> I see. Well seems to me it would be easier to work with the aggregated CSV
> file. There you have just the data, which is what you want to manipulate,
> without worrying about all the spreadsheet extraneous information. For
> instance workbook, sheet, row/col, etc. Do the processing on the CSV data
> and then output a spreadsheet file.
There are trade-offs. In Excel, numeric data can be represented as
numbers; in a CSV, everything is text, so you have to specify the
conversion yourself. But on balance, I also recommend using CSV if
that's convenient. CSV is much, much faster to process than either of
the Excel formats, and it places no inherent limit on file size. Excel
formats are limited in size, .xls especially so (65,536 rows by 256
columns per sheet, versus 1,048,576 rows by 16,384 columns for .xlsx). The
"output a spreadsheet file" step (implying non-CSV) is often not that
important, especially if the data is just going to be looked at by
other programs anyway. It depends on how valuable the cosmetic
aspects are. My "customers" are sales and marketing types, so the
Excel formats provide a lot of value. Some scientists probably either
don't care, or you've made their life harder by giving them Excel
instead of CSV.
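As a rough illustration of the "you have to specify the conversion yourself" trade-off, here is a small sketch using only the standard library. The sample data and the helper names (to_number, read_typed) are invented for the example, and the int-then-float guess is just one possible policy, not the only reasonable one.

```python
import csv
import io

# Hypothetical sample data; a real file would be opened with newline="".
RAW = "region,units,price\nNorth,120,9.50\nSouth,80,12.00\n"


def to_number(text):
    """Return an int or float if the text parses as one, else the text."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text


def read_typed(handle):
    """Read CSV rows as dicts, converting numeric-looking fields."""
    reader = csv.DictReader(handle)
    return [
        {key: to_number(value) for key, value in row.items()}
        for row in reader
    ]


rows = read_typed(io.StringIO(RAW))
total_units = sum(row["units"] for row in rows)
```

Without the to_number step, row["units"] would be the string "120" and the sum would raise a TypeError, which is the extra work CSV asks of you in exchange for its speed and simplicity.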
John Y.