openpyxl

307 views
Skip to first unread message

Grant Kumataka

unread,
Apr 23, 2015, 8:36:14 PM4/23/15
to python...@googlegroups.com
I just finished a course in python and am trying to tackle a problem at my work to speed up data analysis through excel.  I have decided to use python and the openpyxl module.  Is there an instruction manual about openpyxl besides the limited simple usage cases found on the openpyxl website?  

I am having trouble understanding do I need to add all these sub packages in the code?  Since my preliminary code isn't working?

Here is a simple code that I was trying to debug, just grabbing a value from the loaded excel.  But it says its a list.  Am I reading the file wrong?  or is it the right cell.value command?  Im pretty handy at just reading example code but there seems to be very limited examples that I can decipher and use how I want to.

Thanks for the help.

Grant

from openpyxl import load_workbook
from openpyxl.cell import get_column_letter
filename = raw_input('')
wb = load_workbook(filename, data_only=True)
print filename
ws=wb.get_sheet_names()
print ws
c=ws.cell_value('Q7')
print c

Chris Wakibi

unread,
Apr 24, 2015, 2:57:52 AM4/24/15
to python...@googlegroups.com
Hi Grant,
I think this is an xlwt/xlrd/xlutils list, not Openpyxl.

However, from your code, below is a simple explanation of why your code is not working.

ws=wb.get_sheet_names() #You are getting a name of a sheet, not as opposed to a worksheet
print ws #You printed that name
c=ws.cell_value('Q7') #You are attempting to get the cell from a sheet-name. This is wrong. Change this line to to the below

work_sheet = wb.active #Get a worksheet
c = work_sheet.cell_value('Q7') #Get a value from a cell in that worksheet
print c

Let me know if that helps!

--
You received this message because you are subscribed to the Google Groups "python-excel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to python-excel...@googlegroups.com.
To post to this group, send email to python...@googlegroups.com.
Visit this group at http://groups.google.com/group/python-excel.
For more options, visit https://groups.google.com/d/optout.



--

Kind Regards
Wakibi Christopher

I love what I do, and I know my worth...

Charlie Clark

unread,
Apr 24, 2015, 3:02:07 AM4/24/15
to python...@googlegroups.com
Am .04.2015, 08:39 Uhr, schrieb Chris Wakibi <wak...@gmail.com>:

> Hi Grant,

> I think this is an xlwt/xlrd/xlutils list, not Openpyxl.

It's a list for anyone working with Excel files in Python using any of the
libraries. See Chris' post from last week.

> However, from your code, below is a simple explanation of why your code
> is not working.
> ws=wb.get_sheet_names() #You are getting a name of a sheet, not as
> opposed to a worksheet
> print ws #You printed that name

This is one of several reasons why the code won't work. Another is that
there is no .cell_value() for worksheets.

Charlie
--
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Kronenstr. 27a
Düsseldorf
D- 40217
Tel: +49-211-600-3657
Mobile: +49-178-782-6226

Charlie Clark

unread,
Apr 24, 2015, 3:06:22 AM4/24/15
to python...@googlegroups.com
Hiya Grant,

Am .04.2015, 01:41 Uhr, schrieb Grant Kumataka
<grantkum...@gmail.com>:

> I just finished a course in python and am trying to tackle a problem at
> my
> work to speed up data analysis through excel. I have decided to use
> python
> and the openpyxl module. Is there an instruction manual about openpyxl
> besides the limited simple usage cases found on the openpyxl website?

The documentation certainly isn't complete but it covers most of the
package. In addition there this and the openpyxl users mailing list which
are both indexed by Google.

> I am having trouble understanding do I need to add all these sub packages
> in the code? Since my preliminary code isn't working?

I don't understand the question. In general you won't need to import from
the subpackages to read workbooks. The code you pasted is inaccurate and
incomplete so cannot actually have been what you're working with.

What are you trying to do? And what is the specific problem with openpyxl
you don't understand?

Chris Withers

unread,
Apr 24, 2015, 5:08:37 AM4/24/15
to python...@googlegroups.com
On 24/04/2015 07:39, Chris Wakibi wrote:
> Hi Grant,
> I think this is an xlwt/xlrd/xlutils list, not Openpyxl.
Just to echo Charlie's comments: this list is for all work involving
Excel and Python, be that the libraries you mention, openpyxl,
xlsxwriter, pyxll or anything else.

There's a very small number of people who have the time to post useful
replies to people, and I'm keen to create an inclusive forum where they
can help anyone with a problem.

cheers,

Chris

derek

unread,
Apr 25, 2015, 12:05:13 PM4/25/15
to python...@googlegroups.com
Good to hear that Chris; this list is a valuable resource.

Sometimes the response to newbies who post can be a tad harsh, though - will you be taking a stronger role as moderator to encourage more supportive responses?

Jon Clements

unread,
Apr 25, 2015, 12:05:45 PM4/25/15
to python-excel
Chris,

I don't see you around often on Stack Overflow but there's also a lot
of asking/answering going on there (all the libraries you mention have
tags) - so that'll be worth bearing in mind as well.

Also note - that because there's always canonical questions/answers,
the Python room on SO has also developed their own forum-esque
platform which has a small wiki/common Q&A section - it's a fairly
basic app written in Flask, but it might be a start if you wanted to
do something based off it... see
https://github.com/sopython/sopython-site which powers
http://sopython.com - might not be suitable but just throwing it out
there...

Cheers,

Jon.
> --
> You received this message because you are subscribed to the Google Groups
> "python-excel" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to python-excel...@googlegroups.com.
> To post to this group, send an email to python...@googlegroups.com.

Charlie Clark

unread,
Apr 25, 2015, 12:49:49 PM4/25/15
to python...@googlegroups.com
Am .04.2015, 18:05 Uhr, schrieb Jon Clements <jon...@googlemail.com>:

> Chris,

> I don't see you around often on Stack Overflow but there's also a lot
> of asking/answering going on there (all the libraries you mention have
> tags) - so that'll be worth bearing in mind as well.

I thought Chris made it quite clear that the number of people who can
answer questions is small; and those few people have limited time. As it
is, I try and answer as many questions as possible about openpyxl on
StackOverflow as possible but this is often a very frustrating exercise.

> Also note - that because there's always canonical questions/answers,
> the Python room on SO has also developed their own forum-esque
> platform which has a small wiki/common Q&A section - it's a fairly
> basic app written in Flask, but it might be a start if you wanted to
> do something based off it... see

Chris was merely describing the function of the mailing list as a forum
for discussing Python and Excel.

John Yeung

unread,
Apr 25, 2015, 1:20:54 PM4/25/15
to python-excel
On Sat, Apr 25, 2015 at 12:49 PM, Charlie Clark
<charli...@clark-consulting.eu> wrote:
> I thought Chris made it quite clear that the number of people who can answer
> questions is small; and those few people have limited time. As it is, I try
> and answer as many questions as possible about openpyxl on StackOverflow as
> possible but this is often a very frustrating exercise.

Yes.

>> Also note - that because there's always canonical questions/answers,
>> the Python room on SO has also developed their own forum-esque
>> platform which has a small wiki/common Q&A section - it's a fairly
>> basic app written in Flask, but it might be a start if you wanted to
>> do something based off it... see
>
> Chris was merely describing the function of the mailing list as a forum for
> discussing Python and Excel.

And Jon was just throwing out a suggestion to help alleviate the
frustration you expressed in the first part of your response. It
would be nice for the busy answerers to be able to tell the
uninitiated askers: "Thank you for your interest in using Python to
work with Excel; a detailed answer to your question can be found in
the FAQ."

John Y.

Jon Clements

unread,
Apr 25, 2015, 1:50:50 PM4/25/15
to python-excel
I'd also be happy to set up and host a moinmoin wiki or similar and
then everyone can work on it together (I can't promise a huge amount
of time to do so) - but at least it'd be a more centralised resource
than the current website and mailing list. Just throwing potential
ideas out there - certainly don't intend to step on anyone's toes.

Charlie Clark

unread,
Apr 25, 2015, 1:55:12 PM4/25/15
to python...@googlegroups.com
Am .04.2015, 19:50 Uhr, schrieb Jon Clements <jon...@googlemail.com>:

> I'd also be happy to set up and host a moinmoin wiki or similar and
> then everyone can work on it together (I can't promise a huge amount
> of time to do so) - but at least it'd be a more centralised resource
> than the current website and mailing list. Just throwing potential
> ideas out there - certainly don't intend to step on anyone's toes.

I'm sure that myself, Eric, Chris, John and John are more than happy to
see other people offering to help out. Personally, I'd rather see more
work documentation for the projects as this is the most effective way to
approach the problem: for me Sphinx has largely removed the need for wikis.

Jon Clements

unread,
Apr 25, 2015, 2:17:30 PM4/25/15
to python-excel
> I'm sure that myself, Eric, Chris, John and John are more than happy to see
> other people offering to help out. Personally, I'd rather see more work
> documentation for the projects as this is the most effective way to approach
> the problem: for me Sphinx has largely removed the need for wikis.

True - but there's a difference between a developer to the code base
and a user of the libraries, that quite frankly, just wants to use it
- and see/potentially contribute documentation only wise. At the
moment - if I need to work something out for a majority of those
libraries - I have to read the code - that's not really what your
average user wants.

I for one would *love* to see on the python-excel.org page a "How do
I?" section for practical purposes as well as a FAQ which lists stuff
like "this won't read xlsx but this will", "no you can't write pivot
tables" etc...

Currently the website doesn't offer that, and although the information
is available across disparate resources I believe that's the point
that Chris was making - it's *there* it's just *scattered around* the
place.

Chris Withers

unread,
Apr 26, 2015, 6:40:36 AM4/26/15
to python...@googlegroups.com
On 25/04/2015 17:05, derek wrote:
> Good to hear that Chris; this list is a valuable resource.
>
> Sometimes the response to newbies who post can be a tad harsh, though
> - will you be taking a stronger role as moderator to encourage more
> supportive responses?

Probably not. There are a lot of extremely lazy first time posters on
this list, and I'd prefer to see people who *can* help spending their
time helping those who've made an effort or working to improve the
libraries or documentation rather than spoonfeeding those too lazy to
make any effort whatsoever.

I'd say a pre-requisite for using this list should be a knowledge of
python (ie: not "your library is broken because the path I'm trying to
open doesn't exist") and an attempt to find the solution using the
documentation or existing answers on the list or stackoverflow.

I'm sorry if that sounds harsh, but we are very short on knowledgeable
helpers and having their time wasted by people who can't be bothered
isn't something I want to encourage.

cheers,

Chris

Chris Withers

unread,
Apr 26, 2015, 6:50:25 AM4/26/15
to python...@googlegroups.com
Hi Jon,

I've love to see the type of wiki FAQ your describe, particularly a
task-oriented one that makes recommendations across all the libraries or
a "if you're using <library name here>, the to achieve <task>, you
should do <name>". If you get one up and running, please let me know and
I'll link to it from the website.

That said, as Charlie points out, adding docs via pull request would
probably be even more helpful. xlwt, xlutils and openpyxl are all
Sphinx-based, I believe. I'm pretty passionate about getting documented,
reproducible developer setups. I think we're there with xlwt
(http://xlwt.readthedocs.org/en/latest/development.html) and xlutils
(http://pythonhosted.org/xlutils/development.html) . Less so with xlrd,
but maybe I'll get to that "some time"?

cheers,

Chris
>

Skip Montanaro

unread,
Apr 26, 2015, 6:59:11 AM4/26/15
to python...@googlegroups.com
On Sun, Apr 26, 2015 at 5:50 AM, Chris Withers <ch...@simplistix.co.uk> wrote:
> I've love to see the type of wiki FAQ your describe, particularly a
> task-oriented one that makes recommendations across all the libraries or a
> "if you're using <library name here>, the to achieve <task>, you should do
> <name>". If you get one up and running, please let me know and I'll link to
> it from the website.

Jumping into this late, but... Since this stuff is intended for use
from Python, why not just carve out a little niche on the Python wiki:
https://wiki.python.org/moin/ ? If this idea floats your boat, create
an account (if you don't already have one), and read down near the
bottom of the page about editing. I helped manage it for awhile.
Believe me, the kiddies love to scribble. We eventually had to shut
off editing by default. Now you have to ask...

Also, someone mentioned python-excel.org, but Chrome tells me that
name doesn't resolve. Is there a misspelling in there or is it a
not-yet-claimed domain name?

Skip Montanaro

Chris Withers

unread,
Apr 26, 2015, 7:07:55 AM4/26/15
to python...@googlegroups.com
http://www.python-excel.org/

Searching for "python excel" will also hopefully find it...

Chris

Charlie Clark

unread,
Apr 26, 2015, 8:00:24 AM4/26/15
to python...@googlegroups.com
Am .04.2015, 12:59 Uhr, schrieb Skip Montanaro <skip.mo...@gmail.com>:

> Also, someone mentioned python-excel.org, but Chrome tells me that
> name doesn't resolve. Is there a misspelling in there or is it a
> not-yet-claimed domain name?

Yes, there is a problem that the domain doesn't resolve without "www". No
idea as to whether this is easily fixable, Chris?

Adrian Klaver

unread,
Apr 26, 2015, 9:28:26 AM4/26/15
to python...@googlegroups.com
+1

>
> cheers,
>
> Chris
>


--
Adrian Klaver
adrian...@aklaver.com

derek

unread,
Apr 27, 2015, 4:25:55 AM4/27/15
to python...@googlegroups.com
Chris

I agree that if its a basic issue then someone can, of course, be referred back to, for example, the Python mailing list to get a better grasp of things,  In many cases, a pointer to the user guide or examples is actually the best answer.  It does seem though that a number of people do get "tossed in the deep end" by a boss or colleagues and are having to using Python for the first time and learn how to process (often badly-formatted) Excel as well.  While their questions often reflect confusion, its the tone of the response that I think is important, so as to not put them off learning this as a valuable tool.  That is all I was trying to communicate.

Derek

Grant Kumataka

unread,
Apr 28, 2015, 1:34:27 PM4/28/15
to python...@googlegroups.com
Thank you everyone for your responses.  I got sidetracked to writing a matlab program and didn't have the email updates selected and just thought to check this group again.  I will look over the responses and see what will work.  Most of the code I gathered from random examples not knowing exactly how the packages worked. 

I ultimately started using xlwt and xlrd imports on my own and seemingly got to a point but have incurred another problem that I am working to solve.

I still am unclear as to the function of the package openpyxl but seeing as the code was wrong I will work on changing and seeing if I understand more clearly.  I am still very new to python so bear with me.  Most of my print statements are to make sure my outputs are the correct :). 

For the people who asked my goal is to read an excel file that has been populated with values exported from a batch program into a csv file or excel file.  Then rearrange the values into a table that can be used for analysis.  My technicians spend hours upon hours copying and pasting this data into a table so my goal was to make this all automatic with python. 

Again thanks for the help sorry to have bugged some people.  I appreciate the open source knowledge in the software world as I am a hardware engineer. 

-Grant

Adrian Klaver

unread,
Apr 28, 2015, 2:54:37 PM4/28/15
to python...@googlegroups.com
On 04/28/2015 10:34 AM, Grant Kumataka wrote:
> Thank you everyone for your responses. I got sidetracked to writing a
> matlab program and didn't have the email updates selected and just
> thought to check this group again. I will look over the responses and
> see what will work. Most of the code I gathered from random examples
> not knowing exactly how the packages worked.
>
> I ultimately started using xlwt and xlrd imports on my own and seemingly
> got to a point but have incurred another problem that I am working to solve.
>
> I still am unclear as to the function of the package openpyxl but seeing
> as the code was wrong I will work on changing and seeing if I understand
> more clearly. I am still very new to python so bear with me. Most of
> my print statements are to make sure my outputs are the correct :).

First thing is, stick to one library or the other. In this case I am
talking the xlwt/xlrd combo or openpyxl. They do similar things, but the
devil is in the details and it less confusing dealing with a single view
of the world when working through the details.

>
> For the people who asked my goal is to read an excel file that has been
> populated with values exported from a batch program into a csv file or
> excel file. Then rearrange the values into a table that can be used for
> analysis. My technicians spend hours upon hours copying and pasting
> this data into a table so my goal was to make this all automatic with
> python.

Is possible to just pull from the original data set and create a
spreadsheet file without reading from another file?

Also, an outline of what 'rearrange' means would be helpful.
> --
> You received this message because you are subscribed to the Google
> Groups "python-excel" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to python-excel...@googlegroups.com
> <mailto:python-excel...@googlegroups.com>.
> To post to this group, send email to python...@googlegroups.com
> <mailto:python...@googlegroups.com>.
> Visit this group at http://groups.google.com/group/python-excel.
> For more options, visit https://groups.google.com/d/optout.


--
Adrian Klaver
adrian...@aklaver.com

Grant Kumataka

unread,
Apr 28, 2015, 3:48:26 PM4/28/15
to python...@googlegroups.com
Okay I have figured out how to read the correct value from a cell and save it to another workbook in a specific cell.  Thanks for the help. I didn't realize i was calling the sheet name and not the worksheet. :) 

Is there a major difference between openpyxl than xlwt/xlrd? 

>Is possible to just pull from the original data set and create a
>spreadsheet file without reading from another file?

I wish I could pull from the original data set.  I run a batch file to continuously run a program that tests throughput.  It returns a .csv file from these batch commands with an overview of the test and return parameters.  This is done for around 1000 tests.  They are separated into certain classes, degree, and type of test and from that I run another batch file that concatenates the .csv files into one per class.  That .csv file (or .xlsx) is the one I want to pull data from and organize into a table based on degree and type of test.  It has been mashed together just to get the program working and its great but now the data analysis is wasting a lot of time.  Ideally it would be programmed into one program that does it all but my resources for that work is limited and I currently do not possess the knowledge to do that.  I figured grabbing all the data in one file and analyzing is easier than hundreds of files.   

Thanks,

Grant


Adrian Klaver

unread,
Apr 28, 2015, 4:14:36 PM4/28/15
to python...@googlegroups.com
On 04/28/2015 12:48 PM, Grant Kumataka wrote:
> Okay I have figured out how to read the correct value from a cell and
> save it to another workbook in a specific cell. Thanks for the help. I
> didn't realize i was calling the sheet name and not the worksheet. :)
>
> Is there a major difference between openpyxl than xlwt/xlrd?

See below for general overview:
http://www.python-excel.org/

One major difference is openpyxl only works with .xlsx files while
xlrd/xlwt work with *.xls files. Another is openpyxl is one program to
read and write files, while xlrd/xlwt are two programs to do the same thing.
>
>
> >Is possible to just pull from the original data set and create a
> >spreadsheet file without reading from another file?
>
>
> I wish I could pull from the original data set. I run a batch file to
> continuously run a program that tests throughput. It returns a .csv
> file from these batch commands with an overview of the test and return
> parameters. This is done for around 1000 tests. They are separated
> into certain classes, degree, and type of test and from that I run
> another batch file that concatenates the .csv files into one per class.
> That .csv file (or .xlsx) is the one I want to pull data from and
> organize into a table based on degree and type of test. It has been
> mashed together just to get the program working and its great but now
> the data analysis is wasting a lot of time. Ideally it would be
> programmed into one program that does it all but my resources for that
> work is limited and I currently do not possess the knowledge to do
> that. I figured grabbing all the data in one file and analyzing is
> easier than hundreds of files.

I see. Well seems to me it would be easier to work with the aggregated
CSV file. There you have just the data, which is what you want to
manipulate, without worrying about all the spreadsheet extraneous
information. For instance workbook, sheet, row/col, etc. Do the
processing on the CSV data and then output a spreadsheet file.

>
> Thanks,
>
> Grant

John Yeung

unread,
Apr 28, 2015, 5:43:17 PM4/28/15
to python-excel
On Tue, Apr 28, 2015 at 4:14 PM, Adrian Klaver
<adrian...@aklaver.com> wrote:
> On 04/28/2015 12:48 PM, Grant Kumataka wrote:
>> Is there a major difference between openpyxl than xlwt/xlrd?
>
> See below for general overview:
> http://www.python-excel.org/

I certainly don't expect that overview to get any complaints that it's
too *specific*. :P

For whoever is maintaining that page (Chris Withers?), one thing which
has been bugging me is that .xlsx was introduced with (and is the
default format for) Excel 2007. So, while it's true that openpyxl and
xlsxwriter work with "Excel 2010" files, the verbiage for the other
packages say "older Excel files" which might be interpreted as "older
than Excel 2010". Yes, there is the note that you mean .xls, but if
the extension is what matters, then why not just say that? Why leave
that for a parenthetical comment?

> One major difference is openpyxl only works with .xlsx files while xlrd/xlwt
> work with *.xls files. Another is openpyxl is one program to read and write
> files, while xlrd/xlwt are two programs to do the same thing.

I would not put it *quite* that simply, now that we're responding on a
mailing list and not trying to be as brief as possible (which seems to
have been a goal for the python-excel.org page). xlrd and xlwt read
and write, respectively, but I wouldn't say they "do the same thing"
as openpyxl, because openpyxl provides a unified representation of a
workbook which you can read from and write to. xlrd and xlwt don't
have that, which is one of the reasons why xlutils exists.

> I see. Well seems to me it would be easier to work with the aggregated CSV
> file. There you have just the data, which is what you want to manipulate,
> without worrying about all the spreadsheet extraneous information. For
> instance workbook, sheet, row/col, etc. Do the processing on the CSV data
> and then output a spreadsheet file.

There are trade-offs. In Excel, numeric data can be represented as
numbers; in a CSV, everything is text, so you have to specify the
conversion yourself. But on balance, I also recommend using CSV if
that's convenient. CSV is much, much faster to process than either of
the Excel formats, and provides the capability for arbitrary-size
files. Excel formats are limited in size, .xls especially so. The
"output a spreadsheet file" step (implying non-CSV) is often not that
important, especially if the data is just going to be looked at by
other programs anyway. It depends on how valuable the cosmetic
aspects are. My "customers" are sales and marketing types, so the
Excel formats provide a lot of value. Some scientists probably either
don't care, or you've made their life harder by giving them Excel
instead of CSV.

John Y.

Adrian Klaver

unread,
Apr 28, 2015, 6:30:48 PM4/28/15
to python...@googlegroups.com
On 04/28/2015 02:43 PM, John Yeung wrote:
> On Tue, Apr 28, 2015 at 4:14 PM, Adrian Klaver
> <adrian...@aklaver.com> wrote:
>> On 04/28/2015 12:48 PM, Grant Kumataka wrote:
>>> Is there a major difference between openpyxl than xlwt/xlrd?
>>
>> See below for general overview:
>> http://www.python-excel.org/
>
> I certainly don't expect that overview to get any complaints that it's
> too *specific*. :P
>
> For whoever is maintaining that page (Chris Withers?), one thing which
> has been bugging me is that .xlsx was introduced with (and is the
> default format for) Excel 2007. So, while it's true that openpyxl and
> xlsxwriter work with "Excel 2010" files, the verbiage for the other
> packages say "older Excel files" which might be interpreted as "older
> than Excel 2010". Yes, there is the note that you mean .xls, but if
> the extension is what matters, then why not just say that? Why leave
> that for a parenthetical comment?

Because it is a first draft of a new version of the site?

>
>> One major difference is openpyxl only works with .xlsx files while xlrd/xlwt
>> work with *.xls files. Another is openpyxl is one program to read and write
>> files, while xlrd/xlwt are two programs to do the same thing.
>
> I would not put it *quite* that simply, now that we're responding on a
> mailing list and not trying to be as brief as possible (which seems to
> have been a goal for the python-excel.org page). xlrd and xlwt read
> and write, respectively, but I wouldn't say they "do the same thing"
> as openpyxl, because openpyxl provides a unified representation of a
> workbook which you can read from and write to. xlrd and xlwt don't
> have that, which is one of the reasons why xlutils exists.

I use xlrd and xlwt together all the time without using xlutils. Read a
file in using xlrd, process data, write out using xlwt. So two programs
to do what one program does, which is what the OP was asking.

>
>> I see. Well seems to me it would be easier to work with the aggregated CSV
>> file. There you have just the data, which is what you want to manipulate,
>> without worrying about all the spreadsheet extraneous information. For
>> instance workbook, sheet, row/col, etc. Do the processing on the CSV data
>> and then output a spreadsheet file.
>
> There are trade-offs. In Excel, numeric data can be represented as
> numbers; in a CSV, everything is text, so you have to specify the
> conversion yourself. But on balance, I also recommend using CSV if
> that's convenient. CSV is much, much faster to process than either of
> the Excel formats, and provides the capability for arbitrary-size
> files. Excel formats are limited in size, .xls especially so. The
> "output a spreadsheet file" step (implying non-CSV) is often not that
> important, especially if the data is just going to be looked at by
> other programs anyway. It depends on how valuable the cosmetic
> aspects are. My "customers" are sales and marketing types, so the
> Excel formats provide a lot of value. Some scientists probably either
> don't care, or you've made their life harder by giving them Excel
> instead of CSV.

It has nothing to do with CSV/Excel tradeoffs and everything to do with
the OPs workflow:

"I wish I could pull from the original data set. I run a batch file to
continuously run a program that tests throughput. It returns a .csv
file from these batch commands with an overview of the test and return
parameters. This is done for around 1000 tests. They are separated
into certain classes, degree, and type of test and from that I run
another batch file that concatenates the .csv files into one per class.
That .csv file (or .xlsx) is the one I want to pull data from and
organize into a table based on degree and type of test. ..."

The information is starting out as CSV, so I see no reason to turn into
a spreadsheet format until absolutely necessary, which would be after
the data processing is done.

>
> John Y.
>


--
Adrian Klaver
adrian...@aklaver.com

John Yeung

unread,
Apr 28, 2015, 7:36:48 PM4/28/15
to python-excel
On Tue, Apr 28, 2015 at 6:30 PM, Adrian Klaver
<adrian...@aklaver.com> wrote:
> On 04/28/2015 02:43 PM, John Yeung wrote:

>> I would not put it *quite* that simply, now that we're responding on a
>> mailing list and not trying to be as brief as possible (which seems to
>> have been a goal for the python-excel.org page). xlrd and xlwt read
>> and write, respectively, but I wouldn't say they "do the same thing"
>> as openpyxl, because openpyxl provides a unified representation of a
>> workbook which you can read from and write to. xlrd and xlwt don't
>> have that, which is one of the reasons why xlutils exists.
>
>
> I use xlrd and xlwt together all the time without using xlutils.

So do I. Well, I did, back before xlsxwriter. Now I use xlsxwriter
instead of xlwt. But I've never been much into xlutils.

> Read a file in using xlrd, process data, write out using xlwt. So two programs
> to do what one program does, which is what the OP was asking.

OP asked if there was a difference between openpyxl and the xlrd/xlwt
pair, and my contention is that there is. There are definite
conveniences that openpyxl provides which some users will find lacking
in xlrd and xlwt. "You can accomplish X whether you use Y or Z" isn't
the same as "Y and Z are more-or-less equivalent". Would you say
Python is about the same as C? Anything you can build with Python,
you can definitely build with C.

> It has nothing to do with CSV/Excel tradeoffs and everything to do with the
> OPs workflow:

(1) I *was* writing with OP's workflow in mind. His workflow is not
as clear to me as it seems to be to you. (2) There are more people
reading this than just the OP. My comments are intended to help
lurkers and future readers as well.

> "That .csv file (or .xlsx) is the one I want to pull data from and
> organize into a table based on degree and type of test. ..."
>
> The information is starting out as CSV

At some point, it looks like OP has the choice of whether to use CSV
or .xlsx. Either that, or he's getting a mix, or he doesn't know what
he's getting. If it's his choice, then doesn't it make sense for him
to be informed of the trade-offs between the formats, so that he can
properly choose his workflow?

For all I know, OP could do everything he needs to do without using
ANY Excel package at all. If he does need to work directly with Excel
files, and he's not being sloppy when he says ".xlsx", then xlwt is
right out the window anyway. (But xlrd is not, which is the other
thing that annoys me about the current state of python-excel.org, but
I understand the desire to make things simpler for newcomers.)

John Y.

Charlie Clark

unread,
Apr 29, 2015, 4:21:03 AM4/29/15
to python...@googlegroups.com
Am .04.2015, 23:43 Uhr, schrieb John Yeung <gallium....@gmail.com>:

> I certainly don't expect that overview to get any complaints that it's
> too *specific*.

> For whoever is maintaining that page (Chris Withers?), one thing which
> has been bugging me is that .xlsx was introduced with (and is the
> default format for) Excel 2007. So, while it's true that openpyxl and
> xlsxwriter work with "Excel 2010" files, the verbiage for the other
> packages say "older Excel files" which might be interpreted as "older
> than Excel 2010". Yes, there is the note that you mean .xls, but if
> the extension is what matters, then why not just say that? Why leave
> that for a parenthetical comment?

Excel started to switch to an XML based format with Office 2003. I think
it's fair to say that this went down like a lead balloon.

Technically, while xlsx was introduced with Office 2007, it wasn't until
afterwards that the format was standardised as ECMA 376: Office OpenXML.
Office 2010 was the first version to support the specification so there's
a chance that we do stuff that can't be read in Office 2007. Furthermore,
there's not just .xlsx but also .xlsm and .xlsxt. And since then Microsoft
has even gone so far as to start to introduce support for the Strict
version of the specification.

However, I don't think it would help much to try and cover such nuances on
the website and just stick with a high-level distinction.

Grant Kumataka

unread,
Apr 30, 2015, 2:02:27 PM4/30/15
to python...@googlegroups.com
I took into consideration working with csv.  And I believe that would be a preferred way in most cases but after looking at the exported file from my batch it will be easier for me to work in excel, the csv has nothing I can use as a tag to get the correct value I need. I even asked my programmer and he suggested to keep using excel too.

I have successfully read and written all the data I need.

After understanding more of the structure of python and the available references I was able to figure everything out fairly easily.  Thank you everyone who helped and contribute to the documentation. 

Now to build a functional GUI :)

Cheers,

Grant

Adrian Klaver

unread,
Apr 30, 2015, 3:06:02 PM4/30/15
to python...@googlegroups.com
On 04/30/2015 11:02 AM, Grant Kumataka wrote:
> I took into consideration working with csv. And I believe that would be
> a preferred way in most cases but after looking at the exported file
> from my batch it will be easier for me to work in excel, the csv has
> nothing I can use as a tag to get the correct value I need. I even asked
> my programmer and he suggested to keep using excel too.

There seems to a step missing from your description of your workflow.

How does it get from the CSV into Excel now?

Or to put in another way, the data starts in CSV and ends up in Excel
with a tag, so how does the CSV data get tagged?

>
> I have successfully read and written all the data I need.
>
> After understanding more of the structure of python and the available
> references I was able to figure everything out fairly easily. Thank you
> everyone who helped and contribute to the documentation.
>
> Now to build a functional GUI :)
>
> Cheers,
>
> Grant
>
> On Wednesday, April 29, 2015 at 1:21:03 AM UTC-7, Charlie Clark wrote:
>
> Am .04.2015, 23:43 Uhr, schrieb John Yeung <gallium....@gmail.com
> <javascript:>>:
Reply all
Reply to author
Forward
0 new messages