PyExcelerator : Reading a blank cell?

ppaddys

unread,

Feb 11, 2009, 11:40:27 PM2/11/09

to python-excel

hi,

Thanks for the solution in last post.
I am reading a Excel file using following :
book = pyExcelerator.parse_xls (iTestCaseName)
data = book[0][1]

and reading the required cell value using following :
Testdata = ((data[(6, 4)]))

the problem is when the cell is blank PyExcelrator gives and
KeyError (6,4)
is there any way to read the blank cell?

John Machin

unread,

Feb 12, 2009, 1:31:02 AM2/12/09

to python...@googlegroups.com

On 12/02/2009 3:40 PM, ppaddys wrote:
> hi,
>
> Thanks for the solution in last post.

You didn't implement this part of the solution:
"""
> I am new to the Python Pyexcelerator.

Good. You won't miss it. Flick it and get xlwt from
http://pypi.python.org/pypi/xlwt. Then
g/pyExcelerator/s//xlwt/
in your source code and you're off. Benefits: it's maintained, it's
fixed a bug or two, it's faster, it's being enhanced, and it's not
called eXceLWriTe so you don't need 3 attempts to get your import
statements spelled right :-)
"""

Why not?

> I am reading a Excel file using following :
> book = pyExcelerator.parse_xls (iTestCaseName)

Don't do that. pyExcelerator is abandonware; it hasn't been maintained
for a long time. Use xlwt (a maintained fork of pyExcelerator) instead
for writing files. The parse_xls facility was deprecated so severely
that it has been removed from xlwt. Use xlrd for reading xls files.

http://pypi.python.org/pypi/xlrd
http://pypi.python.org/pypi/xlwt

> data = book[0][1]
>
> and reading the required cell value using following :
> Testdata = ((data[(6, 4)]))

Why have you wrapped that in (( )) ??

>
> the problem is when the cell is blank PyExcelrator gives and
> KeyError (6,4)
> is there any way to read the blank cell?

It's blank. That means there's no data in it. What do you expect to find
there?

The best you can do is

try:
value = data[(6, 4)]
except KeyError:
# do something else here

HTH,
John

paddy paddys

unread,

Feb 12, 2009, 2:20:57 AM2/12/09

to python...@googlegroups.com

thanks john

your comments are really helpful. I already have the xlwt installed but since it does not have this Parse_xls functionality i am using PyExcelarator.

John Machin

unread,

Feb 12, 2009, 2:34:47 AM2/12/09

to python...@googlegroups.com

On 12/02/2009 6:20 PM, paddy paddys wrote:
> thanks john
> your comments are really helpful. I already have the xlwt installed but
> since it does not have this Parse_xls functionality i am using
> PyExcelarator.

"functionality" is not the word I would use in that context :-)

Follow the bouncing ball and sing along with me """The parse_xls

facility was deprecated so severely that it has been removed from xlwt.

Use xlrd for reading xls files.""" and you'll be a lot better off.

Cheers,
John

yamex5

unread,

Feb 18, 2011, 4:53:04 PM2/18/11

to python...@googlegroups.com

Hello John,

I am trying to extract test results and logging info, which may or may not be present output into Excel. Even when it's present, I have no idea how many rows may have data, so I am interested in a solution to the blank cell problem. In other words, I am using the blank cell to let me know to move on. I could catch the error ( thank you for that, I am new but should have thought of it myself ), but I was wondering if empty_cell( variable ) could be used instead?
I'm assuming the variable is the cell to be tested, or is that an incorrect assumption?
The documentation states:
"There is one and only one instance of an empty cell -- it's a singleton. This is it. You may use a test like "acell is empty_cell".
However, I am not sure what class the method empty_cell belongs to, though it is not a member of the Sheet class, and if run on it's own, it reports "Name Error: name 'empty_cell' is not defined"

Thank you,
Mike

yamex5

unread,

Feb 18, 2011, 5:28:11 PM2/18/11

to python...@googlegroups.com

Hello John,

I am catching the following error:

Traceback (most recent call last):
File "test.py", line 44, in <module>
      print sheet.cell_value( row, col )
File "C:\Python26\lib\site-packages\xlrd\sheet.py", line 263, in cell_value
      return self._cell_values[rowx][colx]
   IndexError: list index out of range

when running the following code:
########################################################################

import xlrd
import sys
import os
import fnmatch
import shutil

## Save the path to the output result exel files.
OutputPath = 'D:\Profiles\systest\My Documents\Downloads\\'

## Get the list of files in the download directory.
dir_list = os.listdir( OutputPath )

## Find all the .xls test result files.
for fileName in dir_list:
    if fnmatch.fnmatch( fileName, 'outreport*.xls' ):

        ## Append the path name to the results file.
        extendedFileName = OutputPath+fileName
    print extendedFileName

        ## Open the excel document.
        excelBook = xlrd.open_workbook( extendedFileName )

        for sheetName in excelBook.sheet_names():
            sheet = excelBook.sheet_by_name( sheetName )
            print sheetName

            if ( sheetName == 'Log' ):
                row = 1
                col = 2
                moreData = 1
                while( moreData ):
                    try:
                    print sheet.cell_value( row, col )
                        row += 1
                    except IndexError:
                        moreData = 0
                        break

            if ( sheetName == 'Trend' ):
                row = 1
                col = 2
                print sheet.cell_value( row, col )

################################################################

When the cells are non-empty, there are no errors and I am able to view the data content successfully.

Thank you,
Mike

John Machin

unread,

Feb 18, 2011, 6:15:27 PM2/18/11

to python-excel

On Feb 19, 8:53 am, yamex5 <yam...@gmail.com> wrote:
> Hello John,

Hello.

When you have a question, please start a new thread. Don't reply to a
thread that is only casually related to your question.

>
> I am trying to extract test results and logging info, which may or may not
> be present output into Excel. Even when it's present, I have no idea how
> many rows may have data, so I am interested in a solution to the blank cell
> problem. In other words, I am using the blank cell to let me know to move
> on. I could catch the error ( thank you for that, I am new but should have

> thought of it myself ), but I was wondering if *empty_cell( variable )*could be used instead?

> I'm assuming the variable is the cell to be tested, or is that an incorrect
> assumption?
> The documentation states:

> *"There is one and only one instance of an empty cell -- it's a
> singleton. This is it. You may use a test like "acell is empty_cell"*.

> However, I am not sure what class the method empty_cell belongs to, though

> it is not a member of the Sheet class, and if run on it's own, it reports "*Name
> Error: name 'empty_cell' is not defined*"

The documentation that you found is under the heading "Module
Contents". empty_cell is a module attribute, like open_workbook. You
can address it as xlrd.empty_cell

"""
empty_cell (variable)

There is one and only one instance of an empty cell -- it's a
singleton. This is it. You may use a test like "acell is empty_cell"
"""

" (variable)" in a non-bold font indicates that it is a variable, as
opposed to a class or function
"(variable)" in bold would indicate that it was a function of one arg
with the unlikely name "variable"

"acell is empty_cell" does not mean "empty_cell(acell)"

John Yeung

unread,

Feb 18, 2011, 6:33:57 PM2/18/11

to python...@googlegroups.com, yamex5

John Machin already addressed your question about empty_cell. I will
add a couple other comments:

If you haven't already, please check out
<http://www.python-excel.org/>. In particular, have a look at the
tutorial (python-excel.pdf, and don't be deterred by the need to click
another link to get to the actual PDF). There you will find lots of
examples, including how to use empty_cell.

My other comment is: If your data is in contiguous rows (that is,
once you see one blank cell, you can be sure there is no more data in
subsequent rows) then you could just as well (and should) use nrows.

John Y.

John Machin

unread,

Feb 18, 2011, 6:37:07 PM2/18/11

to python-excel

On Feb 19, 9:28 am, yamex5 <yam...@gmail.com> wrote:
> Hello John,
>
> I am catching the following error:
>

> *Traceback (most recent call last):

> File "test.py", line 44, in <module>
> print sheet.cell_value( row, col )
> File "C:\Python26\lib\site-packages\xlrd\sheet.py", line 263, in
> cell_value
> return self._cell_values[rowx][colx]

> IndexError: list index out of range*

>
> when running the following code:

Sorry, I don't understand. If you were catching it, you wouldn't get a
traceback.

> ########################################################################
>
> *import* xlrd
> *import* sys
> *import *os
> *import *fnmatch
> *import *shutil
>
> *## Save the path to the output result exel files. *

> OutputPath = 'D:\Profiles\systest\My Documents\Downloads\\'
>

> *## Get the list of files in the download directory.*

> dir_list = os.listdir( OutputPath )
>

> *## Find all the .xls test result files.*

> for fileName in dir_list:
> if fnmatch.fnmatch( fileName, 'outreport*.xls' ):
>

> *## Append the path name to the results file*.

> extendedFileName = OutputPath+fileName
> print extendedFileName
>

> *## Open the excel document.*

> excelBook = xlrd.open_workbook( extendedFileName )
>
> for sheetName in excelBook.sheet_names():
> sheet = excelBook.sheet_by_name( sheetName )
> print sheetName
>
> if ( sheetName == 'Log' ):
> row = 1
> col = 2
> moreData = 1
> while( moreData ):
> try:
> print sheet.cell_value( row, col )

The above line is indented incorrectly.

> row += 1
> except IndexError:
> moreData = 0
> break
>
> if ( sheetName == 'Trend' ):
> row = 1
> col = 2
> print sheet.cell_value( row, col )

Having counted the lines in your code, it seems that the above line is
the one causing the problem.

If cell C2 doesn't exist, then of course you will get that error. What
do sheet.nrows and sheet.ncols contain?
Why don't you use them? Example:

colx = 2 # "col" is a bad name for a column index
for rowx in xrange(1, sheet.nrows):
print sheet.cell_value(rowx, colx)

Please note two separate concepts:

(1) Non-existant cells are ones whose coordinates don't satisfy 0 <=
colx < sheet.ncols and 0 <= rowx < sheet.nrows -- you will get an
index_error if you try to access them; with the wrinkle that Python-
fashion, negative indexes wrap around.

(2) Empty cells are existing cells that have no value and (when
formatting_info is true) no explicit formatting applied.

yamex5

unread,

Feb 18, 2011, 7:10:54 PM2/18/11

to python-excel

To both Johns,

Thank you very much.
Mr. Machin, apologize for the mistake regarding catching the
exception, and my salute for seeing the issue so quickly.
(Especially given there were no line numbers in the sample code!)
After posting, I ran the code and then saw that the issue was indeed
with the code handling the 'Trend' sheet.
I was not aware of the nrows and ncols members and will indeed use
them.
I should explain that I've only been using python for a couple of
weeks, so I'm sure I'm going to miss using quite a few features of the
language.
Also, thank you Mr. Yeung for the pointer to examples for
empty_cell().

Sincerely,

Mike

yamex5

unread,

Feb 18, 2011, 7:16:24 PM2/18/11

to python-excel

Hello John,

It took a while to sink in, but I see that the sheet.nrows are the
number of non-empty rows, and I was able to remove the exception
handling code.
Thank you again!

Mike

On Feb 18, 5:37 pm, John Machin <sjmac...@lexicon.net> wrote:

Reply all

Reply to author

Forward