Blank rows at end of workbook

1,987 views
Skip to first unread message

dads

unread,
May 26, 2011, 10:40:59 AM5/26/11
to python-excel
I'm trying to check for blank rows at the end of an excel spreadsheet.
We can locate them manually using ctrl-end and then delete the empty
row but we'd like to automate it.

The empty row could have been created by inserting a row at the end
and saving the spreadsheet,

I've just started but the code below brings back the last row that has
data in it, not the empty last row. Is there anyway to check if this
row exists.

fp = os.path.join(winshell.desktop(),'spsheet.xls')
book = open_workbook(fp)
sheet = book.sheet_by_index(0)
last_row = sheet.row(sheet.nrows-1)
print(last_row)

John Machin

unread,
May 26, 2011, 6:03:47 PM5/26/11
to python...@googlegroups.com
On Friday, 27 May 2011 00:40:59 UTC+10, dads wrote:
I'm trying to check for blank rows at the end of an excel spreadsheet.
We can locate them manually using ctrl-end and then delete the empty
row but we'd like to automate it.

blank != empty. See the docs for xlrd's Cell class. Blank cells actually exist in the file, and have formatting, but no data. Empty cells don't exist in the file at all, and are just an artifact of xlrd keeping a row's cell data in sequences rather than using mappings.

The empty row could have been created by inserting a row at the end
and saving the spreadsheet,

I've just started but the code below brings back the last row that has
data in it, not the empty last row.

The default when you do xlrd.open_workbook() is formatting_info=False, which treats blank cells as empty. Consequently, trailing rows with nothing more interesting than blank cells are ignored.
 
Is there anyway to check if this
row exists.

fp = os.path.join(winshell.desktop(),'spsheet.xls')
book = open_workbook(fp)
sheet = book.sheet_by_index(0)
last_row = sheet.row(sheet.nrows-1)
print(last_row)


I'd use sheet.row_types for this, and check for multiple useless rows, like this (untested):

dud_types = set([xlrd.XL_CELL_BLANK, xlrd.XL_CELL_EMPTY])
for rowx in xrange( sheet.nrows -1, -1, -1):
    if not all(ty in dud_types for ty in sheet.row_types(rowx)):
        break
n_good_rows = rowx + 1

If you are interested in code that has a much wider definition of useless, and works on trailing COLUMNS as well as rows, have a look at xlutils.margins.

Curiosity: why does one blank row at the end of a spreadsheet cause grief? Can't it be ignored?

dads

unread,
May 28, 2011, 5:12:25 PM5/28/11
to python-excel
Thank you for your speedy reply.

With your example I've come up with the following code, which
does the job required.

fp = c:\workbook.xls
book = xlrd.open_workbook(fp, formatting_info=True)
sheet = book.sheet_by_index(0)

dud_types = set([xlrd.XL_CELL_BLANK, xlrd.XL_CELL_EMPTY ])
rowf = [ ty for ty in sheet.row_types(sheet.nrows-1) ]
if all( x in dud_types for x in rowf ):
print('blank row') #do whatever

Now to fulfil your curiosity, I'll try to keep a long story short.
I'm IT 1st line support and one of my hobbies is programming.
Where in neck deep with a stretched out dev firm, our company
is always running before it can crawl. dev firm creates job to collate
data. It breaks with blank rows at end or duplicate rows of excel
sheet. To make my life easier in support I'm creating the script to
do the checking. The dev firm won't touch it, too busy and I'm
not allowed either. Hebce work around.

Thanks again
Reply all
Reply to author
Forward
0 new messages