possible bug? - openpyxl reader returns None values every second cell

1,652 views
Skip to first unread message

Arvind R

unread,
Mar 18, 2015, 3:27:41 PM3/18/15
to openpyx...@googlegroups.com
Hello everyone,

I'm trying to read a xlsx using the load_workbook method and it works well. However, if there is an Empty cell in a column, the iter_rows() includes a None value to the output every second cell in a column. If there are no empty cells in the column within the data range, then there is no problem.

Please find below the piece of code and the data of the excel file I'm reading from:

/*

Excel file: book1.xlsx
Data in it:
(Column A, row 1-7)

<empty cell>
1
<empty cell>
2
3
4
5

*/


Code:

import openpyxl

Book = openpyxl.load_workbook(filename='book1.xlsx', use_iterators = True, read_only=True, data_only=True, guess_types=True)
ws = Book.get_sheet_by_name('sheet1')
for col in ws.iter_rows():
            for col_num in range(0, len(col)):
                data = {'col1': col[0].value}
                print data['col']

Actual Output:

None
1
None
None
2
None
3
None
4
None
5

Expected Output:

None
1
None
2
3
4
5

Any help would be highly appreciated.

Thanks,
Arvind R

Charlie Clark

unread,
Mar 18, 2015, 3:41:07 PM3/18/15
to openpyx...@googlegroups.com
Am .03.2015, 20:27 Uhr, schrieb Arvind R <r.arvi...@gmail.com>:

> Any help would be highly appreciated.

openpyxl.load_workbook(filename='book1.xlsx', use_iterators = True,
read_only=True, data_only=True, guess_types=True)

You seem to be overloading load_workbook with every option you can find.
Why? guess_types makes no sense in read_only mode.

ws.iter_rows(), or more succinctly ws.rows returns rows so

for col in ws.iter_rows()

looks distinctly odd. As, to be honest, does the rest of the code.

As per Stack Overflow it's essential to have a file to work with. See
https://bitbucket.org/snippets/openpyxl/ok6o for the code and a sample
file.

Charlie
--
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Kronenstr. 27a
Düsseldorf
D- 40217
Tel: +49-211-600-3657
Mobile: +49-178-782-6226
Message has been deleted

Arvind R

unread,
Mar 19, 2015, 5:12:33 PM3/19/15
to openpyx...@googlegroups.com
Hi,

Thanks for clarifying. I sincerely apologize and thereby agree that there were flaws with the code snippet I pasted. I visited the link you shared and worked out your example. However, I'm still facing the issue I reported. I'll try to be clearer and better this time. Please pardon me if I'm wrong for I'm just beginning to work with this python and excel stuff.

I see that the openpyxl read the cells perfectly if the excel file in not loaded in read-only mode. But, if opened in read-only mode, I still face the issue I reported.

Please see the code I'm using:

import openpyxl as p

W
= p.load_workbook(empty_rows.xlsx, read_only = True)
p
= W.get_sheet_by_name(name = 'sheet_a')

for row in p.iter_rows():
   
for k in row:
       
print k.value

I have attached excel file I'm using - "empty_rows.xlsx".  If I'm using the "read_only = True" option, the code prints additional Empty cells:


Output:

None
1
None
None
2
None
3
None
4
None
5

But, If I'm using the "read_only = False" option, the same code exactly prints the file contents:


Output:

None
1
None
2
3
4
5

Note the additional "None" values returned from reading the same excel file after every actual cell value. The reason why I want to use the read_only mode is that, otherwise, the openpyxl module raises a Type error exception: "unexpected keyword argument 'quote_prefix'", if I do not use the read_only option for certain xlsx files.

Even after several attempts, I fail to understand the reason behind this behavior. Please correct me if I'm still doing something wrong.

Thanks in advance.

Regards,
Arvind R
empty_rows.xlsx

Charlie Clark

unread,
Mar 20, 2015, 1:58:50 PM3/20/15
to openpyx...@googlegroups.com
Am .03.2015, 22:12 Uhr, schrieb Arvind R <r.arvi...@gmail.com>:

> Note the additional "None" values returned from reading the same excel
> file
> after every actual cell value. The reason why I want to use the read_only
> mode is that, otherwise, the openpyxl module raises a Type error
> exception:
> "unexpected keyword argument 'quote_prefix'", if I do not use the
> read_only
> option for certain xlsx files.
> Even after several attempts, I fail to understand the reason behind this
> behavior. Please correct me if I'm still doing something wrong.

Okay, thanks for keeping on this. I can now confirm the bug which I
suspect is related to a worksheets where there is only data in one column
and where there are empty cells.

Arvind

unread,
Mar 20, 2015, 10:22:44 PM3/20/15
to Charlie Clark, openpyx...@googlegroups.com
Thanks for confirming. However, I don't think the issue is only when the work book has data in a single column.

It is also when there are data in multiple columns with some of them being empty but only when the work book is loaded in read only mode. I hope this helps.


Arvind R

Sent from my iPhone

On Mar 20, 2015, at 11:28 PM, "Charlie Clark" <charli...@clark-consulting.eu> wrote:

>> Note the additional "None" values returned from reading the same excel =
> =20
>> file
>> after every actual cell value. The reason why I want to use the read_only
>> mode is that, otherwise, the openpyxl module raises a Type error =20
>> exception:
>> "unexpected keyword argument 'quote_prefix'", if I do not use the =20
>> read_only
>> option for certain xlsx files.
>> Even after several attempts, I fail to understand the reason behind this
>> behavior. Please correct me if I'm still doing something wrong.
>
> Okay, thanks for keeping on this. I can now confirm the bug which I =20
> suspect is related to a worksheets where there is only data in one column =
> =20
> and where there are empty cells.
>
> Charlie
> --=20
> Charlie Clark
> Managing Director
> Clark Consulting & Research
> German Office
> Kronenstr. 27a
> D=C3=BCsseldorf
> D- 40217
> Tel: +49-211-600-3657
> Mobile: +49-178-782-6226
>
> --=20
> You received this message because you are subscribed to a topic in the Goog=
> le Groups "openpyxl-users" group.
> To unsubscribe from this topic, visit https://groups.google.com/d/topic/ope=
> npyxl-users/u6ddjf7TfrQ/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to openpyx=
> l-users+u...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Charlie Clark

unread,
Mar 21, 2015, 11:20:20 AM3/21/15
to openpyx...@googlegroups.com
Am .03.2015, 03:22 Uhr, schrieb Arvind <r.arvi...@gmail.com>:

> Thanks for confirming. However, I don't think the issue is only when the
> work book has data in a single column.

No, any worksheet with empty rows due a counter not being updated. Fixed
in the 2.2 branch and will be in 2.2.1 when we release it.

Charlie
--
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Kronenstr. 27a
Düsseldorf

Arvind

unread,
Mar 21, 2015, 11:25:46 AM3/21/15
to openpyx...@googlegroups.com
Sounds good! Thank you very much. Awaiting the release..!

Sent from my iPhone
> --
> You received this message because you are subscribed to a topic in the Google Groups "openpyxl-users" group.
> To unsubscribe from this topic, visit https://groups.google.com/d/topic/openpyxl-users/u6ddjf7TfrQ/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to openpyxl-user...@googlegroups.com.

Charlie Clark

unread,
Mar 21, 2015, 11:26:30 AM3/21/15
to openpyx...@googlegroups.com
Am .03.2015, 16:25 Uhr, schrieb Arvind <r.arvi...@gmail.com>:

> Sounds good! Thank you very much. Awaiting the release..!

Testing it from a checkout with a real file would be helpful.
Message has been deleted

Arvind R

unread,
Mar 22, 2015, 3:31:34 AM3/22/15
to openpyx...@googlegroups.com
Hi there,

So.., I checked out the latest source (tip) from the repo and tested the issue (with a real excel file) and to me it seems to have got fixed. Empty cells are not duplicated anymore. 
Hope this helps.

Also, I noted that some of the excel files (real files) won't load when "read_only=False" and throw an exception like this: "Error: __init__() got an unexpected keyword argument 'quotePrefix'". But loads perfectly when in read_only mdoe. Can you help me with this?

And, sorry to insist on, I modified my original question on stackoverflow. So, can you please remove the downvote for this question?

Thanks,
Arvind R

Charlie Clark

unread,
Mar 23, 2015, 5:22:47 AM3/23/15
to openpyx...@googlegroups.com
Am .03.2015, 08:31 Uhr, schrieb Arvind R <r.arvi...@gmail.com>:

> Hi there,

> So.., I checked out the latest source (tip) from the repo and tested the
> issue (with a real excel file) and to me it seems to have got fixed.
> Empty
> cells are not duplicated anymore.
> Hope this helps.

It's the rows that get duplicated. Glad to know it's working for you.

> Also, I noted that some of the excel files (real files) won't load when
> "read_only=False" and throw an exception like this: "Error: __init__()
> got
> an unexpected keyword argument 'quotePrefix'". But loads perfectly when
> in
> read_only mdoe. Can you help me with this?

Not without a a sample file.
Reply all
Reply to author
Forward
0 new messages