[BUG] def parse_ranges broken

Marco Aschwanden

Jan 21, 2015, 4:37:19 AM
to openpyx...@googlegroups.com
Sorry for posting a bug report here - but the following site seems to be down: https://bitbucket.org/openpyxl/openpyxl/issues

Setting:
- Ubuntu 14.0.4
- Python 2.7.6
- openpyxl 2.1.4

In our code we load a workbook (attached: Titelanalyse_USA3.xlsm):

wb = load_workbook(filename=file_path, data_only=True, guess_types=True)

And while reading the file an exception is thrown on line 85:

/usr/local/lib/python2.7/dist-packages/openpyxl/workbook/names/external.py in parse_ranges

    def parse_ranges(xml):
        tree = fromstring(xml)
        book = tree.find('{%s}externalBook' % SHEET_MAIN_NS)
        names = book.find('{%s}definedNames' % SHEET_MAIN_NS)
        for n in safe_iterator(names, '{%s}definedName' % SHEET_MAIN_NS):
            yield ExternalRange(**n.attrib)  # <------ Throws: __init__() takes at least 3 arguments (2 given) --> n.attrib = {'name': 'FT'}

The problem arises with a call to:

class openpyxl.workbook.names.external.External(name, refersTo, sheetId=None)

A look at the XML passed to parse_ranges(xml):

--- xml ---

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <sheetNames>
      <sheetName val="sample"/>
    </sheetNames>
    <definedNames><definedName name="FT"/></definedNames>
    <sheetDataSet><sheetData sheetId="0"/></sheetDataSet>
  </externalBook>
</externalLink>

--------------

So, the refersTo-part is not in the dictionary n.attrib = {'name': 'FT'}
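
The failure can be reproduced without openpyxl. What follows is a minimal sketch, assuming a stand-in class (StrictRange, a hypothetical name) whose constructor mirrors the signature quoted above, and a cut-down XML fragment with an assumed root element. It is not the library's code, only an illustration of why a missing refersTo attribute raises a TypeError:

```python
from xml.etree.ElementTree import fromstring

# Main SpreadsheetML namespace (the value openpyxl refers to as SHEET_MAIN_NS).
SHEET_MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


class StrictRange(object):
    """Hypothetical stand-in: refersTo is a required argument."""

    def __init__(self, name, refersTo, sheetId=None):
        self.name = name
        self.refersTo = refersTo
        self.sheetId = sheetId


# Cut-down fragment (assumed root); only the definedNames part matters here.
XML = (
    '<externalBook xmlns="%s">'
    '<definedNames><definedName name="FT"/></definedNames>'
    '</externalBook>' % SHEET_MAIN_NS
)

book = fromstring(XML)
node = book.find(
    '{%s}definedNames/{%s}definedName' % (SHEET_MAIN_NS, SHEET_MAIN_NS)
)
attrs = dict(node.attrib)  # only 'name' is present

try:
    StrictRange(**attrs)
    error = None
except TypeError as exc:
    # refersTo is required by __init__ but absent from the attributes
    error = exc
```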

This load_workbook function loaded without problems with openpyxl 2.0.5/1.8.4

I will gladly re-post to the bug tracker once it works again...

Cheers,
Marco


Titelanalyse_USA3.xlsm

Charlie Clark

Jan 21, 2015, 5:45:01 AM
to openpyx...@googlegroups.com
On .01.2015, 10:37, Marco Aschwanden <m.asch...@gmail.com> wrote:

> Sorry for posting a bug report here - but the following site seems to be
> down: https://bitbucket.org/openpyxl/openpyxl/issues

Yes, please do. Seems to be working at the moment.

I'm not sure if it's related to:
https://bitbucket.org/openpyxl/openpyxl/issue/406/excel-gives-an-error-opening-the-xlsm-file

> Setting:
> - Ubuntu 14.0.4
> - Python 2.7.6
> - openpyxl 2.1.4
> In our code we load a workbook (attached: Titelanalyse_USA3.xlsm):
> wb = load_workbook(filename=file_path, data_only=True, guess_types=True)

I think you probably don't want guess_types=True for an existing file, but
you probably know what you're doing.

> And while reading the file an exception is thrown on line 85:
> /usr/local/lib/python2.7/dist-packages/openpyxl/workbook/names/external.py
> in parse_ranges
>
>     def parse_ranges(xml):
>         tree = fromstring(xml)
>         book = tree.find('{%s}externalBook' % SHEET_MAIN_NS)
>         names = book.find('{%s}definedNames' % SHEET_MAIN_NS)
>         for n in safe_iterator(names, '{%s}definedName' % SHEET_MAIN_NS):
>             yield ExternalRange(**n.attrib)  # <------ Throws: __init__() takes at least 3 arguments (2 given) --> n.attrib = {'name': 'FT'}

> The problem arises with a call to:
> class openpyxl.workbook.names.external.External(name, refersTo,
> sheetId=None)

I don't know why I didn't do it at the time, but refersTo can be optional.
Simply setting the descriptor to String(allow_none=True) and making None
the default value for refersTo in the __init__ will avoid this. Just tried
this locally and it's fine but I get an exception elsewhere trying to save
the file.
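
The idea of making refersTo optional can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual patch: LenientRange is a hypothetical stand-in class, the descriptor change (String(allow_none=True)) is left out because it depends on openpyxl internals, the standard library's findall replaces safe_iterator, and the second definedName in the sample XML is invented for comparison.

```python
from xml.etree.ElementTree import fromstring

# Main SpreadsheetML namespace (the value openpyxl refers to as SHEET_MAIN_NS).
SHEET_MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"


class LenientRange(object):
    """Hypothetical stand-in: refersTo defaults to None."""

    def __init__(self, name, refersTo=None, sheetId=None):
        self.name = name
        self.refersTo = refersTo
        self.sheetId = sheetId


def parse_ranges(xml):
    """Yield one LenientRange per definedName element in the XML."""
    tree = fromstring(xml)
    book = tree.find('{%s}externalBook' % SHEET_MAIN_NS)
    names = book.find('{%s}definedNames' % SHEET_MAIN_NS)
    if names is None:
        return
    for n in names.findall('{%s}definedName' % SHEET_MAIN_NS):
        yield LenientRange(**n.attrib)


# Assumed sample: one name without refersTo (as in the report), one with.
XML = (
    '<externalLink xmlns="%s"><externalBook>'
    '<definedNames>'
    '<definedName name="FT"/>'
    '<definedName name="Total" refersTo="=sample!$A$1"/>'
    '</definedNames>'
    '</externalBook></externalLink>' % SHEET_MAIN_NS
)

ranges = list(parse_ranges(XML))
```

With the default in place, the element that carries only a name attribute no longer raises; its refersTo is simply None.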

I'm currently trying to avoid a 2.1.5 release (fear of the merge due to
copyright date changes) but as 2.2 won't be ready for a while (too many of
my fingers in too many pies), it's probably going to be inevitable.

> So, the refersTo-part is not in the dictionary n.attrib = {'name': 'FT'}
> This load_workbook function loaded without problems with openpyxl
> 2.0.5/1.8.4

Support for external links was only added in 2.1 and is still a bit rough,
so you get no errors in earlier versions because they are just ignored
(and stripped out). Seeing as you're disabling macros and working with the
values only, I suspect you're not interested in preserving everything in
the workbook anyway.

Charlie
--
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Kronenstr. 27a
Düsseldorf
D- 40217
Tel: +49-211-600-3657
Mobile: +49-178-782-6226

Marco Aschwanden

Jan 21, 2015, 7:57:15 AM
to openpyx...@googlegroups.com
2015-01-21 11:44 GMT+01:00 Charlie Clark <charli...@clark-consulting.eu>:
> On .01.2015, 10:37, Marco Aschwanden <m.asch...@gmail.com> wrote:
>
>> Sorry for posting a bug report here - but the following site seems to be
>> down: https://bitbucket.org/openpyxl/openpyxl/issues
>
> Yes, please do. Seems to be working at the moment.


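Sorry, I still get 'Website not available'... hmmm... I checked with system administration, and it seems they had a firewall block in place. :( It has been removed now... I will put the initial request into the issue queue in a moment.

> I'm not sure if it's related to:
> https://bitbucket.org/openpyxl/openpyxl/issue/406/excel-gives-an-error-opening-the-xlsm-file

Not really... is it?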

 
>> Setting:
>> - Ubuntu 14.0.4
>> - Python 2.7.6
>> - openpyxl 2.1.4
>> In our code we load a workbook (attached: Titelanalyse_USA3.xlsm):
>> wb = load_workbook(filename=file_path, data_only=True, guess_types=True)
>
> I think you probably don't want guess_types=True for an existing file, but you probably know what you're doing.


It is a generic importer of xlsx/m files, so we do not know what type of field is in each cell... it is a read-only import, though...

 
>> And while reading the file an exception is thrown on line 85:
>> /usr/local/lib/python2.7/dist-packages/openpyxl/workbook/names/external.py
>> in parse_ranges
>>
>>     def parse_ranges(xml):
>>         tree = fromstring(xml)
>>         book = tree.find('{%s}externalBook' % SHEET_MAIN_NS)
>>         names = book.find('{%s}definedNames' % SHEET_MAIN_NS)
>>         for n in safe_iterator(names, '{%s}definedName' % SHEET_MAIN_NS):
>>             yield ExternalRange(**n.attrib)  # <------ Throws: __init__() takes at least 3 arguments (2 given) --> n.attrib = {'name': 'FT'}
>>
>> The problem arises with a call to:
>> class openpyxl.workbook.names.external.External(name, refersTo, sheetId=None)
>
> I don't know why I didn't do it at the time, but refersTo can be optional. Simply setting the descriptor to String(allow_none=True) and making None the default value for refersTo in the __init__ will avoid this. Just tried this locally and it's fine but I get an exception elsewhere trying to save the file.

In my case this would be sufficient - I am reading the file only.
 

> I'm currently trying to avoid a 2.1.5 release (fear of the merge due to copyright date changes) but as 2.2 won't be ready for a while (too many of my fingers in too many pies), it's probably going to be inevitable.


Could you tell
 
>> So, the refersTo-part is not in the dictionary n.attrib = {'name': 'FT'}
>> This load_workbook function loaded without problems with openpyxl
>> 2.0.5/1.8.4
>
> Support for external links was only added in 2.1 and is still a bit rough, so you get no errors in earlier versions because they are just ignored (and stripped out). Seeing as you're disabling macros and working with the values only, I suspect you're not interested in preserving everything in the workbook anyway.

As I said... I only read the file(s). The macros are ignored anyway - I even tried exporting the xlsm to xlsx and importing that --> same issue.
 
What patch would resolve the issue? Should I patch ExternalRange.__init__?

Regards,
Marco

Charlie Clark

Jan 21, 2015, 8:48:57 AM
to openpyx...@googlegroups.com
On .01.2015, 13:56, Marco Aschwanden <m.asch...@gmail.com> wrote:

> Not really... is it?

I guess not, was just mentally grouping issues related to externalLinks.

>>> Setting:
>>> - Ubuntu 14.0.4
>>> - Python 2.7.6
>>> - openpyxl 2.1.4
>>> In our code we load a workbook (attached: Titelanalyse_USA3.xlsm):
>>> wb = load_workbook(filename=file_path, data_only=True, guess_types=True)
>>
>> I think you probably don't want guess_types=True for an existing file,
>> but you probably know what you're doing.
>
> It is a generic importer of xlsx/m files, so we do not know, what type of
> field is in the cell... it is a read-only import though...

Then you probably don't want it. It really only makes sense for untyped
imports where you trust openpyxl to guess what's a number and what isn't.
Eric says he has customers with such crap in existing workbooks so it
makes sense for them.
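
For readers unfamiliar with the option, "guessing" here means turning cell text such as "3.5" into a number. A rough sketch of the idea, assuming a hypothetical helper named guess_value (this is not openpyxl's implementation, which may apply different rules):

```python
def guess_value(text):
    """Return an int or float if the text looks numeric, else the text itself."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            # Not parseable with this type; try the next one.
            continue
    return text


guessed = [guess_value(s) for s in ("42", "3.5", "FT")]
```

A conversion like this only helps when the stored values are untyped text, which is the case described above.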

> In my case this would be sufficient - I am reading the file only.

>> I'm currently trying to avoid a 2.1.5 release (fear of the merge due to
>> copyright date changes) but as 2.2 won't be ready for a while (too many
>> of my fingers in too many pies), it's probably going to be inevitable.
>
> Could you tell

Could I tell what? ;-)

>>> So, the refersTo-part is not in the dictionary n.attrib = {'name': 'FT'}
>>> This load_workbook function loaded without problems with openpyxl
>>> 2.0.5/1.8.4
>>
>> Support for external links was only added in 2.1 and is still a bit
>> rough, so you get no errors in earlier versions because they are just
>> ignored (and stripped out). Seeing as you're disabling macros and working
>> with the values only, I suspect you're not interested in preserving
>> everything in the workbook anyway.
>
> As I said... I only read the file(s). The macros are ignored anyway - I
> even tried to export the xlsm to xlsx and tried an import --> same issue.

> What patch would resolve the issue? Should I patch
> ExternalRange.__init__?

I've committed the changes to the 2.1 and 2.2 branches, so you can fill your
boots. I will need to create a new issue myself on why I can't save the file.