crash on reading complex xl file - way to ignore some sheets or import errors?

129 views
Skip to first unread message

Sylvain Thénault

unread,
Jun 17, 2022, 9:50:15 AM6/17/22
to openpyxl-users
Hello,

I'm having a problem on attempting to read a complex xls file with dozen sheets. Here is a traceback (I already fixed one by deleting some cell content):

```
      File "Rapport audit.py", line 125, in generate
        with pd.ExcelWriter(
      File "pandas/io/excel/_openpyxl.py", line 66, in __init__
        self.book = load_workbook(self.handles.handle, **engine_kwargs)
      File "openpyxl/reader/excel.py", line 311, in load_workbook
        reader.read()
      File "openpyxl/reader/excel.py", line 278, in read
        self.read_worksheets()
      File "openpyxl/reader/excel.py", line 254, in read_worksheets
        charts, images = find_images(self.archive, rel.target)
      File "openpyxl/reader/drawings.py", line 39, in find_images
        cs = get_rel(archive, deps, rel.id, ChartSpace)
      File "openpyxl/packaging/relationship.py", line 145, in get_rel
        obj = cls.from_tree(tree)
      File "openpyxl/descriptors/serialisable.py", line 87, in from_tree
        obj = desc.expected_type.from_tree(el)
      File "openpyxl/descriptors/serialisable.py", line 87, in from_tree
        obj = desc.expected_type.from_tree(el)
      File "openpyxl/descriptors/serialisable.py", line 87, in from_tree
        obj = desc.expected_type.from_tree(el)
      [Previous line repeated 1 more time]
      File "openpyxl/descriptors/serialisable.py", line 103, in from_tree
        return cls(**attrib)
      File "openpyxl/chart/layout.py", line 56, in __init__
        self.y = y
      File "openpyxl/descriptors/nested.py", line 35, in __set__
        super(Nested, self).__set__(instance, value)
      File "openpyxl/descriptors/base.py", line 104, in __set__
        raise ValueError("Min value is {0}".format(self.min))
    ValueError: Min value is -1
```

This occurs on a sheet with several charts. I understand this occurs because the file, though generated by XL itself, doesn't follow the spec strictly, right?

However, I still would like to read it... For what it's worth, my use case is to read this complex XL file, regenerate a single (simple) sheet within it then close it. Isn't there a way to read/write this single sheet and/or to ignore reading errors? In my case the rewritten sheet is used as source in several other sheets, dunno the implication of this...

Many thanks for any help !
Cheers  

Charlie Clark

unread,
Jun 20, 2022, 4:53:00 AM6/20/22
to openpyxl-users
On 17 Jun 2022, at 15:50, Sylvain Thénault wrote:

> This occurs on a sheet with several charts. I understand this occurs
>
> because the file, though generated by XL itself, doesn't follow the spec
>
> strictly, right?

It certainly looks like that but I'd need to check a file to be sure. I did spend a while checking the chart layout element with the OOXML working group because it's very poorly documented.

> However, I still would like to read it... For what it's worth, my use case
>
> is to read this complex XL file, regenerate a single (simple) sheet within
>
> it then close it. Isn't there a way to read/write this single sheet and/or
>
> to ignore reading errors? In my case the rewritten sheet is used as source
>
> in several other sheets, dunno the implication of this...

You can read individual sheets in read-only mode and this will avoid the charts. Apart from that you'd have to monkey patch the code to ignore the charts. We don't provide any fine-grained controls for this, because it would clog up the interface and we think that specification violations should be flagged as such.

Charlie

--
Charlie Clark
Managing Director
Clark Consulting & Research
German Office
Sengelsweg 34
Düsseldorf
D- 40489
Tel: +49-203-3925-0390
Mobile: +49-178-782-6226

Sylvain Thénault

unread,
Jun 20, 2022, 10:18:43 AM6/20/22
to openpyxl-users
Hello Charlie, thank you for your response. Some questions below.

Le lundi 20 juin 2022 à 10:53:00 UTC+2, charli...@clark-consulting.eu a écrit :
On 17 Jun 2022, at 15:50, Sylvain Thénault wrote:

> This occurs on a sheet with several charts. I understand this occurs
>
> because the file, though generated by XL itself, doesn't follow the spec
>
> strictly, right?

It certainly looks like that but I'd need to check a file to be sure. I did spend a while checking the chart layout element with the OOXML working group because it's very poorly documented.

File is not mine but I'll check with its authors if I can share somehow. 

> However, I still would like to read it... For what it's worth, my use case
>
> is to read this complex XL file, regenerate a single (simple) sheet within
>
> it then close it. Isn't there a way to read/write this single sheet and/or
>
> to ignore reading errors? In my case the rewritten sheet is used as source
>
> in several other sheets, dunno the implication of this...

You can read individual sheets in read-only mode and this will avoid the charts. Apart from that you'd have to monkey patch the code to ignore the charts. We don't provide any fine-grained controls for this, because it would clog up the interface and we think that specification violations should be flagged as such.

I see in the doc how to load a whole workbook in read-only mode, but I can't see how I could read some sheets read-only and another read-write (or alternativly to read/write a single sheet and ignore others), do you have a clue? 

best regards

Charlie Clark

unread,
Jun 20, 2022, 11:53:58 AM6/20/22
to openpyxl-users
On 20 Jun 2022, at 16:18, Sylvain Thénault wrote:

> I see in the doc how to load a whole workbook in read-only mode, but I
>
> can't see how I could read some sheets read-only and another read-write (or
>
> alternativly to read/write a single sheet and ignore others), do you have a
>
> clue?

You don't get to pick and choose modes. Read-only is for the whole workbook. Worksheets are read on demand and most of their associated data like charts aren't.
Reply all
Reply to author
Forward
0 new messages