avhrr_l1b_eps reader broken in satpy 0.45 and 0.46


lobsiger...@gmail.com

Dec 22, 2023, 2:08:05 PM
to pytroll
Dear developers,

a user of my satpy scripts could not read his EUMETCast GDS Metop
AVHRR files anymore. I had no problems with satpy 0.44. I updated to
satpy 0.46 and ran into the same problems. Downgrading to 0.45
did not help. The "cloud_flag" was added between 0.44 and 0.45;
I'm not sure whether this has anything to do with the issue. There are no
changes to the file pattern definitions, but the reader says "ValueError:
No supported files found". For error messages and more details, see:

https://groups.io/g/MSG-1/topic/103296673#36134

Cheers,
Ernst

David Hoese

Dec 22, 2023, 3:23:38 PM
to pyt...@googlegroups.com
Hi Ernst,

Thanks for letting us know. Just to be sure, what versions of xarray and dask are being used? When Satpy was updated for you, were any other libraries updated at the same time ("a lot" is an OK answer)?

What version of Python are you and your user using?
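
A minimal sketch for collecting those answers in one go, assuming Python 3.8+ (for `importlib.metadata`). The helper name `report_versions` and the default package list are illustrative choices, not something provided by Satpy:

```python
from importlib import metadata
import platform


def report_versions(packages=("satpy", "xarray", "dask", "defusedxml")):
    """Return a dict mapping each package name to its installed version, or None."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return versions


if __name__ == "__main__":
    print("python", platform.python_version())
    for name, version in report_versions().items():
        print(name, version or "NOT INSTALLED")
```

Running this inside the activated environment before and after an update makes it easy to see which libraries changed alongside Satpy.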

While trying to test this locally I noticed I got an import error. Could you make sure that "defusedxml" is installed in your environment?

Lastly, I did want to mention that a couple days ago I made a fix to the `eps_l1b.py` module related to documentation generation and how it defined a function used by dask (https://github.com/pytroll/satpy/pull/2700). However, this wasn't causing any import or other testing issues so I'm skeptical that it is the issue here.

Dave

lobsiger...@gmail.com

Dec 23, 2023, 6:06:33 AM
to pytroll
Hi Dave,

I started with two EUMETCast receivers, Luna and Kallisto. Both had
satpy 0.44 and were working fine. I updated Kallisto to satpy
0.46 and *YES*, a lot of libraries were updated as well.
Kallisto stopped working with AVHRR EPS. I downgraded satpy
on Kallisto to 0.45 and finally to 0.44 to make it work again.
Neither of my receivers has ever had "defusedxml" installed so far.
The Python version itself did not change during any of this:

# Name                    Version                   Build  Channel
brotli-python             1.0.9           py311ha362b79_9    conda-forge
msgpack-python            1.0.7           py311h9547e67_0    conda-forge
python                    3.11.7          hab00c5b_0_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python-geotiepoints       1.7.1           py311h1f0f07a_0    conda-forge
python-tzdata             2023.3             pyhd8ed1ab_0    conda-forge
python_abi                3.11                    4_cp311    conda-forge

Dask did not change in any of my experiments:
# Name                    Version                   Build  Channel
dask                      2023.12.1          pyhd8ed1ab_0    conda-forge
dask-core                 2023.12.1          pyhd8ed1ab_0    conda-forge

The xarray versions found on my receivers after first updating Kallisto to
satpy 0.46 and then downgrading it to satpy 0.44 (Luna has not been touched):

(pytroll) eumetcast@luna:~/SPStools/DEVscripts$ grep xarray luna_list044.txt
rioxarray                 0.15.0             pyhd8ed1ab_0    conda-forge
xarray                    2023.10.1          pyhd8ed1ab_0    conda-forge

(pytroll) eumetcast@kallisto:~/SPStools/DEVscripts$ grep xarray kallisto_list044.txt
rioxarray                 0.14.1             pyhd8ed1ab_0    conda-forge
xarray                    2023.12.0          pyhd8ed1ab_0    conda-forge


Following the lines of your post I upgraded Kallisto to satpy 0.45
again. The AVHRR driver didn't work. Then I added "defusedxml" and
everything was fine. I updated to the latest satpy 0.46 and this
works as well. I attach the DEBUG output from receiver Kallisto
using satpy 0.45 again without and with "defusedxml" installed.

Is "defusedxml" simply a dependency that was forgotten in the
satpy 0.45 and 0.46 updates? Or is it something that helps you
debug the underlying problem in the EPS driver?


Best regards,
Ernst
kallisto_045_defusedxml_debug.txt
kallisto_045_debug.txt

lobsiger...@gmail.com

Dec 23, 2023, 8:30:53 AM
to pytroll
Hi Dave,

Googling "what is defusedxml" is not much fun. I find entries like:

https://www.linkedin.com/pulse/protecting-your-application-from-xml-based-attacks-importance-koshy

Should we be worried about using older satpy versions? Does it mean my GNU/Linux servers might already have a problem?
Or do satpy 0.45 and 0.46 simply require "defusedxml" without correctly handling the case where it is missing?

Ernst

lobsiger...@gmail.com

Dec 23, 2023, 10:57:56 AM
to pytroll
Dave and Martin,

I have no idea whether or not this additional problem is related.
As I now have xarray 2023.12.0 on PC Kallisto I can also confirm
that OLCI files cannot be handled anymore as described here:



That's what I get:

(pytroll) eumetcast@kallisto:~/SPStools/DEVscripts$ python Sen3B-area.py 20231223DAY
Satellite Sen3B is expected to make an ideal DAY overhead pass in the center of area "a23a" at 12:21 UTC.
Add worst case EUMETCast channel timeliness + 1 hour slack to above time for scheduling images from this area!
 1 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa01_radiance.nc
 2 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa02_radiance.nc
 3 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa03_radiance.nc
 4 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa04_radiance.nc
 5 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa05_radiance.nc
 6 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa06_radiance.nc
 7 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa07_radiance.nc
 8 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa08_radiance.nc
 9 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa09_radiance.nc
10 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa10_radiance.nc
11 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa11_radiance.nc
12 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa12_radiance.nc
13 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa13_radiance.nc
14 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa14_radiance.nc
15 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa15_radiance.nc
16 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa16_radiance.nc
17 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa17_radiance.nc
18 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa18_radiance.nc
19 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa19_radiance.nc
20 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa20_radiance.nc
21 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/Oa21_radiance.nc
22 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/geo_coordinates.nc
23 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/instrument_data.nc
24 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/tie_geometries.nc
25 --> S3B_OL_1_ERR____20231223T120608_20231223T125013_20231223T140956_2645_087_337______MAR_O_NR_002.SEN3/tie_meteo.nc
Script: Sen3B-area.py 20231223DAY POI: lat=-63.1 lon=-52.8 ran=10.0
Maximum elevation of satellite Sentinel-3B is at telmax=12:37 UTC
/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/xarray/core/dataset.py:282: UserWarning: The specified chunks separate the stored chunks along dimension "tie_rows" starting at index 4096. This could degrade performance. Instead, consider rechunking after loading.
  warnings.warn(
/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/xarray/namedarray/core.py:487: UserWarning: Duplicate dimension names present: dimensions {'bands'} appear more than once in dims=('bands', 'bands'). We do not yet support duplicate dimension names, but we do allow initial construction of the object. We recommend you rename the dims immediately to become distinct, as most xarray functionality is likely to fail silently if you do not. To rename the dimensions you will need to set the ``.dims`` attribute of each variable, ``e.g. var.dims=('x0', 'x1')``.
  warnings.warn(
/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/xarray/core/dataset.py:282: UserWarning: The specified chunks separate the stored chunks along dimension "rows" starting at index 4096. This could degrade performance. Instead, consider rechunking after loading.
  warnings.warn(
Traceback (most recent call last):
  File "/home/eumetcast/SPStools/DEVscripts/Sen3B-area.py", line 132, in <module>
    Yea, Mon, Day, Hou, Min, height = leo_images(Yea, Mon, Day, sat, NoD, False, segdir, False, isbulk, 'olci_l1b',
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/eumetcast/SPStools/DEVscripts/LEOstuff.py", line 1669, in leo_images
    scn = Scene(filenames=bestfiles, reader=reader)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/satpy/scene.py", line 152, in __init__
    self._readers = self._create_reader_instances(filenames=filenames,
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/satpy/scene.py", line 173, in _create_reader_instances
    return load_readers(filenames=filenames,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/satpy/readers/__init__.py", line 575, in load_readers
    reader_instance.create_filehandlers(
  File "/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/satpy/readers/yaml_reader.py", line 616, in create_filehandlers
    filehandlers = self._new_filehandlers_for_filetype(filetype_info,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/satpy/readers/yaml_reader.py", line 604, in _new_filehandlers_for_filetype
    return list(filtered_iter)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/satpy/readers/yaml_reader.py", line 572, in filter_fh_by_metadata
    for filehandler in filehandlers:
  File "/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/satpy/readers/yaml_reader.py", line 513, in _new_filehandler_instances
    yield filetype_cls(filename, filename_info, filetype_info, *req_fh, **fh_kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/satpy/readers/olci_nc.py", line 172, in __init__
    self.cal = cal.nc
               ^^^^^^
  File "/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/functools.py", line 1001, in __get__
    val = self.func(instance)
          ^^^^^^^^^^^^^^^^^^^
  File "/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/satpy/readers/olci_nc.py", line 121, in nc
    dataset = xr.open_dataset(f_obj,
              ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/xarray/backends/api.py", line 578, in open_dataset
    ds = _dataset_from_backend_dataset(
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/xarray/backends/api.py", line 371, in _dataset_from_backend_dataset
    ds = _chunk_ds(
         ^^^^^^^^^^
  File "/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/xarray/backends/api.py", line 336, in _chunk_ds
    variables[name] = _maybe_chunk(
                      ^^^^^^^^^^^^^
  File "/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/xarray/core/dataset.py", line 327, in _maybe_chunk
    var = var.chunk(
          ^^^^^^^^^^
  File "/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/xarray/core/variable.py", line 1042, in chunk
    chunks = {self.get_axis_num(dim): chunk for dim, chunk in chunks.items()}
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/xarray/core/variable.py", line 1042, in <dictcomp>
    chunks = {self.get_axis_num(dim): chunk for dim, chunk in chunks.items()}
              ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/xarray/namedarray/core.py", line 661, in get_axis_num
    return self._get_axis_num(dim)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/xarray/namedarray/core.py", line 664, in _get_axis_num
    _raise_if_any_duplicate_dimensions(self.dims)
  File "/home/eumetcast/miniconda3/envs/pytroll/lib/python3.11/site-packages/xarray/namedarray/core.py", line 867, in _raise_if_any_duplicate_dimensions
    raise ValueError(
ValueError: This function cannot handle duplicate dimensions, but dimensions {'bands'} appear more than once on this object's dims: ('bands', 'bands')
(pytroll) eumetcast@kallisto:~/SPStools/DEVscripts$



I will try to downgrade xarray now.

Merry Christmas,
Ernst

David Hoese

Dec 25, 2023, 3:05:57 PM
to pyt...@googlegroups.com
Ernst,

I found the relevant commits in git and it looks like you've found the relevant information in your searches. In an attempt to make one of our security checkers happy, Martin Raspaud switched from the "lxml" module to "defusedxml" for parsing XML, as it is meant to be safer. It looks like defusedxml wasn't added anywhere in the setup.py dependency lists, including the "extras". I'd say it could definitely be added as an "extra", but this wouldn't have added it to your environments as-is anyway. I'll try to talk to Martin about this after the holidays.

Bottom line: It looks like defusedxml is required now for any XML-based readers.
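
A quick way to check an environment for this, sketched with the standard library only. `missing_modules` is a hypothetical helper name, and the install hint assumes a conda-forge based environment like the one listed earlier in this thread:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["defusedxml"])
    if missing:
        print("Missing:", ", ".join(missing))
        # Assumed install route; pip would work as well.
        print("Install with, e.g.: conda install -c conda-forge defusedxml")
    else:
        print("defusedxml is importable")
```

Because the reader only reports "No supported files found" when the module is absent, a check like this before creating a Scene can save some head-scratching.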

As for security for past versions, it depends how "untrusted" these past XML inputs were. We mostly made this change to make the security checks happy, but in most (if not all) cases Satpy is reading from trusted XML sources. So the security concerns related to these readers using "lxml" are limited in my opinion.

Dave

David Hoese

Dec 25, 2023, 3:07:59 PM
to pyt...@googlegroups.com
Ernst,

This is a known issue. See:

https://github.com/pytroll/satpy/issues/2705

The files have a variable with dimensions (bands, bands) and this breaks the newest xarray's expectations about dimensions and file structure. So in my opinion this is an xarray bug, and hopefully a bug report will be filed with xarray and the problem fixed there.
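
To illustrate the condition xarray is rejecting, here is a small sketch that flags repeated dimension names. `duplicate_dims` is a hypothetical helper and the variable names in the example are made up; only the ('bands', 'bands') layout comes from the traceback earlier in this thread:

```python
from collections import Counter


def duplicate_dims(dims):
    """Return the dimension names that occur more than once in `dims`."""
    counts = Counter(dims)
    return [name for name, count in counts.items() if count > 1]


# Made-up variable layout; only ('bands', 'bands') mirrors the reported error.
example_variables = {
    "some_radiance": ("rows", "columns"),
    "some_band_matrix": ("bands", "bands"),
}

for var_name, dims in example_variables.items():
    repeated = duplicate_dims(dims)
    if repeated:
        print(f"{var_name}: repeated dimensions {repeated}")
```

With a real file, the same check could be run over `var.dimensions` for each entry of a netCDF4 `Dataset.variables` mapping to find the variable that trips the error.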

Dave