Processing fails for scheduled events (unable to load metadata from catalog)

39 views
Skip to first unread message

Katrin S

unread,
Sep 5, 2023, 12:17:40 PMSep 5
to Opencast Users
Hi all,

I am in the process of debugging a strange race condition with the processing of scheduled events from capture agents.

In most cases, workflows will fail in the "inspect" step because the file "dublincore.xml" cannot be found, for example:

```
17767 2023-09-05T17:17:24,137 | ERROR | (DublinCoreUtil:79) - Unable to load metadata from catalog 'https://XXX/files/mediapackage/c9348f0b-ffa0-46e2-9dec-2453827e8907/e8e89c83-534c-4d8f-90ea-6d605d1f9f54/dublincore.xml'
```
The reason seems to be that Opencast is merging the catalog/mediapackage from the capture agent with the one from the asset manager (legacy mediapackage). However, dublincore.xml is not made available at the file repo location of the (legacy) mediapackage from Opencast.

When I check the file system, I can see dublincore.xml at 

files/mediapackage/{ID FROM CA}/e8e89c83-534c-4d8f-90ea-6d605d1f9f54/dublincore.xml'

but *not* 

files/mediapackage/{LEGACY ID FROM OC}/e8e89c83-534c-4d8f-90ea-6d605d1f9f54/dublincore.xml'

More detailed log from admin node:
```
2023-09-05T17:17:24,088 | WARN  | (WorkspaceImpl:418) - The working file repository URI and paths don't match. Looking up https://XXX/files/mediapackage/c9348f0b-ffa0-46e2-9dec-2453827e8907/e8e89c83-534c-4d8f-90ea-6d605d1f9f54/dublincore.xml at /srv/opencast-data/opencast/files/mediapackage/c9348f0b-ffa0-46e2-9dec-2453827e8907/e8e89c83-534c-4d8f-90ea-6d605d1f9f54/dublincore.xml failed
17766 2023-09-05T17:17:24,091 | WARN  | (WorkspaceImpl:386) - The working file repository and workspace paths don't match. Looking up https://XXX/files/mediapackage/c9348f0b-ffa0-46e2-9dec-2453827e8907/e8e89c83-534c-4d8f-90ea-6d605d1f9f54/dublincore.xml at /srv/opencast-data/opencast/files/mediapackage/c9348f0b-ffa0-46e2-9dec-2453827e8907/e8e89c83-534c-4d8f-90ea-6d605d1f9f54/dublincore.xml failed
17767 2023-09-05T17:17:24,137 | ERROR | (DublinCoreUtil:79) - Unable to load metadata from catalog 'https://XXX/files/mediapackage/c9348f0b-ffa0-46e2-9dec-2453827e8907/e8e89c83-534c-4d8f-90ea-6d605d1f9f54/dublincore.xml'
```

Log from remote worker:
```
2023-09-05T17:17:21,237 | INFO  | (IngestDownloadServiceImpl:291) - Downloaded the external element https://XXX/assets/assets/c9348f0b-ffa0-46e2-9dec-2453827e8907/8354f909-ce2f-4603-aa20-e12c62c123e7/0/dublincore.xml
2023-09-05T17:17:21,256 | INFO  | (IngestDownloadServiceImpl:291) - Downloaded the external element https://XXX/assets/assets/c9348f0b-ffa0-46e2-9dec-2453827e8907/e8e89c83-534c-4d8f-90ea-6d605d1f9f54/0/dublincore.xml
2023-09-05T17:17:21,275 | INFO  | (IngestDownloadServiceImpl:291) - Downloaded the external element https://XXX/assets/assets/c9348f0b-ffa0-46e2-9dec-2453827e8907/security-policy-series/0/xacml.xml
2023-09-05T17:17:21,309 | INFO  | (IngestDownloadServiceImpl:291) - Downloaded the external element https://XXX/assets/assets/c9348f0b-ffa0-46e2-9dec-2453827e8907/security-policy-episode/0/xacml.xml
```

I also checked the file repo location, on all nodes, I have:
prop.org.opencastproject.admin.ui.url=https://OURADMIN
prop.org.opencastproject.file.repo.url=${prop.org.opencastproject.admin.ui.url}


Does someone have any insight on how this merging process works and how the files are made available? 
I suspect some kind of timing issue, because it sometimes works , but I don't fully understand the underlying process.

Any help/hints are much appreciated.

Katrin

Seeland, Per Pascal

unread,
Sep 6, 2023, 1:09:56 AMSep 6
to us...@opencast.org

Hi Katrin,

 

I don’t remember having issues with the dublincore.xml itsself, but we habe also some issues some time ago with files not being present. I did not have time to dig into the code but we do since then mount all nfsshares on the workers with the sync option, which did solve out issues.

 

Kind regards,

Pascal

--
To unsubscribe from this group and stop receiving emails from it, send an email to users+un...@opencast.org.

Katrin Ihler

unread,
Sep 15, 2023, 4:08:56 AMSep 15
to us...@opencast.org

Hi Katrin, Pascal,

some of our clients have also been experiencing this issue. Out of curiosity - which operating system are you running on your servers? Also which Opencast version are you using and can you give me some information on your NFS? Trying to find a common denominator here.

Cheers,

Katrin

-- 
ELAN e.V.
Heger-Tor-Wall 12
D-49074 Osnabrück
https://elan-ev.de

Katrin S

unread,
Oct 5, 2023, 5:25:51 AMOct 5
to Opencast Users, Katrin Ihler
Hi Katrin,

please excuse the delayed answer, I was out of office the past two weeks.

Currently we use:
- Debian 11/Bullseye
- Opencast 14.2 (deb packages)
- Separate nodes for admin, presentation, worker
- NFS Server: NetApp, mounted with options rw,hard,intr,rsize=32768,wsize=32768

The issue happens with any scheduled event (tested with pyCA and Extron SMP).

Perhaps NetApp's request throttling mechanism has something to do with it. Although it's still weird, that it happens so consistently at this specific step and we don't notice any issues with e.g. uploading.

Setting the kernel parameters:
sunrpc.tcp_max_slot_table_entries = 65535
sunrpc.tcp_slot_table_entries = 2

might be worth a try (will try that myself, too)

- Katrin


Katrin S

unread,
Oct 12, 2023, 4:01:29 AMOct 12
to Opencast Users, Katrin S, Katrin Ihler
Another observation we now made is that longer events with larger video files do not seem to be affected at all, so it is probably some kind of timing issue. 
Luckily, this also means it's not a huge problem in praxis. Of course it would still be good to find a solution since we can never be 100% sure.

Reply all
Reply to author
Forward
0 new messages