Hi
We've recently upgraded from version 2.4 of AtoM to version 2.6.1.
We were unable to switch on the
Cache description XML exports upon creation/modification
in the previous version as it put too much pressure on our site and caused
it to go down. During testing of the upgraded site we switched this
functionality back on and did some load testing and it appeared to
manage fine, with jobs being queued up nicely in gearman. However, when
we went live we soon started experiencing issues with the number of jobs
queuing for
arXmlExportSingleFileJob growing and never seeming to
get to the end of the list of jobs, although we could see from the gearman
monitor page that it still appeared to be processing things. However, this
job appeared to have an impact on other jobs e.g.
arObjectMoveJob and
arFindingAidJob, which had items in their queues but no longer appeared to be processing anything. We therefore switched the
Cache description XML exports upon creation/modification
off again to see if this was the cause of the problem. All seemed to
be good for a few days, but now we appear to have hit the same sort of issue
again but this time with the
arUpdateEsIoDocumentsJob. This still seems to be running but
arObjectMoveJob and
arFindingAidJob don't
seem to be processing at all, just queuing everything up. Any ideas
why a job which is being heavily used is affecting other jobs? Is there
something we can do to prevent this from happening?
What does the
arUpdateEsIoDocumentsJob relate
to? An archivist spotted that this was running and updating
descendants after he'd amended an archival description. When I did some
testing I noticed that this job appeared when editing a record that had
the
Name of creator field populated. The archivist hadn't touched that field when editing, so does this job run whether you've altered the
Name of creator field or not?
Another thing I noticed having switched the
Cache description XML exports upon creation/modification off was that the
DC in
the OAI-PMH was still being updated (which was a bit of a surprise to
me, as I didn't think OAI-PMH would be updated with Cache descriptions
switched off). However, the descriptive information in the corresponding
EAD OAI record wasn't updated, even though the
datestamp on the OAI record had changed. Any idea as to why
DC is being updated but not the
EAD?
When I manually re-cache the description via the command line it then
updates the EAD. Any idea as to why we are seeing this behaviour?
Ideally we would like both EAD and DC OAI updated
automatically without human intervention so that the DC can be
harvested by our main library discovery interface Primo in order to keep
that in sync with AtoM and also the EAD so that the Archives Hub can
harvest that automatically to keep our records up-to-date in their
system. Therefore any help you can provide us with here would be very
much appreciated.
Update: It appears that overnight the jobs did all finally complete, with only a few failures on the
arObjectMoveJob. I'm not sure if this was an AtoM issue or
possibly archivists trying to move the same object twice when they didn't
see it move initially. The biggest problem at the moment is that some
of these tasks took 9 hours to complete. This is OK for some tasks but not
for an
arObjectMoveJob task. As mentioned above, I was surprised that
some jobs appeared to be impacted by another job, as I thought each job had
its own resources. Is this how it should be behaving? If so, is there
a way we can configure this better?
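To illustrate what I suspect might be happening, here is a toy Python sketch. It is not AtoM or Gearman code. It assumes (this is only my guess, not confirmed behaviour) that a single worker takes jobs of every type strictly in arrival order, and the job counts and timings are made-up numbers chosen for illustration.

```python
from collections import deque


def simulate_single_worker(queue, seconds_per_job):
    """Process jobs strictly in arrival order with one worker.

    Returns a dict mapping each job name to the time (in seconds)
    at which the first job of that type finished.
    """
    clock = 0
    first_finished = {}
    pending = deque(queue)
    while pending:
        job = pending.popleft()
        clock += seconds_per_job[job]
        # Record only the first completion time for each job type.
        first_finished.setdefault(job, clock)
    return first_finished


# Hypothetical workload: many index-update jobs queued ahead of one move job.
queue = ["arUpdateEsIoDocumentsJob"] * 1000 + ["arObjectMoveJob"]

# Hypothetical durations in seconds, not measured on our site.
seconds_per_job = {"arUpdateEsIoDocumentsJob": 30, "arObjectMoveJob": 5}

finished = simulate_single_worker(queue, seconds_per_job)
print("Move job finished after", finished["arObjectMoveJob"], "seconds")
```

If the real setup behaves anything like this, a quick move job would wait behind every slower job queued before it, which would match the long delays we saw.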
Thanks,
Vicky
Digital Standards Manager
National Library of Wales