Problems with harvesting the repository


ism...@gmail.com

Aug 10, 2023, 5:32:14 AM
to AtoM Users
Good morning. The Hispana website is trying to harvest records from our AtoM repository through OAI-PMH.

They tell us that they cannot complete the harvest: the first 100 records are retrieved, but when they request the following ones with the resumptionToken they get this error:

uiuc.oai.OAIException: Timeout exceeded (60 seconds). Request has been sent 3 times

We ran the test at https://validador.recolecta.fecyt.es and got the following error when harvesting this URL:

https://archivo.dipusevilla.es/;oai?verb=ListIdentifiers&metadataPrefix=oai_dc

Error:

Could not fetch records from repository. Connection errors on https://archivo.dipusevilla.es/;oai?verb=ListRecords&resumptionToken=eyJmcm9tIjoiIiwidW50aWwiOiIiLCJjdXJzb3IiOjI2NDkwMCwibWV0YWRhdGFQcmVmaXgiOiJvYWlfZGMiLCJzZXQiOiIifQ==

It's not clear to me why this is happening. I considered the php.ini settings, but the first 100 records load just fine, and when I open the resumptionToken URL directly in a browser it loads without a problem.

I hope you can help me, thanks in advance and best regards.
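
Editorial note: the resumptionToken in the failing URL above appears to be base64-encoded JSON. The short Python sketch below decodes it to show where in the result list the failing request sits. That the token is decodable at all is an observation about this particular token, not a documented guarantee (OAI-PMH treats tokens as opaque), and the helper name decode_token is made up for illustration.

```python
import base64
import json

# Token copied verbatim from the failing ListRecords URL in the post above.
TOKEN = (
    "eyJmcm9tIjoiIiwidW50aWwiOiIiLCJjdXJzb3IiOjI2NDkwMCwibWV0YWRhdGFQcmVmaXgi"
    "OiJvYWlfZGMiLCJzZXQiOiIifQ=="
)


def decode_token(token):
    """Decode a token that happens to be base64-encoded JSON (an assumption)."""
    return json.loads(base64.b64decode(token).decode("utf-8"))


state = decode_token(TOKEN)
print(state["cursor"])          # → 264900
print(state["metadataPrefix"])  # → oai_dc
```

The cursor value of 264900 shows that the validator failed far into the list rather than on the second page, which is worth keeping in mind when comparing it with the browser test on an early page.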

Dan Gillean

Aug 10, 2023, 9:53:04 AM
to ica-ato...@googlegroups.com
Hi Isabel, 

When I tried to follow the ListIdentifiers link example in this message, I also got a 504 error (i.e. a timeout) returned. I don't think this is a bug in AtoM, but rather something to do with the network connectivity of your server... though it's very difficult to say without more information, and without being a system administrator myself. Some possible first steps: 

Otherwise.... I'm not sure! But you might investigate some of the suggestions in this general article about 504-timeout errors: 
Good luck! And let us know if any of this helps. 

Cheers, 

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056
@accesstomemory
he / him



ism...@gmail.com

Aug 17, 2023, 2:45:59 AM
to AtoM Users
Good morning Dan, here are the tests I have tried so far, without success:
  • Checked that the machine has enough resources.
  • Increased max_execution_time and memory_limit in the PHP-FPM php.ini several times, restarting nginx each time.
  • Checked the running processes. The one consuming the most was the mysql process, although the query it ran finished quickly and did not return an error.
  • Tried decreasing the number of results per page from 100 to 50.
  • Exported all XML using the php symfony cache:xml-representations command.
I am testing from https://validator.oaipmh.com/ but both ListIdentifiers and ListRecords fail with a timeout error.
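
Editorial note: to see which page of a harvest becomes slow, the requests a harvester makes can be replayed by hand. The sketch below follows resumptionTokens and records how long each page takes. It uses only the Python standard library; the function names (http_fetch, walk_pages), the 60-second timeout, and the max_pages cap are illustrative assumptions and are not part of AtoM or of any harvester mentioned in this thread.

```python
import time
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def http_fetch(url, timeout=60):
    """Fetch one OAI-PMH response body; raises on HTTP errors or timeout."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def walk_pages(base_url, verb="ListIdentifiers", metadata_prefix="oai_dc",
               fetch=http_fetch, max_pages=5):
    """Follow resumptionTokens and return a list of (url, seconds) per page."""
    params = {"verb": verb, "metadataPrefix": metadata_prefix}
    timings = []
    for _ in range(max_pages):
        url = base_url + "?" + urllib.parse.urlencode(params)
        start = time.monotonic()
        body = fetch(url)
        timings.append((url, time.monotonic() - start))

        token_el = ET.fromstring(body).find(f".//{OAI_NS}resumptionToken")
        token = (token_el.text or "").strip() if token_el is not None else ""
        if not token:
            break  # an empty or missing token marks the last page
        # In OAI-PMH, resumptionToken is an exclusive argument: send it
        # with the verb only, without metadataPrefix or set.
        params = {"verb": verb, "resumptionToken": token}
    return timings


# Example call (performs real network requests, so it is left commented out):
# for url, seconds in walk_pages("https://archivo.dipusevilla.es/;oai"):
#     print(f"{seconds:6.1f}s  {url}")
```

If the first pages return quickly and later ones approach the harvester's limit (60 seconds in the error quoted earlier), that would point at the server's response time on deep pages rather than at the OAI configuration.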

I have noticed that in the nginx error.log the following error appears:

2023/08/16 09:09:48 [error] 31761#31761: *130530 FastCGI sent in stderr: "PHP message: Component does not exist: "arOaiPlugin", "badVerb"" while reading response header from upstream, server: file.dipusevilla.es, request: "GET /;oai HTTP/1.1", upstream: "fastcgi://unix:/run/php7.0-fpm.atom.sock:", host: "file.dipusevilla.es"

I found a forum topic that discusses this error: https://groups.google.com/g/ica-atom-users/c/GuZ-BfFkLho/m/JY5DUC4HAgAJ. To be honest, I am not sure how the site base URL should be configured. I have it like this:

siteBase.png

And the OAI repository settings look like this. I don't know whether they are correct. Regards.

oai.png

Dan Gillean

Aug 18, 2023, 11:54:12 AM
to ica-ato...@googlegroups.com
Hi Isabel, 

I don't immediately see any glaring issues in what you've shared. I will ask a developer to take a look at this thread and see if they have further ideas, but I am running out of them! 

In the meantime, however, I retried the ListIdentifiers OAI query against your site (the link from your first message), and this time it resolved properly for me! I also got successful responses from the Identify and ListMetadataFormats verbs.

So: we know it CAN work - i.e. that the arOaiPlugin is enabled, that your site recognizes OAI verbs and can generate responses, etc. Why not set the resumption token limit to something much lower for now (like 25), in case that helps with larger responses like ListRecords? And keep monitoring your site to see if you can identify bottlenecks, etc. 

Cheers, 

Dan Gillean, MAS, MLIS

ism...@gmail.com

Sep 20, 2023, 7:10:21 AM
to AtoM Users
Good morning Dan, thank you as always for your response.

Unfortunately, HISPANA is still unable to harvest our OAI repository.

On the one hand, they tell me that the baseURL returned by the Identify verb at https://archivo.dipusevilla.es/;oai?verb=Identify is incorrect. Before I changed it, the response contained <baseURL>https://archivo.dipusevilla.es/index.php</baseURL>.

On the other hand, they tell me that I have to create a set of digital records so that the OAI repository only offers digital documents and not other records.

My questions are:
- Is the baseURL correct as I have it now?
- How can I create a set that contains only digital records?

Thanks in advance, regards.
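
Editorial note: one way to check the first question is to compare the baseURL advertised in the Identify response with the URL the harvester actually requests. The sketch below does this on a hand-written sample response that reuses the index.php value quoted above; the sample XML and the function names are illustrative assumptions, and a real check would parse the live Identify response instead.

```python
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Hand-written sample, trimmed to the one element of interest.
SAMPLE_IDENTIFY = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <Identify>
    <baseURL>https://archivo.dipusevilla.es/index.php</baseURL>
  </Identify>
</OAI-PMH>"""


def advertised_base_url(identify_xml):
    """Return the text of <baseURL> from an Identify response, or None."""
    el = ET.fromstring(identify_xml).find(f"{OAI_NS}Identify/{OAI_NS}baseURL")
    return el.text.strip() if el is not None and el.text else None


def base_url_matches(request_url, identify_xml):
    """True when the advertised baseURL equals the request URL minus its query."""
    return advertised_base_url(identify_xml) == request_url.split("?", 1)[0]


print(base_url_matches("https://archivo.dipusevilla.es/;oai?verb=Identify",
                       SAMPLE_IDENTIFY))  # → False
```

A mismatch like this one is presumably what the harvester was objecting to, since the requests go to the /;oai path while the response advertises /index.php.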

Dan Gillean

Sep 20, 2023, 9:55:40 AM
to ica-ato...@googlegroups.com
Hi Isabel, 

RE: the base URL, you can try omitting the /index.php part of the URL, but otherwise... seems correct to me?

As for the second set of questions, I am not sure what to suggest, as the request is also not clear to me. Do they mean only descriptions with digital objects attached? If yes, there's no easy way to provide just this via AtoM's OAI-PMH configuration. It may not be possible via AtoM to provide them with what they want, short of creating a separate test instance of AtoM that only has your descriptions with digital objects included.

Theoretically, it should be possible to use the OAI-PMH protocol's Set attribute to create a virtual collection of only descriptions with digital objects - but nothing like this currently exists in AtoM, so it would require some fairly extensive analysis and development to implement. 

Regards, 

Dan Gillean, MAS, MLIS

ism...@gmail.com

Sep 21, 2023, 6:09:23 AM
to AtoM Users
Good morning Dan, I have changed the URL as you suggested, so my repository is now at https://archivo.dipusevilla.es/;oai

I have also learned that Hispana uses the MarcEdit application for harvesting. I downloaded it to test, and with the following configuration it tells me that it cannot harvest anything.

marcedit.png

However, if I open https://archivo.dipusevilla.es/;oai?verb=ListSets and paste one of the setSpec values into the set field, for example this one: <setSpec>oai:file.dipusevilla.es:dipusevilla_4</setSpec>

marcedit2.png

With a set filled in, it does harvest. What could be the reason? Thanks in advance and best regards.
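
Editorial note: the difference between the two MarcEdit runs can be reproduced outside the application by building the same requests by hand, once without a set and once with a setSpec taken from ListSets. The sketch below lists the setSpec values in a ListSets response and builds the matching ListRecords URL. The sample XML is hand-written around the setSpec quoted above, and the function names are illustrative assumptions.

```python
import urllib.parse
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# Hand-written sample containing only the setSpec quoted in the post.
SAMPLE_LIST_SETS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set><setSpec>oai:file.dipusevilla.es:dipusevilla_4</setSpec></set>
  </ListSets>
</OAI-PMH>"""


def set_specs(list_sets_xml):
    """Return every setSpec value found in a ListSets response."""
    root = ET.fromstring(list_sets_xml)
    return [el.text.strip() for el in root.iter(f"{OAI_NS}setSpec") if el.text]


def list_records_url(base_url, set_spec=None, metadata_prefix="oai_dc"):
    """Build a ListRecords URL, optionally restricted to one set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urllib.parse.urlencode(params)


base = "https://archivo.dipusevilla.es/;oai"
print(list_records_url(base))                                   # whole repository
print(list_records_url(base, set_specs(SAMPLE_LIST_SETS)[0]))   # one set only
```

Requesting both URLs and comparing response times would show whether the unrestricted harvest fails simply because it is much larger than a single set.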

Dan Gillean

Sep 21, 2023, 2:13:24 PM
to ica-ato...@googlegroups.com
I'm sorry Isabel, this is a bit beyond the kind of support we can offer via the forum. 

I am not familiar with this particular application, but if I put "Nombre del conjunto" into Google Translate, it returns "set name", so... it seems like this application expects sets to be used?

As you have found, AtoM does have a virtual set to return top-level collection records. At this time, we don't support any other sets. 

Regards, 

Dan Gillean, MAS, MLIS
