Hi Dataverse users,
I am trying to harvest metadata from Zenodo to Dataverse, using a harvesting client.
I have some failures and I am now trying to understand what goes wrong.
Settings: metadata format = 'oai_dc', archive type = 'Generic OAI archive'. I don't know whether these are the best choices, as I haven't tried anything else so far; I took them from another post.
1) <message>Exception processing getRecord(), oaiUrl=https://zenodo.org/oai2d, identifier=oai:zenodo.org:2549479, edu.harvard.iq.dataverse.api.imports.ImportException, Failed to import harvested dataset: class edu.harvard.iq.dataverse.util.json.ControlledVocabularyException (Value 'eng' does not exist in type 'language')</message>
I think this exception occurs because Zenodo does not control the values of the 'language' field and accepts free text. The suggested text is e.g. 'eng'. In fact, if one types 'eng' and waits long enough, a drop-down menu eventually appears where one can select "English", but a plain "eng" is also accepted. If the value is selected from the drop-down, the import into Dataverse runs smoothly. So I think the only way to avoid this is to correct the metadata in Zenodo before importing into Dataverse (where the 'language' value is controlled). Any other ideas?
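To find the affected records before harvesting, one could scan the oai_dc metadata for 'language' values that the controlled vocabulary would reject. This is only a minimal sketch: the function name `rejected_languages` and the `allowed` set are my own assumptions, and the real list of accepted names has to come from the Dataverse installation.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace used inside oai_dc records.
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def rejected_languages(record_xml, allowed):
    """Return the dc:language values of one record that are not in `allowed`.

    `allowed` is an assumption: a set of names taken from the controlled
    'language' vocabulary of the target Dataverse installation.
    """
    root = ET.fromstring(record_xml)
    rejected = []
    for elem in root.iter(DC_NS + "language"):
        value = (elem.text or "").strip()
        if value and value not in allowed:
            rejected.append(value)
    return rejected


# Hand-written sample record for illustration, not real Zenodo output.
sample = (
    '<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>Example</dc:title>"
    "<dc:language>eng</dc:language>"
    "</oai_dc:dc>"
)
print(rejected_languages(sample, {"English", "Italian"}))
```

Records flagged this way are the ones whose 'language' value would need correcting in Zenodo first.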
2) for some records I got these other two errors:
Error calling GetRecord - GetRecord request failed. HTTP error code 502
Error calling GetRecord - GetRecord request failed. HTTP error code 504
Do you think these could be caused by time-outs? (Zenodo seems very slow to reply.)
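To test whether the 502/504 responses are transient, one could request the same record outside Dataverse and retry with a growing delay. This is a rough sketch, not a change to the Dataverse harvester: `get_record` and `fetch_with_retry` are hypothetical helper names, and the number of attempts and the delays are arbitrary assumptions.

```python
import time
import urllib.error
import urllib.parse
import urllib.request


def get_record(base_url, identifier, prefix="oai_dc", timeout=60):
    """Send one OAI-PMH GetRecord request and return the raw response bytes."""
    query = urllib.parse.urlencode(
        {"verb": "GetRecord", "identifier": identifier, "metadataPrefix": prefix}
    )
    with urllib.request.urlopen(f"{base_url}?{query}", timeout=timeout) as resp:
        return resp.read()


def fetch_with_retry(fetch, attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fetch(); on HTTP 502/503/504 wait and retry, doubling the delay."""
    for attempt in range(attempts):
        try:
            return fetch()
        except urllib.error.HTTPError as err:
            # Re-raise non-gateway errors, and the last failure once retries run out.
            if err.code not in (502, 503, 504) or attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

For example: `fetch_with_retry(lambda: get_record("https://zenodo.org/oai2d", "oai:zenodo.org:2549479"))`. If a retry succeeds after a few seconds, the errors are probably temporary gateway problems on the Zenodo side rather than a problem with the record itself.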
3) When setting up the harvesting client, sometimes the list of available sets is completely empty, and other times it contains only part of the OAI sets (i.e. Zenodo communities); in the latter case a warning is displayed, saying that not all sets have been retrieved because of time-out problems. Do you think these issues can be solved, e.g. by changing a setting?
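To see how many sets the endpoint really exposes, one could collect them outside Dataverse. OAI-PMH allows a ListSets response to be split into pages linked by a resumptionToken; whether Zenodo pages its list this way is an assumption on my side. In this sketch `fetch_page` is a hypothetical callable that takes a token (or None for the first page) and returns the response XML as text.

```python
import xml.etree.ElementTree as ET

# Namespace of the OAI-PMH response envelope.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def parse_list_sets(xml_text):
    """Return (setSpec values, resumption token or None) for one ListSets page."""
    root = ET.fromstring(xml_text)
    specs = [elem.text for elem in root.iter(OAI_NS + "setSpec")]
    token_elem = root.find(f".//{OAI_NS}resumptionToken")
    token = (token_elem.text or "").strip() if token_elem is not None else ""
    # A missing or empty token means this was the last page.
    return specs, token or None


def list_all_sets(fetch_page):
    """Follow resumption tokens until the list of sets is complete."""
    all_specs, token = [], None
    while True:
        specs, token = parse_list_sets(fetch_page(token))
        all_specs.extend(specs)
        if token is None:
            return all_specs
```

If the full list is very long, typing the set name directly into the harvesting client might be an alternative to picking it from the list, assuming the client accepts it.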
Thanks for the help!
Best wishes,
IIT Dataverse (Istituto Italiano di Tecnologia)