[OAI-PMH] How to implement the datestamp in the record header for selective harvesting with until


Oki Utamura

Jun 28, 2023, 6:05:10 AM
to oai...@googlegroups.com

Hi,

 

I’m looking for clarification on how to implement the header datestamp on the server side to support selective harvesting with the until parameter in list requests such as ListIdentifiers or ListRecords.

 

The question is whether the returned record datestamp is static or must be generated dynamically based on the incoming request.

 

View one:

  • The datestamp of a record is static; it’s the last-update date of the record – unless the record has remained unchanged since its creation date.
  • In the latter case, the datestamp is taken from the creation date.

 

View two:

  • The content of the returned datestamp depends on the incoming request:
    • If the incoming list-request uses the from parameter, the datestamp is the last update date (or the creation date if there is no last update date).
    • If the incoming list-request uses the until parameter, the datestamp is the creation date.

 

Use case:

  • A user wants to harvest all records that were created or modified up to a certain point in time, e.g., until 1 June.
  • If the server implements view one, a record that was created on 1 April but updated on 1 August would not be returned.
  • If the server implements view two, that record would be returned.
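To make the use case concrete, here is a minimal Python sketch of how the two views would treat that record. The Record class, the function names, and the year 2023 are assumptions made for illustration only; they are not taken from the spec or from any existing implementation. The sketch also assumes the matching is done against the returned datestamp and that the until bound is inclusive.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Record:  # hypothetical record model, for illustration only
    created: date
    updated: Optional[date] = None  # None if never modified


def datestamp_view_one(rec: Record) -> date:
    """View one: static datestamp (last update, else creation date)."""
    return rec.updated or rec.created


def datestamp_view_two(rec: Record, uses_until: bool) -> date:
    """View two: the datestamp depends on the request parameter."""
    if uses_until:
        return rec.created
    return rec.updated or rec.created


def matches_until(datestamp: date, until: date) -> bool:
    # Assumption: the until bound is inclusive.
    return datestamp <= until


# The record from the use case: created 1 April, updated 1 August.
rec = Record(created=date(2023, 4, 1), updated=date(2023, 8, 1))
until = date(2023, 6, 1)

view_one_hit = matches_until(datestamp_view_one(rec), until)
view_two_hit = matches_until(datestamp_view_two(rec, uses_until=True), until)
print(view_one_hit, view_two_hit)  # False True
```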

 

Both views have some support in the specs/implementation guidelines.

 

View one:

 

View two:

 

I would appreciate your thoughts on this. Many thanks!

 

Kind regards,

Oki

 

 

 

Oki Utamura (he/his)
Functional Application Manager
Department Digitization and Document Processing
Team KBGA
E oki.u...@kb.nl

 

 

John Flatness

Jun 29, 2023, 2:43:37 AM
to oai...@googlegroups.com
The relevant parts of the spec are two passages in section 2.7.1:

"Every header returned by the GetRecord, ListRecords or ListIdentifiers requests contains a datestamp, which reflects the most recent date and time of the creation, modification, or deletion according to the rules defined above."

I can see that there's some ambiguity with the "or" here, but taken together with this earlier:

"A repository must update the datestamp of a record if a change occurs, the result of which would be a change to the metadata part of the XML-encoding of the record. Such changes include, but are not limited to, changes to the metadata of the record, changes to the metadata format of the record, introduction of a new metadata format, termination of support for a metadata format, etc."

To me the only correct reading of this is that the header datestamp is always the most recent of the creation and modification dates (with an explicit rule elsewhere that, for a deleted record in a repository that tracks deletions, the deletion date takes precedence). In other words, your "view one".

I think the main mismatch here with what you've written is that the spec does not mandate that the selective harvesting match happens against the header datestamp. In your use case of an "until" request with a date of 1 June, a record created on 1 April but last updated on 1 August is required by the spec to be returned, as it was created within the boundaries specified in the request. It's just that the spec also says that the datestamp in the response will be 1 August. The creation date still remains relevant for matching even when it's not displayed.
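As a rough Python sketch of the reading described above: matching considers both the creation and the modification date, while the header always shows the most recent one. The Record class and the function names are hypothetical, deleted records are left out, and both bounds are assumed to be inclusive; this illustrates the interpretation in this reply, not a normative implementation.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Record:  # hypothetical record model, for illustration only
    created: date
    updated: Optional[date] = None  # None if never modified


def header_datestamp(rec: Record) -> date:
    """Datestamp shown in the header: the most recent change."""
    return max(rec.created, rec.updated) if rec.updated else rec.created


def in_range(d: date, from_: Optional[date], until: Optional[date]) -> bool:
    # Assumption: both bounds are inclusive; a missing bound is open-ended.
    return (from_ is None or d >= from_) and (until is None or d <= until)


def matches(rec: Record, from_: Optional[date] = None,
            until: Optional[date] = None) -> bool:
    """True if the record was created or modified within the range."""
    events = [rec.created] + ([rec.updated] if rec.updated else [])
    return any(in_range(d, from_, until) for d in events)


# Created 1 April, last updated 1 August, harvested with until = 1 June.
rec = Record(created=date(2023, 4, 1), updated=date(2023, 8, 1))
until = date(2023, 6, 1)
print(matches(rec, until=until))  # True
print(header_datestamp(rec))      # 2023-08-01
```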

-John Flatness