Re: [DigitalNZ] Digest for digitalnz@googlegroups.com - 2 updates in 1 topic

Conal Tuohy

Aug 24, 2014, 4:33:38 PM
to digi...@googlegroups.com
Hi Gordon - Long time no see!

I agree with your reasoning; in fact I'd already come to the same conclusion. With Trove Australia, because the OCR'd text can be corrected by humans, the newspaper articles often do change, and hence Trove provides a "last-corrected" date. But with Papers Past that's not going to happen because the text remains uncorrected, so the syndication_date will in effect be the same as the last_updated date.

I am still working on the OAI-PMH provider for Papers Past, but it shouldn't be too much longer.

I agree with Chris that it would be good to get both these dates surfaced in the API, as then it would be possible to do OAI-PMH and similar "pipeline" processing of DigitalNZ data generally, not just of static resources like the newspaper articles.

It would be good to be able to not just view and sort by those date fields, but also constrain the search results by them. At the moment I am having to make multiple HTTP requests to find a particular (re)starting position in a set of search results.
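To illustrate why finding a restart position costs several requests, here is a minimal Python sketch of a binary search over a result set. It assumes the results are sorted ascending by date and that each probe of a position is one HTTP request. The names find_restart_position and date_at are hypothetical, the in-memory list stands in for the API, and this is not necessarily how the Papers Past harvester does it.

```python
from typing import Callable, List


def find_restart_position(total: int,
                          date_at: Callable[[int], str],
                          from_date: str) -> int:
    """Return the index of the first record dated on or after from_date.

    Assumes records are sorted ascending by date and that all dates are
    UTC strings in the same format, so string comparison matches
    chronological order. Returns total if every record is older.
    """
    lo, hi = 0, total
    while lo < hi:
        mid = (lo + hi) // 2
        if date_at(mid) < from_date:
            lo = mid + 1   # record is too old, look further along
        else:
            hi = mid       # record qualifies, but an earlier one might too
    return lo


# Hypothetical stand-in for the API: each call represents one HTTP request.
fake_dates: List[str] = [
    "2014-01-05T00:00:00Z",
    "2014-03-10T00:00:00Z",
    "2014-03-10T00:00:00Z",
    "2014-08-22T23:09:00Z",
]
probes: List[int] = []


def fake_date_at(index: int) -> str:
    probes.append(index)
    return fake_dates[index]


position = find_restart_position(len(fake_dates), fake_date_at,
                                 "2014-03-10T00:00:00Z")
print(position, len(probes))
```

The number of probes grows with the logarithm of the result set size. If the API accepted a date constraint, the same starting point could presumably be reached with a single request.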

Incidentally, I would also have preferred to get date results in UTC rather than in local NZ time ("+12"). Date comparisons become trivial if all dates are expressed in the One True Time Zone. This is how OAI-PMH does it, and many people consider it good practice.
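As a small illustration of the UTC point, the Python sketch below normalises a timestamp carrying a local offset to the "Z" form that OAI-PMH uses. It assumes the timestamp arrives as an ISO 8601 string with an explicit offset; the exact format the DigitalNZ API returns is not stated in this thread, and the function name to_utc_zulu is made up for the example.

```python
from datetime import datetime, timezone


def to_utc_zulu(local_stamp: str) -> str:
    """Convert an ISO 8601 timestamp with an offset to UTC 'Z' notation."""
    parsed = datetime.fromisoformat(local_stamp)
    if parsed.tzinfo is None:
        # Without an offset the instant is ambiguous, so refuse to guess.
        raise ValueError("timestamp has no UTC offset: " + local_stamp)
    return parsed.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Assumed input format: NZ standard time is UTC+12, daylight time is UTC+13.
winter = to_utc_zulu("2014-08-23T11:09:00+12:00")
summer = to_utc_zulu("2014-01-01T00:30:00+13:00")
print(winter)
print(summer)

# Once everything is in UTC with a fixed format, plain string comparison
# orders the timestamps chronologically.
print(summer < winter)
```

With mixed offsets, the same comparison would first need every value parsed and normalised, which is the extra work the UTC convention avoids.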

Cheers

Con



On 23 August 2014 14:47, <digi...@googlegroups.com> wrote:
Gordon Paynter <gordon....@gmail.com>: Aug 23 11:09AM +1200

Hi Chris, Conal:
 
This again!
https://groups.google.com/forum/#!topic/digitalnz/bZNtCMob6_o
 
By way of a history lesson, syndication date was originally designed to
mean "last updated date" (specifically to support "what's changed" and
OAI-PMH) but at some point this was changed to function as "record created
date" as Chris describes (and as is described in the current "v1 & v2"
documentation).
 
As the designer I find this mildly annoying but I have to concede that when
it stopped working for a few months nobody seemed to notice but me, and I
can see uses for the "record created date" approach also (especially since
the bulk of DigitalNZ material does not tend to get updated).
 
In the strictest sense, I don't see how the DigitalNZ API will support
OAI-PMH in its current (v2 or v3) form unless you harvest every DigitalNZ
record regularly.
 
But I don't think that you need to worry because historic newspapers are
very rarely changed after being made available online. They also tend to be
made available online in very large batches, which I assume will have the
same or similar syndication date in DigitalNZ. So you might harvest nothing
for months, then suddenly get 100,000 new records overnight.
 
Gordon
 
 
 
Chris McDowall <chris.m...@gmail.com>: Aug 23 11:47AM +1200

Hi Gordon,
 
Note: I don't work for DigitalNZ anymore but I still use the API.
 
If memory serves, all DNZ records store a last_updated timestamp in the
database. It would be great to get that returned & sortable through the
API.
 
Chris
 
 
On Sat, Aug 23, 2014 at 11:09 AM, Gordon Paynter <gordon....@gmail.com>
wrote:
 
You received this digest because you're subscribed to updates for this group. You can change your settings on the group membership page.
To unsubscribe from this group and stop receiving emails from it send an email to digitalnz+...@googlegroups.com.

Gordon Paynter

Aug 24, 2014, 9:30:07 PM
to digi...@googlegroups.com
Hi Conal, hope you're doing well.

The other great thing about UTC is that you get to call it "Zulu Time".

Gordon

