newspaper articles: relatedItem vs. hasMember

Eben English

Jan 27, 2017, 2:27:02 PM

to PCDM

Hello again!

At a recent meeting of the Hydra Newspapers Interest Group, we were discussing several proposed PCDM-based models for newspaper content, including:

Nat'l Library of Wales: https://wiki.duraspace.org/pages/viewpage.action?pageId=68066807

U. of Maryland: https://wiki.duraspace.org/download/attachments/77447979/maryland2016hyrdraconnect.pdf

These models are quite similar, which I suppose is a testament to the grok-ability of the PCDM documentation (yay!). However, there is one difference that we felt we could use some guidance from the community on.

One model uses the "hasMember" predicate to connect newspaper issue objects with article objects that appear within the issue, while others have proposed using "hasRelatedItem" for the same purpose.

The documentation (http://pcdm.org/models):

- hasMember: "Links to a subsidiary Object or Collection. Typically used to link to component parts, such as a book linking to a page."

- hasRelatedItem: "Links to a related Object that is not a component part, such as an object representing a donor agreement or policies that govern the resource."

We're curious to get other opinions on this dilemma. Which predicate do people feel would be more appropriate for this use case? Would there be any difference if the article object is modeled as pcdm:Object vs. pcdm:Range?
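As a rough illustration of what is at stake, here is a minimal sketch of the two options using plain Python tuples as triples. The `ex:` identifiers and helper names are hypothetical, and the predicate name `hasRelatedItem` is copied from this thread and should be checked against the published ontology before use.

```python
# Minimal sketch: triples are plain (subject, predicate, object) tuples.
# All ex: identifiers and helper names are hypothetical.
PCDM = "http://pcdm.org/models#"


def triples_for(issue, articles, predicate):
    """Link an issue to each article with the given PCDM predicate."""
    return [(issue, PCDM + predicate, article) for article in articles]


def members_of(graph, subject):
    """Return what a consumer finds by following pcdm:hasMember only."""
    return [o for s, p, o in graph if s == subject and p == PCDM + "hasMember"]


issue = "ex:issue1"
articles = ["ex:article1", "ex:article2"]

# Option A: articles are members of the issue.
option_a = triples_for(issue, articles, "hasMember")
# Option B: articles are merely related to the issue
# (predicate name as written in this thread; an assumption).
option_b = triples_for(issue, articles, "hasRelatedItem")

print(members_of(option_a, issue))  # articles are found
print(members_of(option_b, issue))  # articles are not found via membership
```

Under option B, any client that gathers an issue's contents by following membership alone would need an extra lookup to find the articles.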

Thanks again,

Eben

Boston Public Library

Peter Binkley

Jan 27, 2017, 4:08:14 PM

to pc...@googlegroups.com

It seems to me that hasMember is the more logical choice. If you read the whole issue, you've read all the articles but you haven't necessarily read the subscription terms; if you read the issue except for the articles, you've read very little. If I wanted to grab an issue and its contents I'd reach for hasMember and I'd be disappointed if I just got mastheads. So I think to use hasRelatedItem is to leave the issue as an unnecessarily sparse container.

Peter

Peter Binkley, Ph.D., MLIS / Digital Initiatives Technology Librarian / peter....@ualberta.ca

2-10K Cameron Library / University of Alberta / Edmonton, Alberta / Canada T6G 2J8
phone 780-492-3743 / fax 780-492-9243

--
You received this message because you are subscribed to the Google Groups "PCDM" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pcdm+unsubscribe@googlegroups.com.
To post to this group, send email to pc...@googlegroups.com.
Visit this group at https://groups.google.com/group/pcdm.
For more options, visit https://groups.google.com/d/optout.

west...@umd.edu

Jan 28, 2017, 10:48:33 AM

to PCDM

It would be great to build some consensus on this. Thanks for posting, Eben.

I can provide a bit more context on our thinking. In our repository, we draw a distinction between item-level objects and component-level objects. Items are the units that are managed independently. Thus, in our batch loading application, the item is wrapped in a transaction such that if some piece of the item cannot be loaded, the whole transaction will fail and we will not end up with orphaned components.

In this way of thinking, articles are clearly components of the newspaper issue. They are subsidiary, and should not exist in the repository without the issue and pages in/on which they appear. In another sense, however, they are not components in the same sense that page images are components. When we grab the members of an issue to create a IIIF manifest, our application grabs the page images, not the articles. Page images have a clear sequence that articles do not, and articles also cannot be understood as members of pages, because a single article can appear on multiple pages.

This seems less clear than with a book of essays, where the essays appear on a sequential range of pages, and the volume can be understood as being composed of a series of essays in order. In that case, you could browse the volume by going through the essays in order, but with a newspaper the idea of browsing it by going through a series of articles seems wrong. Instead, you look through it by browsing a series of pages.

All this having been said, it is possible that we are overthinking it. Certainly there is *some* sense in which an issue is composed of a set of articles. If we went with the membership relation, we would obviously need to distinguish between member-pages and member-articles, but we have rdf:type assertions that allow us to do that.
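To make that last point concrete, here is a minimal sketch of filtering an issue's members by rdf:type, so that a manifest builder could still pick out only the pages. The `ex:Page` and `ex:Article` classes, the `ex:` identifiers, and the helper name are hypothetical, not taken from any of the models under discussion.

```python
# Minimal sketch: triples are plain (subject, predicate, object) tuples.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
HAS_MEMBER = "http://pcdm.org/models#hasMember"

# Hypothetical class identifiers, for illustration only.
PAGE = "ex:Page"
ARTICLE = "ex:Article"

graph = {
    ("ex:issue1", HAS_MEMBER, "ex:page1"),
    ("ex:issue1", HAS_MEMBER, "ex:page2"),
    ("ex:issue1", HAS_MEMBER, "ex:article1"),
    ("ex:page1", RDF_TYPE, PAGE),
    ("ex:page2", RDF_TYPE, PAGE),
    ("ex:article1", RDF_TYPE, ARTICLE),
}


def members_of_type(graph, parent, rdf_class):
    """Return the members of `parent` that are typed as `rdf_class`, sorted."""
    members = {o for s, p, o in graph if s == parent and p == HAS_MEMBER}
    typed = {s for s, p, o in graph if p == RDF_TYPE and o == rdf_class}
    return sorted(members & typed)


# A manifest builder would ask only for the pages.
print(members_of_type(graph, "ex:issue1", PAGE))
print(members_of_type(graph, "ex:issue1", ARTICLE))
```

The cost of this approach is that every consumer of hasMember has to apply the type filter; an unfiltered walk returns pages and articles mixed together.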

I look forward to hearing what others think about this.

Josh Westgard

University of Maryland

Esmé Cowles

Jan 28, 2017, 11:24:51 AM

to pc...@googlegroups.com

Josh-

Thanks for laying out your thinking. This sounds very similar to the way we are modeling books, chapters, and pages in Plum. You can't create a strict hierarchy of books containing chapters containing pages, because chapters might end and start in the middle of a page, a book might not have chapters at all, and there might be some pages that are between chapters.

So we create a book object and ingest a flat sequence of pages (<book1> pcdm:hasMember <page1>, <page2>, ... <pageN>). We then model the chapter hierarchy separately. We are currently modeling that as a second set of ore:Proxy objects, but we want to use the Range and TopRange classes from the Works extension (http://pcdm.org/works#), which are inspired by IIIF. Mapping to IIIF is another commonality here, thinking about the flat list of members as a Sequence, and the optional hierarchy as a hierarchy of Ranges.
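The shape of that model can be sketched in a few lines of Python: a flat list of member pages, plus a separate tree of ranges that refers to pages in that list. This is only an illustration under assumed names (`Range`, `pages_in`, `uncovered` and the page identifiers are hypothetical), not the Plum implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Range:
    """A node in the optional hierarchy (e.g. a chapter); hypothetical class."""
    label: str
    pages: list = field(default_factory=list)
    subranges: list = field(default_factory=list)


# The flat sequence of members: every page, in reading order.
book_members = ["page1", "page2", "page3", "page4", "page5"]

# The hierarchy is modeled separately and only refers to pages.
toc = Range("Book", subranges=[
    Range("Chapter 1", pages=["page2", "page3"]),
    Range("Chapter 2", pages=["page3", "page4"]),  # page3 holds the chapter break
])


def pages_in(rng):
    """Collect the pages under a range, depth first, without duplicates."""
    seen = []
    for page in rng.pages:
        if page not in seen:
            seen.append(page)
    for sub in rng.subranges:
        for page in pages_in(sub):
            if page not in seen:
                seen.append(page)
    return seen


def uncovered(members, rng):
    """Pages in the flat sequence that belong to no range at all."""
    covered = set(pages_in(rng))
    return [m for m in members if m not in covered]


print(pages_in(toc))
print(uncovered(book_members, toc))
```

Because the two structures are independent, a page can sit in two chapters, or in none, without disturbing the flat sequence used for paging through the book.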

-Esmé

Eben English

Feb 27, 2017, 4:59:40 PM

to PCDM

To pick up this thread once again...

At the last meeting of the Hydra Newspapers Interest Group we got a bit deeper into discussing the relationship between article objects and page-level objects, and some questions came up:

1. Can a Range have member Objects?

http://pcdm.org/2016/02/16/works#Range

The description says "Has member FileSets representing the physical parts of the Work are part of the section (e.g., which pages are in a chapter)." But it seems more likely that a Range might have member Objects representing the pages (and that these page-Objects would then have member FileSets)? I'm assuming that a Range can have member Objects, since Range is just a subclass of Object.

I suppose that FileSet is just a subclass of Object as well, and so perhaps there's not that much of a difference? But presumably the page object will have some descriptive metadata (text direction, page/leaf number, etc.) that might be less "appropriate" for a FileSet? The concept of a "page" also seems to have meaning beyond the FileSet definition: "A group of related Files, typically a single master File and its derivatives."

2. Ordering without proxies?

In the case of a newspaper article spanning 1..n pages, let's say that an implementer was willing to assert that there was ONE AND ONLY ONE order of the page objects that are members of an article object. You would still need to use proxies for ordering, however, since a page object would likely be a member of multiple article objects (since newspaper pages tend to have multiple articles on them), and without proxies there would be no telling which iana:next and iana:prev relationships correspond to the article you were trying to display. Correct?
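A small sketch of the ambiguity, assuming two hypothetical articles that both start on page 1 and continue on different pages. Triples are plain tuples, the prefixed names are written as bare strings, and the proxy URIs and helper names are invented for illustration; the proxy pattern is simplified (for example, no first/last links on the article).

```python
NEXT, PROXY_FOR, PROXY_IN = "iana:next", "ore:proxyFor", "ore:proxyIn"


def direct_order(pages):
    """Order pages by putting iana:next directly on the page objects."""
    return [(a, NEXT, b) for a, b in zip(pages, pages[1:])]


def proxy_order(article, pages):
    """Order pages through one proxy per (article, page) pairing."""
    proxies = [f"{article}/proxy{i}" for i in range(len(pages))]
    triples = []
    for proxy, page in zip(proxies, pages):
        triples += [(proxy, PROXY_FOR, page), (proxy, PROXY_IN, article)]
    triples += [(a, NEXT, b) for a, b in zip(proxies, proxies[1:])]
    return triples


def pages_in_order(graph, article):
    """Walk an article's proxy chain; assumes exactly one non-empty chain."""
    mine = {s for s, p, o in graph if p == PROXY_IN and o == article}
    page_of = {s: o for s, p, o in graph if p == PROXY_FOR and s in mine}
    next_of = {s: o for s, p, o in graph if p == NEXT and s in mine}
    (current,) = mine - set(next_of.values())  # the head has no predecessor
    ordered = []
    while current is not None:
        ordered.append(page_of[current])
        current = next_of.get(current)
    return ordered


article_a = ["ex:page1", "ex:page3"]  # front page story continued on page 3
article_b = ["ex:page1", "ex:page2"]

# Without proxies, page1 ends up with two iana:next targets.
direct = direct_order(article_a) + direct_order(article_b)
ambiguous = sorted(o for s, p, o in direct if s == "ex:page1" and p == NEXT)
print(ambiguous)

# With proxies, each article has its own unambiguous chain.
graph = proxy_order("ex:articleA", article_a) + proxy_order("ex:articleB", article_b)
print(pages_in_order(graph, "ex:articleA"))
print(pages_in_order(graph, "ex:articleB"))
```

In the direct version nothing in the graph says which iana:next link belongs to which article, which is the problem the question describes.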