URLs as URIs for digital objects


Doron Shalvi

Nov 8, 2017, 9:06:46 PM
to Fedora Community

Hello everyone,

I’m passing along a writeup regarding the use of URLs as URIs for digital objects in Fedora – see below.  This writeup was prepared by Nancy Fallgren, the metadata librarian for our digital repository here at the National Library of Medicine, and arose out of recent discussions in the DC Fedora User’s Group.

I’m not a metadata librarian, nor do I play one on TV.  However, I do share Nancy’s concerns regarding the ability to create meaningful RDF statements about objects in Fedora if the objects are identified by URLs that are the physical location of the object in Fedora, rather than by URIs for the objects themselves.  More specifically, it seems to us that it may be difficult to use Fedora as an LDP server for external entities (<Digital object URI> <has author> <Benjamin Bell>), although there are certainly internal use cases as well (<Parent URL> <has child> <Child URL>).

-Doron Shalvi, NLM

=========================================================

In the DC Fedora Users Group, we have seen several implementation examples that use a URL as the URI for a digital object.  Yes, a URL is a kind of URI; specifically, it is a pointer to a unique web location.  However, there is a difference between the location of a digital object on the web and the digital object itself.  In RDF this distinction matters because the goal is to publish true statements on the Web for use and re-use.  For example, if my digital object is the book A System of Surgery by Benjamin Bell, I do not want to inaccurately state in RDF that
<Web location URL> <has title> <A System of Surgery> 
<Web location URL> <has author> <Benjamin Bell>

Rather I want to accurately state in RDF
<Digital object URI> <has title> <A System of Surgery>
<Digital object URI> <has author> <Benjamin Bell>
<Digital object URI> <has web location> <URL>

The latter are true and accurate statements.  The web location of the digital object is a property of the digital object and not the digital object itself.  
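
The distinction above can be pictured with a minimal sketch in Python, using plain tuples rather than a real RDF library. The object URI, the repository URLs, and the predicate labels are hypothetical placeholders for illustration, not identifiers that NLM or Fedora actually uses.

```python
# Hypothetical identifiers, for illustration only.
OBJECT_URI = "https://example.org/id/book-0001"            # names the digital object
WEB_LOCATION = "https://repo.example.org/rest/ab/cd/0001"  # where it currently lives

# Triples as (subject, predicate, object) tuples.
triples = [
    (OBJECT_URI, "has title", "A System of Surgery"),
    (OBJECT_URI, "has author", "Benjamin Bell"),
    (OBJECT_URI, "has web location", WEB_LOCATION),
]


def subjects(graph):
    """Return the set of subjects used in a list of triples."""
    return {s for s, _, _ in graph}


def relocate(graph, old_url, new_url):
    """Return a copy of the graph with one web location swapped for another."""
    return [(s, p, new_url if o == old_url else o) for s, p, o in graph]


# Moving the object changes exactly one triple; the subject stays the same.
NEW_LOCATION = "https://repo2.example.org/objects/0001"
moved = relocate(triples, WEB_LOCATION, NEW_LOCATION)
```

Because the web location is only a property value, relocating the object leaves the title and author statements untouched and still attached to the same subject.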

The Europeana Data Model acknowledges this distinction with the properties edm:hasView and edm:isShownBy (both representing URLs where the digital object can be viewed) and edm:aggregatedCHO (representing a URI for all the pieces that together comprise the entire digital object).  See https://pro.europeana.eu/files/Europeana_Professional/Share_your_data/Technical_requirements/EDM_Documentation/EDM_Mapping_Guidelines_v2.4_102017.pdf at p. 11 for an example.

The W3C also makes this distinction in its discussion of Cool URIs https://www.w3.org/TR/cooluris/ .  Building on the W3C Cool URI documentation, the library cataloging community has acknowledged the need for URIs to represent Things in preparing its MARC format data for conversion to RDF.  MARC proposal 2017-08, which was approved and accepted, lays out the need to distinguish URIs for Things from URIs for descriptions of Things (e.g., a person from an authority record about that person) in MARC records and also includes a coherent explanation of the use of URIs and particularly Real World Object URIs in RDF.  https://www.loc.gov/marc/mac/2017/2017-08.html 

At NLM, we are concerned about the use of URLs as URIs for digital objects themselves in Fedora 4 implementations and would like to open a discussion with the community on the topic.

-Nancy Fallgren, NLM

David Chandek-Stark

Nov 9, 2017, 9:41:24 AM
to fedora-c...@googlegroups.com
I agree with Nancy Fallgren. Oddly enough Fedora 3 (and earlier?)
more or less got this right, at least conceptually, with the
registration and use of the info:fedora URI namespace. Fedora 4 it
seems to me has gone too far in obscuring the distinction between the
representation of a digital object (and indeed an entire repository)
and the persistence of its attributes. While everyone agrees RDF is
useful and important on many levels, it's far less clear to me at
least that it is an appropriate general purpose persistence mechanism
to manage content at scale.

--David
> --
> You received this message because you are subscribed to the Google Groups
> "Fedora Community" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to fedora-communi...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.



--
David Chandek-Stark
dchand...@gmail.com

Daniel Bernstein

Nov 9, 2017, 10:29:11 AM
to fedora-c...@googlegroups.com
Nancy, Doron, and David: thank you for sharing these valuable perspectives.
If any of you have the time, it might be fruitful to have you on the Fedora Tech call this morning to discuss how we might address some of these concerns in Fedora.  (It's happening at 11 Eastern today: https://wiki.duraspace.org/display/FF/2017-11-09+-+Fedora+Tech+Meeting).

David, in case you can't make it, I'm wondering if you can offer some more explanation of your post, specifically,

> Fedora 4 it
> seems to me has gone too far in obscuring the distinction between the
> representation of a digital object (and indeed an entire repository)
> and the persistence of its attributes.


I'm also interested in more of your thoughts on 
> While everyone agrees RDF is
> useful and important on many levels, it's far less clear to me at
> least that it is an appropriate general purpose persistence mechanism
> to manage content at scale.

************************************
Daniel Bernstein, DuraSpace





Ralf Claussnitzer

Nov 9, 2017, 10:50:39 AM
to fedora-c...@googlegroups.com
> While everyone agrees RDF is
> useful and important on many levels, it's far less clear to me at
> least that it is an appropriate general purpose persistence mechanism
> to manage content at scale.
I quite agree...

Daniel Bernstein

Nov 9, 2017, 10:53:51 AM
to fedora-c...@googlegroups.com
Ralf:  could you elaborate on why you agree?  I'm trying to deepen my understanding of the issues.  A fuller explanation would be very helpful to me (as well as to others I'm sure).

************************************
Daniel Bernstein, Duraspace





Benjamin Armintor

Nov 9, 2017, 10:58:53 AM
to fedora-c...@googlegroups.com
David,

It's hard for me to see the difference between a Fedora 3 URI and a Fedora 4 URI, excepting that the former requires a "well known" transformation to be retrievable/linkable. I get the sense that you mean something besides httprange-14*, and I think it's what you're referring to as "manag[ing] content at scale" - can you elaborate on what you mean there?

- Ben



Benjamin Armintor

Nov 9, 2017, 10:59:20 AM
to fedora-c...@googlegroups.com
Sorry, forgot to footnote: httpRange14 https://en.wikipedia.org/wiki/HTTPRange-14

David Chandek-Stark

Nov 9, 2017, 11:00:50 AM
to fedora-c...@googlegroups.com
Danny, thanks. I have another commitment at 11. I guess in general
my concerns boil down to this, and forgive me if this is based on an
outdated understanding of the software and/or philosophy:

1) Fedora 4 imposes RDF/LDP as the principal way to interact with the
system and manage resources.
2) Related to 1), the definition of Resource as a "web-addressable"
entity (https://wiki.duraspace.org/display/FEDORA4x/Glossary#Glossary-Resource)
has consequences that may be undesirable.

If 1) is in fact true, I think it creates any number of "artificial"
problems by attempting to shoehorn more or less all interactions with
the repository into the RDF/LDP framework. The RDF representation of
resources, while important *for certain purposes*, is not
*fundamental* in the most general usage scenario. From an
administrative point of view, it seems to make things more complicated
and, most importantly, less efficient. From my point of view, the
persistence layer of the repository, which is represented (mediated)
by the Fedora application must prioritize performance and scalability
-- if it is intended to support the kinds of repositories that many of
our institutions are planning/building/supporting. To be clear,
performance was an issue in Fedora 3 as well, but Fedora 4's adoption
of LDP really seems aimed elsewhere, and does not appear to make sense
when the user-facing repository applications do not expose the Fedora
API (as in the case of Samvera and others).

I have to go ... may be able to say more later.

Thanks,
David

west...@umd.edu

Nov 9, 2017, 12:24:49 PM
to Fedora Community
We have been having discussions that touch on many of these points as well, but from the opposite direction.  Specifically, I have been asking myself and others whether some of the common patterns and paradigms of application development (things like the Object-Relational Mapper, or Model-View-Controller) make sense in an LDP context.

I say this from the perspective of an institution that has taken a "repository-centric" approach to building with Fedora 4 and that does not have an existing application we are trying to migrate (instead we started over).

Josh Westgard
University of Maryland

Daniel Bernstein

Nov 9, 2017, 12:43:10 PM
to fedora-c...@googlegroups.com
Hi David, 

Thanks for your thoughtful responses.  They helped to foster an energetic discussion on the tech call this morning.

Re 1)  I believe that point still stands, but I invite others to weigh in if that is not the case.  You raise a point that is interesting to me around providing alternative ways (other than via RDF/LDP) to interact with the repository.  I agree that high performance and scalability are essential.  I'm curious whether you have anything specific in mind as far as alternative interaction models are concerned.


Re 2) I'm also curious to know more specifically what you see as some of the real or potential undesirable consequences of the web-addressability requirement.

************************************
Daniel Bernstein
Software Engineer, Duraspace
707.874.2045 (office)





Daniel Bernstein

Nov 9, 2017, 12:49:10 PM
to fedora-c...@googlegroups.com
Hi Josh - please elaborate:  from your point of view, which models/paradigms make sense or not within an LDP context, and why?  I'd like to better understand your thinking.

Thanks, 

************************************
Daniel Bernstein, Duraspace



David Chandek-Stark

Nov 9, 2017, 12:50:05 PM
to fedora-c...@googlegroups.com
Ben,

FWIW, one difference between F3 and F4 URIs is that the former are
fixed insofar as they do not depend on the means of access -- which is
to say they are not URLs. That seems valuable and important to me.
(Persistent URIs can be addressed by other means of course, but ...)

W/r/t managing content at scale: imposing RDF/LDP as effectively the
sole interaction model of the repository artificially limits the
critical efficiency of CRUD operations. If we believe, perhaps
rightly, that repository resources are best handled natively in the
form of a graph, why not leverage a technology (graph db?) tuned to
serve that need? If folks want to build an LDP server on top of that,
wonderful, but that seems to me a secondary concern. I get the
conceptual appeal of the LDP model, but making it more or less *the*
API to the repository seems to me overly-opinionated.

Again, apologies if I have misunderstood or misconstrued F4 (or LDP)
in its intentions or actual development.

--David

Ralf Claussnitzer

Nov 9, 2017, 12:54:30 PM
to fedora-c...@googlegroups.com

I'm referring more to the RDF way of representing information than to the (mis)use of URLs in this context. That is probably a totally different thread. Basically, having a fully normalized information model (e.g. only Triples) does not free us from referring to a concise data model and shared understanding whenever representations of resources are exchanged (which REST is all about).

I noticed a couple of things about attempts to store and exchange data in RDF serializations:

  • It is hard to agree on a common vocabulary. Much like agreeing on formats, but somehow harder.
  • Often digital documents have well-known representations (XML), but the underlying information model is not specified that well. The challenge is to uncover this information model to be able to transition to RDF (and back to XML, unfortunately).
  • Fully normalized databases have a significant performance impact on queries.

That's why I saw some people abandoning RDF (and graph databases) and turning to document stores instead. Bigger aggregates are simpler to manage and closer to the data that is currently produced in many contexts (and their use cases!). Fedora 4 is all about LDP, but it supports binaries as well and does not enforce the RDF data representation - which is a reasonable thing to do.

-Ralf


west...@umd.edu

Nov 9, 2017, 1:42:50 PM
to Fedora Community
Hi Danny,

Well, I didn't say I had the answers, only that I asked the questions. It does seem that we should not assume that an ORM is the best way for an application to manage its interactions with a repository that is graph-based, operates over HTTP, and is part of a distributed and loosely-coupled architecture.  I just think it is worth questioning some of these fundamental assumptions about how one builds an app on top of a repository when the repository is an LDP server, not a relational database.

Josh

Daniel Bernstein

Nov 9, 2017, 1:56:33 PM
to fedora-c...@googlegroups.com
Ah okay - got it.  Answers or not,  that clarifies it.   Sounds like what you're talking about touches on some of the performance issues we've seen with ActiveFedora.  
--db

************************************
Daniel Bernstein, Duraspace



Jared Whiklo

Nov 9, 2017, 3:26:02 PM
to Fedora Community
Hi all,

We had a good conversation about this on the Tech Call today and I think I have an understanding of the desire here.

My take is that removing the single subject restriction (SSR) from Fedora would allow Nancy to add triples to the Fedora resource URI (which is not the "real thing" but a web presentation of that "real thing") that assert information about the "real thing". Removing the SSR would in fact allow anyone to add any triples to any resource.
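
To make the restriction concrete, here is a rough sketch in Python of the kind of check a single subject restriction implies. It is not Fedora's actual implementation: the function name is hypothetical, triples are modeled as plain tuples, and it assumes (as a later message in this thread suggests) that hash URIs under the resource's own URL are accepted.

```python
from urllib.parse import urldefrag


def offending_subjects(resource_uri, triples):
    """Return subjects that name neither the resource nor a hash URI under it."""
    return sorted({s for s, _, _ in triples if urldefrag(s).url != resource_uri})


# Hypothetical resource and statements, for illustration only.
resource = "http://my.fedora/resource/painting123"
statements = [
    (resource, "dc:creator", "Fedo Raadmin"),
    (resource + "#RWO", "dc:creator", "Leonardo"),
    ("http://permalink.example.org/painting123", "dc:creator", "Leonardo"),
]

rejected = offending_subjects(resource, statements)
```

Under these assumptions only the externally minted permalink is rejected; removing the restriction would amount to dropping this check altogether.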

Others have expressed a similar desire in the past, so I think that as we work towards the 5.0 release of Fedora perhaps it is time to open this to discussion again.

I will say that I think Fedora is useful as an LDP server and persistence layer for these instance objects. Storing triples is what a triplestore does best, and a solution to the issue right now would be to store these <real world object> <predicate> <information about real world object> triples in a triplestore and provide the "real thing" web presence via a separate layer from Fedora.

We are going to continue to discuss this at next Thursday's meeting.

If you have an interest in the existence of or removal of SSRs in Fedora, I ask that you attend to help move this discussion along.

We will try to avoid the HTTPRange-14 3rd rail.

cheers,
jared

Doron Shalvi

Nov 9, 2017, 4:01:08 PM
to Fedora Community

For those of us not familiar with the history, could someone explain why there is a single subject restriction (SSR) to begin with in Fedora?

Thanks,

Doron

Durbin, Michael (md5wz)

Nov 9, 2017, 4:07:52 PM
to fedora-c...@googlegroups.com
Reviewing IRC logs the simple answer seems to be:

<ajs6f> Fedora is an object repository. Not a triplestore.
<ajs6f> ...
The triples in RDF published by the repo actually represent the properties of objects, not arbitrary claims about the world.


I'm not sure if that helps.

-Mike


David Chandek-Stark

Nov 9, 2017, 4:12:18 PM
to fedora-c...@googlegroups.com
Jared,

I don't read Nancy's concern that way, if I understand what you're
saying. She is simply pointing out that the subject of the RDF
statements is the Fedora resource URL and not an "independent" object
URI, and that there is a statement lacking which would show the
relationship of the Fedora resource URL (as a representation) to the
object URI. FWIW Fedora 3 RELS-EXT datastreams do use (in RDF/XML) the
object URI <info:fedora/{PID}>, not the Fedora resource URL
<http[s]://{host}/fedora/object/{PID}>. ActiveFedora RDF datastreams
on Fedora 3 also serialize using the object URI as subject.
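
The relationship between the two identifier forms can be sketched as a small Python function. The URL template is copied from the message above; the function name and the host are hypothetical, and real deployments may use different paths.

```python
INFO_PREFIX = "info:fedora/"


def info_uri_to_url(info_uri, host, scheme="https"):
    """Map an info:fedora/{PID} object URI to a resource URL on a given host."""
    if not info_uri.startswith(INFO_PREFIX):
        raise ValueError(f"not an info:fedora URI: {info_uri!r}")
    pid = info_uri[len(INFO_PREFIX):]
    # Template taken from the post: http[s]://{host}/fedora/object/{PID}
    return f"{scheme}://{host}/fedora/object/{pid}"


# The object URI stays fixed; only the derived URL depends on host and scheme.
url_a = info_uri_to_url("info:fedora/painting:123", "repo.example.org")
url_b = info_uri_to_url("info:fedora/painting:123", "localhost:8080", scheme="http")
```

The same object URI yields different URLs for different hosts, which is the sense in which the info:fedora identifier does not depend on the means of access.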

--David

Benjamin Armintor

Nov 9, 2017, 4:20:11 PM
to fedora-c...@googlegroups.com
David,

The ability to use a different implementation is part of the impetus behind the API specification effort. But it might be worth observing that the LDP/Fedora API is an interoperability measure: If you're using the MODE implementation, you're using a JCR backend. If you wanted to leverage the particular strengths of that backend, it's available to you - but it will foreground questions about what a non-LDP API would be, and what the risks of approaching your datastore that way would be. Likewise, you could propose non-LDP HTTP APIs to be implemented alongside LDP - but I would observe the existing API spec effort suggests that defining these things and the process for maintaining them is no small task. For what it's worth, there are also prospective implementations built on graphstores.

- Ben





Jared Whiklo

Nov 9, 2017, 4:55:06 PM
to fedora-c...@googlegroups.com
I see what you are saying David, but while Fedora 3 did give you an
<info:fedora/PID> URI, it was not dereferenceable. It was great for
its time, but now it seems to fall short of some of the needs of the
semantic web.

(If I am totally wrong here, you can let me know. Some of this stuff
makes my ears ring)

To make Fedora 4 do what Fedora 3 did (create and use non-dereferenceable
URIs) seems wrong.

If I understand the LDP spec and Cool URIs document then it seems to me
that you should be minting your own real world object URIs (RWOIDs) as URLs.

When a client requests one of these RWOIDs, you would redirect it to the
Fedora URL/URI (which is an instance or representation), which provides
information about itself (as the representation) as well as about the RWOID.

Fedora does not allow you to provide that additional information about
the RWOID. It only provides information about itself.

My understanding is that Nancy and Doron are minting their own URLs for
RWOIDs. They are also redirecting requests to that URL to the Fedora URL.

What they can't do is return RDF with the RWOID as the subject and my
understanding is that is what they want to do.
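
The redirect arrangement described above might look roughly like this in Python. It is a sketch, not NLM's setup: the URLs and the lookup table are hypothetical, and the choice of 303 See Other follows the Cool URIs recommendation for identifiers of real-world objects.

```python
# Hypothetical mapping from minted RWO URLs to Fedora resource URLs.
REDIRECTS = {
    "https://permalink.example.org/id/101": "https://fedora.example.org/rest/objects/101",
}


def resolve(rwo_url):
    """Return (status, headers) for a request to a minted RWO URL."""
    target = REDIRECTS.get(rwo_url)
    if target is None:
        return 404, {}
    # 303 See Other: the target describes the thing; it is not the thing itself.
    return 303, {"Location": target}


status, headers = resolve("https://permalink.example.org/id/101")
```

The redirect gets a client to the Fedora resource, but the RDF returned there still has the Fedora URL as its subject, which is the gap described above.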

cheers,
jared
Jared Whiklo
jwh...@gmail.com
--------------------------------------------------
It ain't the jeans that make your butt look fat.


west...@umd.edu

Nov 9, 2017, 5:17:58 PM
to Fedora Community
And just to add to what Jared said, one of the requirements Nancy mentioned in the call today was that the RWOID URIs also be dereferenceable http URIs. This is in keeping with the "Cool URIs" rules put forth by Tim BL.  So re-implementing the Fedora 3 model wouldn't meet today's requirements either.

David Chandek-Stark

Nov 9, 2017, 6:10:53 PM
to fedora-c...@googlegroups.com
Jared,

> What they can't do is return RDF with the RWOID as the subject and my
> understanding is that is what they want to do.

I agree on that. But I don't see that persisting additional redundant
triples (if that's what you are suggesting) is the way to solve that
problem. Can we instead come up with something like "render resource
X as a representation of <URI>"? That seems to be the essential
assertion here: <Fedora resource X> is a representation of <RWO>.
Whether or not that fact - that the Fedora resource is a
representation of another resource - should be *persisted* to the
repository is an open question.

Thanks,
David

Aaron Birkland

Nov 9, 2017, 7:56:50 PM
to fedora-c...@googlegroups.com

One can look to OAI-ORE for an opinionated perspective on using the architecture of the web to describe {concepts, RWO, whatever}. 

 

The key idea there is that somebody publishes a _document_ on the web which contains an authoritative description of a _thing_.  In the case of ORE, that _thing_ is an aggregation of resources as a conceptual entity, and the _document_ is called a resource map.  But you can imagine the pattern applied to RWOs.

 

Anyway, ORE stipulates that:

  • The authoritative _document_ is distinct from the _thing_ being authoritatively described within it.
  • The URI of the _thing_ is distinct from the URI of the _document_, so the document can contain statements that unambiguously pertain to either one.
  • The URI of the _thing_ is a protocol-based URL that resolves to the authoritative document.  So is the URI of the _document_.

 

You can use that pattern today in Fedora 4 if you want to.  The big issue is that the onus is on _you_ to model things in that way.  The other, more esoteric issue is that there is only one way to have URIs of the _thing_ have the desired characteristic of being resolvable yet distinct from the authoritative document -- hash URIs.  Relaxing the single-subject restriction will allow more flexibility for the URIs of the _things_.

 

For example (in pseudo turtle) the contents of http://my.fedora/resource/painting123:

 

<> a fedora:resource ;
  fedora:serverManagedTripleIDontCareAbout "whatever" ;
  dc:creator "Fedo Raadmin" ;
  dct:created 2015 ;
  myns:describes <#RWO> .

<#RWO> a myns:RealWorldObject, myns:Painting ;
  dc:creator "Leonardo" ;
  dct:created 1503 ;
  myns:image <http://my.fedora.org/images/1234.jpg> .

 

Notice how both of the subjects we see in the RDF would resolve to the same authoritative document in a browser or client:

http://my.fedora/resource/painting123

http://my.fedora/resource/painting123#RWO
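
The reason both URIs reach the same document is that an HTTP client strips the fragment before making the request. A small Python sketch using the standard library shows this; the helper name is hypothetical, and the URLs are the ones from the example above.

```python
from urllib.parse import urldefrag


def document_url(uri):
    """Strip the fragment: the '#...' part is never sent to the server."""
    return urldefrag(uri).url


doc = "http://my.fedora/resource/painting123"
rwo = "http://my.fedora/resource/painting123#RWO"

# Two distinct identifiers, one retrievable document.
same_document = document_url(doc) == document_url(rwo)
distinct_identifiers = doc != rwo
rwo_fragment = urldefrag(rwo).fragment
```

This is what lets the hash URI name the painting while the plain URL names the Fedora resource that describes it.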

 

Anyway, you can think of LDP as providing an API to a _document_ repository, where these documents are _web resources_ (some of which may contain RDF).  The document (web resource) is the unit of management in the fedora spec, which builds upon LDP; authorization is at the document level, notifications are at the document level, etc. 

 

Not interested in putting your resources on the web, or linking to your resources, or traversing hyperlinks within your resources?  Then LDP (and the fedora spec) might not be interesting to you.

 

Fedora 3 was not really a technology of the web.  Unlike fedora 4, it really _did_ have a specific object model, though (fedora object as a collection of datastreams + properties + behaviours). There was a mapping of fedora 3 resources onto RDF (which included the contents of the RELS-EXT datastream, among others).  To be honest, it also suffers from the same RWO subtleties, at least if we look at that RDF.  <http://my.fedora/resource/painting123> is no more of a painting than <info:fedora/painting:123> is.  You sure have to do a lot more work to figure out what <info:fedora/painting:123> is, though.

 

  -Aaron


west...@umd.edu

Nov 10, 2017, 10:26:35 AM
to Fedora Community
Thanks to everyone for the productive contributions to this thread. I have learned a lot here.  

In my reading, David's last message has captured one of Nancy's main issues with the way the current Fedora works -- I say this merely as someone who has had conversations with her about this on several occasions, and with some reticence about speaking for someone else.

Namely, Fedora doesn't allow RDF statements with subjects that are well known and widely used identifiers external to the Fedora instance itself (one example being what NLM in shorthand calls the "permalinks" that they mint). Aaron's concise demonstration of how a hash URI can be used to accomplish the same thing is one possible way forward.  Relaxing the single subject restriction would, in my mind, be another.  The two solutions are not mutually exclusive.

Josh

Ralf Claussnitzer

Nov 10, 2017, 12:08:29 PM
to fedora-c...@googlegroups.com
> Fedora is an object repository. Not a triplestore.
This is actually a solid design decision right there.

+1 to what Jared wrote:

> I will say that I think that Fedora is useful as a LDP server and
> persistence layer for these instance objects. That storing triples is
> what a triplestore does best and that a solution to the issue right
> now would be to store these <real world object> <predicate>
> <information about real world object> triples in a triplestore and
> provide the "real thing" web presence via a separate layer from Fedora.

-Ralf

Aaron Birkland

Nov 13, 2017, 10:28:41 AM
to fedora-c...@googlegroups.com

I think part of the problem is that Fedora and LDP (the spec) neither constrain nor inform one’s worldview of repository resources; they’re just documents on the web.  Fedora4 the software _does_ constrain resources in a particular way (single-subject restriction), but the explanation is unsatisfying.  It seems to suggest that the object model is essentially JCR.  To me, that feels flexible, constraining, and anemic at the same time.

https://wiki.duraspace.org/display/FEDORA4x/The+Fedora+4+object+model

 

Another problem revealed in this thread is that some don’t seem to have use cases for repository resources as web resources at all.  In that world, resolvable URLs for resources don’t make sense, and HTTP and LDP are merely “the API and protocol of the day.”  Fedora 3 was firmly of this tradition, whereas Fedora 4 (and the fedora spec) are firmly in the “of the web” camp.  There is a bit of tension there that I don’t think has fully resolved.

 

In any case, eliminating the single-subject restriction would further tilt the balance toward Fedora being “of the web” and the implementation being more like a “document store”, where the notion of object modelling is purely a client/domain-level concern.  Depending on one’s perspective that may be a good or bad thing; so it is good that we’re having this debate.

 

  -Aaron

David Chandek-Stark

Nov 13, 2017, 10:58:22 AM
to fedora-c...@googlegroups.com
Aaron,

Thank you for stating this so succinctly, and I think, accurately.

I've been struggling to communicate the concern that for many client
applications built on Fedora as an object store the use of HTTP, RDF,
and LDP are internal implementation details that are not *essential*
to the architectural function of the object store (at least as we have
conceived it). If Fedora resource URLs are only *internally*
resolvable -- because, for example, Fedora is behind a firewall and/or
accessible only to a dedicated client application, then are they
really "web" resources per se? The distinction in this context
between a "web resource" and, say, a record in a relational database
seems arbitrary and unhelpful. Creating web-resolvable links to
representations of digital objects is in many cases considered the
proper responsibility of the client application, not the object
store/repository itself.

The issue ultimately is not whether RDF and LDP are useful and
important, but where in the application stack it makes the most sense
to implement those patterns and protocols. From my point of view, the
object store -- insofar as that is what Fedora is -- does not seem to
be the best place to inject these concerns.

Thanks,
David

On Mon, Nov 13, 2017 at 10:28 AM, Aaron Birkland <a...@jhu.edu> wrote:
>
> Another problem revealed in this thread is that some don’t seem to have use
> cases for repository resources as web resources at all. In that world,
> resolvable URLs for resources don’t make sense, and HTTP and LDP are merely
> “the API and protocol of the day.” Fedora 3 was firmly of this tradition,
> whereas Fedora 4 (and the fedora spec) are firmly in the “of the web” camp.
> There is a bit of tension there that I don’t think has fully resolved.
>

--
David Chandek-Stark
dchand...@gmail.com

Doron Shalvi

Nov 13, 2017, 11:40:44 AM
to Fedora Community
David and Aaron,

Thanks for these posts, which help clarify these issues for us.

Our primary motivation in using Fedora is to model and manage digital objects in a repository.  Yes, Fedora is an object repository, not a triplestore, and that is primarily what we look for from Fedora.

We would also like to provide these resources on the web, and to provide data about these resources on the web, so that others can link to them.  An LDP server that is aware of our public URIs (permalinks) would seem to meet this need.  We can meet this need in a layer outside of Fedora, as some have suggested.

However, LDP "defines a set of rules for HTTP operations on web resources, some based on RDF, to provide an architecture for read-write Linked Data on the web." (W3C)  If Fedora is not intended to serve the web layer, what is gained by Fedora as an LDP server?  Some have mentioned they still find Fedora useful as an LDP server even if it is primarily storing instance objects - I'd love to learn more about this use case.

If Fedora is primarily an object repository, and is not necessarily meant to provide an architecture for read-write Linked Data on the web, it may be more efficient not to require LDP in the Fedora specification or implementations.

Thanks all for the informative discussion.

Doron

Ralf Claussnitzer

Nov 13, 2017, 11:42:19 AM
to fedora-c...@googlegroups.com
Hi,


On 11/13/2017 04:28 PM, Aaron Birkland wrote:

> I think part of the problem is that Fedora and LDP (the spec) neither constrain nor inform one’s worldview of repository resources; they’re just documents on the web.  Fedora4 the software _does_ constrain resources in a particular way (single-subject restriction), but the explanation is unsatisfying.  It seems to suggest that the object model is essentially JCR.  To me, that feels flexible, constraining, and anemic at the same time.
>
> https://wiki.duraspace.org/display/FEDORA4x/The+Fedora+4+object+model

Good point. The F4 object model is "Cribbed heavily from the JCR Repository Model documentation". Maybe there is a good explanation somewhere on why JCR and LDP should be such a good match?


 

> Another problem revealed in this thread is that some don’t seem to have use cases for repository resources as web resources at all.  In that world, resolvable URLs for resources don’t make sense, and HTTP and LDP are merely “the API and protocol of the day.”  Fedora 3 was firmly of this tradition, whereas Fedora 4 (and the fedora spec) are firmly in the “of the web” camp.  There is a bit of tension there that I don’t think has fully resolved.

In a microservices (and REST) world - and I would say Fedora is following this approach - resolvable URIs for resources are very helpful.

 

> In any case, eliminating the single-subject restriction would further tilt the balance toward Fedora being “of the web” and the implementation being more like a “document store”, where the notion of object modelling is purely a client/domain-level concern.  Depending on one’s perspective that may be a good or bad thing; so it is good that we’re having this debate.

Fedora is designed to be web focused. Implementing LDP and its very flexible information model does not keep anybody from using it as a document store. But there should probably be more support for this kind of use case?

-Ralf

Benjamin Armintor

Nov 13, 2017, 1:29:47 PM
to fedora-c...@googlegroups.com
When Fedora 4 was initially being developed, it had its own REST API that called back to the FCR 3 REST API in some ways, and introduced the notion of actual parent-child relationships (a feature request that had some traction in the FCR 3 days). We developed this API (more than one, actually*) because Modeshape (MODE) did not at the time have a REST API. When LDP 1 was finalized, we switched to it as the basis of the API because it was such a close match to what we were trying to do already (and for that matter the basic FCR 3 model) and was defined with recourse to a much broader community of interest.

But LDP was not the driving force behind the FCR4 effort - that effort was motivated in large part by a desire to reduce the number of lines of code in the product to something more in line with the community's demonstrated capacity for maintenance; secondary but important concerns included some modernization (fuller support of HTTP 1.1) and better scalability (it's true). MODE had support for clustering and at the time a very flexible storage approach via Infinispan (ISPN), and the fact that URIs by their structure create directed acyclic graphs meant the fundamental data model was in parallel with FCR and LDP.

MODE was not the only backend we considered. If I'm remembering correctly, we also experimented with:
1. a "brownfield" approach building on Fedora 3
4. the Oxford databank (I think this currently exists in the project behind the OCFS threads on this list)

We also had a number of early experiments with triplestores, but as an index (not as the principal datastore) because binary storage was regarded as a requirement. Much of this is present in the "archived pages" in the wiki: https://wiki.duraspace.org/display/FF/Archived+Pages

If those are restricted, I hope they can be published in some way - we worked pretty hard in the FCR4 project to be transparent about the decisions that were being made.

In any case, the subject restriction is something that might be somewhat simpler to implement in MODE, but it's not a JCR characteristic that we're exposing via the LDP interface: it's the product of affirmative work by committers who think it's a characteristic of a digital object repository. Some of the particular places this has gone through argumentation over the course of FCR4 development include the subject of statements describing NonRDFSources and statements in versions.

I'm suspicious that there's something of a red herring in this thread about the URI issue: Fedora 3 URIs would have had the same issue if you tried to use object properties, DC and RELS-EXT/INT exclusively to represent a description - the info:fedora URI would not have been the URI of the descriptum, but of the description. You would have had to solve this problem by using Fedora 3 to store a serialized description as binary, and translate it into description again in middleware. You could pursue the same strategy in Fedora 4. Or, in the same way you might have understood the FCR 3 info:fedora URI as a substitutable prefix for an "actual" URI - depending on your PID management strategy - you could replace large portions of your Fedora 4 URI in a middleware layer.
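To make the prefix-substitution idea concrete, here is a minimal sketch of what such a middleware rewrite might look like. Everything in it is an assumption for illustration: the function name, the internal base URL, and the public permalink base are hypothetical, and a real deployment would depend on its own PID management strategy.

```python
def to_public_uri(repo_url: str, repo_base: str, public_base: str) -> str:
    """Swap the repository's base URL for a public URI prefix.

    Hypothetical helper: only the leading prefix is replaced, and the
    remainder of the path is carried over unchanged.
    """
    if not repo_url.startswith(repo_base):
        raise ValueError(f"{repo_url!r} is not under {repo_base!r}")
    return public_base + repo_url[len(repo_base):]


# Hypothetical bases: an internal Fedora 4 endpoint and a public permalink root.
REPO_BASE = "http://fedora.internal:8080/rest/"
PUBLIC_BASE = "https://example.org/id/"

public_uri = to_public_uri(
    "http://fedora.internal:8080/rest/books/bell-surgery",
    REPO_BASE,
    PUBLIC_BASE,
)
```

The same function run with the two bases swapped would map an incoming public URI back to the repository URL, which is roughly the role the middleware would play in both directions.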

However, the REST API was - for the constituencies represented in the Fedora 4 effort - the primary (if not principal) interface between the middleware (Samvera nee Hydra, Islandora, etc.) and the repository, and it too was regarded as a requirement. That said, if you earnestly feel that your descriptions are misrepresented by the subject URI, it's possible that your issue is not Fedora 4, but the particulars of how you've chosen (or your proxy has chosen) to persist those descriptions: Sometimes when we start criticizing LDP and Fedora 4, we're expressing a frustration with something else. For example:

You could configure MODE to use an unorderedHugeCollection as the root node. You could define a Fedora 4 class that was a BasicContainer with a single child - local name "datastreams" - that was a DirectContainer, with its own container as the membership resource and "info:fedora/fedora-system:def/view#disseminates" as the membership relation (ldp:hasMemberRelation). In this you would essentially recapitulate a rudimentary Fedora 3 object model of the type used by Hydra. By moving all the business of description into attached nonRDFSources and moving the responsibility for verifying inter-object relationships into the application layer, you would avoid the URI issue you describe and the performance problems associated with the "many members" scenarios. For that matter, you could contrive a Fedora 3 repository that presented itself as the above.

That wouldn't solve other problems we discuss in the context of LDP - for example, lack of bulk operations, or query APIs besides resource resolution, or the single subject restriction on "native" statements. But those are problems we have historically not solved or problems for which we have strained to reach consensus. It would also put the adopter of such an approach at odds with the software communities that - even in Fedora 3 - actively chose to use the repository resource as a surrogate for the subject of attached description, or those for whom having some sense of repository-side structural enforcement was desirable.

This is all to say that we should be very careful not to use "RDF" or "LDP" as a shorthand for "our patterns of use around RDF", etc.: I don't think there's anything in LDP per se that prevents you from using an implementation as a document store, or to store and manage for Europeana descriptions, around URIs distinct from the resource being managed via LDP, with the caveat that, if the question is the relationship between the resource as a representation of description and the subject as a digital object, httpRange-14 is a long-standing conundrum. But it's very possible that the FCR4 committers, or the authors of your application platform, are discouraging that pattern of use.




- Ben


--

You received this message because you are subscribed to the Google Groups "Fedora Community" group.

To unsubscribe from this group and stop receiving emails from it, send an email to fedora-community+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.
