Publishing assertions for deletes vs. other reasons


Jerome Grimmer

Mar 4, 2016, 11:46:10 AM
to learnin...@googlegroups.com

All,

We’ve hesitated to publish deletes to the LR for resources it contains, for a few reasons.  The big one is that a resource may be valuable to the educational community at large yet not meet the criteria for our audience.  It wouldn’t be accurate or appropriate for us to assert that a resource is no longer valid when it simply doesn’t meet our criteria, so we would not publish those as deletes to the LR.

 

The reasons I’ve come up with so far for deletes are:

  • Not Found
    • This includes 404 errors as well as error pages that technically return a status of 200 OK but show a friendly error message indicating the resource is not found.
  • Domain for Sale
    • Obviously, if the domain is for sale, the resource is dead.
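A minimal sketch of how a link checker might separate these two delete reasons from a healthy page. It assumes the page has already been fetched; the function name and the phrase lists are hypothetical and would need tuning against real pages.

```python
# Hypothetical phrase lists; a real checker would tune these against observed pages.
SOFT_404_PHRASES = ("page not found", "no longer available", "could not be found")
FOR_SALE_PHRASES = ("domain is for sale", "buy this domain")


def classify_link(status_code: int, body_text: str) -> str:
    """Return 'not_found', 'domain_for_sale', or 'ok' for one fetched page."""
    text = body_text.lower()
    if status_code == 404:
        return "not_found"
    if any(phrase in text for phrase in FOR_SALE_PHRASES):
        return "domain_for_sale"
    # "Soft 404": the server answers 200 OK but the page itself reports an error.
    if status_code == 200 and any(phrase in text for phrase in SOFT_404_PHRASES):
        return "not_found"
    return "ok"
```

Phrase matching like this is only a heuristic, so a result other than "ok" is probably best treated as a candidate for review rather than an automatic delete.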

 

We have run into situations where the resource leads to a page which has nothing to do with education, although the page programmatically returns 200 OK.  While the end result in our system is the same, I’m not sure if the assertion should be published as a “delete” or if we should publish it with some other verb:

  • Inappropriate
    • Contains text which makes it inappropriate for the education community as a whole.  For example, an online gaming site.  It may very well be that the resource was legitimate at one time, but someone else may have purchased the domain and set up a page which is not appropriate for the LR community.

 

I’d like to start some discussion on how best to publish assertions for this situation so that we can come up with a standard way of handling it.  That way everyone understands the assertion and can apply their own business logic to it.  What we’re looking at here is essentially the opposite of the “endorse” verb.  Which standard verb would be appropriate in this situation, so that the most people can understand it?

 

Jerome Grimmer

Southern Illinois University Carbondale

2450 Foundation Drive Suite 100

Springfield, IL

Phone: 217-786-3010 ext. 5857

Toll-free: 1-800-252-4822 ext. 5857

NOTE: My E-mail address has changed

jgri...@siuccwd.com

This email sent using 100% recycled electrons.

 

Steve Midgley

Mar 4, 2016, 6:05:00 PM
to learnin...@googlegroups.com
Hey Jerome,

Good topic! This is pretty important, and speaks to the larger question of paradata as well I think.

Right now LR supports pretty much one delete, and it can mean several things. Primarily:
  1. Delete the metadata that I previously published
  2. Delete the resource that the metadata referred to
  3. I would like to disavow my affiliation with the resource my metadata referred to
They are all sort of mixed up, as you point out. It seems smart to try to split up these statements. A big question is which of these should be processed internal to LR and which ones should be left for the data consumers to sort out?

Clearly item #1 should be handled within LR. I think item #2 should be handled at least to some degree in LR. Item #3 and related statements "feel" like paradata to me, which means LR should transmit them, but not process them?

Item #2 is the tricky bit: how do we communicate it ("delete this resource from LR") as distinct from a delete for #1 ("delete this metadata from LR")?

As I've talked with the developers in-house, we think that this can be solved by providing a single delete capability (HTTP DELETE on "/resource"), where you specify either a doc_id (LR envelope ID) or a URL. If you provide the URL, it's #2; if you provide the doc_id, it's #1.

If you delete your metadata, then that metadata is expunged. If you delete a resource, your metadata about that resource is expunged AND that resource gets a "delete vote" statement attached to it with your identity on it. 
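The proposal above can be sketched as a small dispatch function. This is only an illustration under stated assumptions: `plan_delete` and the action names are hypothetical, and nothing here reflects an existing LR API.

```python
from typing import Optional


def plan_delete(identity: str, doc_id: Optional[str] = None,
                url: Optional[str] = None) -> list:
    """Describe what one DELETE /resource call would do under the proposal."""
    if (doc_id is None) == (url is None):
        raise ValueError("provide exactly one of doc_id or url")
    if doc_id is not None:
        # Case #1: only the publisher's metadata envelope is expunged.
        return [("expunge_metadata", doc_id)]
    # Case #2: expunge this identity's metadata about the resource
    # and attach a delete vote carrying the same identity.
    return [("expunge_metadata_for_url", url, identity),
            ("delete_vote", url, identity)]
```

Rejecting a call that supplies both or neither of the two keys keeps the overloaded endpoint unambiguous.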


??
Steve




Abraham Sánchez

Mar 7, 2016, 11:12:24 AM
to Learning Registry Developers List
Hi Steve & Jerome,

Steve's solution sounds very simple and straightforward to me. Just wanted to say a couple of things:

About #2 
I think it would be better (from a developer's point of view) to have separate endpoints for the "delete metadata" and "delete resource" operations, mainly to avoid confusion and make it clear they perform different operations.
But at the same time they're closely related, so they could be nested entities in some way. Something similar to this:
  • DELETE /documents/{doc-id} => Doc ID is specified so the document (LR envelope) is deleted
  • DELETE /documents/{doc-id}/resources/{resource-url} => Both Doc ID and Resource URL are specified, so metadata is removed and a new delete vote is added to the resource. This seems like the most thorough way to delete both metadata + resource. Using the unnested version (DELETE /resources/{resource-url}), we'd have to locate the LR envelope using the URL and the user in order to remove the metadata. If we don't allow more than one envelope for the same resource and user, then it's fine, but if the system can't guarantee that we'd have a hard time deleting the doc metadata.
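One practical detail of the nested form is that the resource URL has to be percent-encoded before it can sit in a path segment. A minimal sketch, assuming the two routes above; the helper names are hypothetical.

```python
from urllib.parse import quote


def document_path(doc_id: str) -> str:
    """Path that deletes only the LR envelope."""
    return "/documents/" + quote(doc_id, safe="")


def document_resource_path(doc_id: str, resource_url: str) -> str:
    """Path that deletes the envelope and adds a delete vote to the resource."""
    # safe="" also encodes "/" and ":", so the URL stays a single path segment.
    return document_path(doc_id) + "/resources/" + quote(resource_url, safe="")


print(document_resource_path("abc123", "http://example.org/lesson?id=7"))
# prints /documents/abc123/resources/http%3A%2F%2Fexample.org%2Flesson%3Fid%3D7
```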
About #3
What about having a "disavow vote", in the same way that you're proposing delete votes? That would imply that someone no longer stands behind a particular resource but doesn't want it to be deleted. Does that make sense?

Abraham Sánchez

Steve Midgley

Mar 7, 2016, 4:18:43 PM
to learnin...@googlegroups.com
Thanks Abraham. I think your point about not overloading the HTTP verbs is very strong. I tend to make things too compact.

It raises the opportunity to create endpoints/APIs for handling specialized metadata, which I think a "delete resource" request is. It's basically paradata, similar to "bookmarked" or "favorited."

So possibly we just need to start defining some new "paradata publish" interfaces for the system, to make it much easier to submit standardized forms of paradata?
    • DELETE /documents/{doc-id} => Doc ID is specified so the document (LR envelope) is deleted
      • +1 - this is right
    • DELETE /documents/{doc-id}/resources/{resource-url} 
      • I don't agree entirely with this one b/c I think it's possible we might want to submit a "delete" for a resource we've never submitted metadata on? What about:
      • DELETE /paradata/resource/{resource-url}
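A rough sketch of what the paradata route might imply: a delete vote keyed only by resource URL and identity, with no doc_id involved, so a party that never published metadata can still vote. The class and function names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import quote


def paradata_delete_path(resource_url: str) -> str:
    """Path for a delete vote on a resource, with no doc_id required."""
    return "/paradata/resource/" + quote(resource_url, safe="")


class DeleteVotes:
    """Hypothetical in-memory tally of delete votes, keyed by resource URL."""

    def __init__(self) -> None:
        self._votes = defaultdict(set)

    def vote(self, identity: str, resource_url: str) -> int:
        """Record one vote and return the number of distinct voters so far."""
        # A set keeps one vote per identity, so repeated requests are idempotent.
        self._votes[resource_url].add(identity)
        return len(self._votes[resource_url])
```

Keeping one vote per identity leaves it to data consumers to decide how many distinct voters are enough to act on.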
    Steve

    Jerome Grimmer

    Mar 7, 2016, 4:52:06 PM
    to learnin...@googlegroups.com

    Through our automated link checker, Illinois has found resources published to the Learning Registry that are no longer available.  In some cases, a request to the URL returns a 404 (Not Found) error; in other cases we’ve determined that the request is being redirected to a human-friendly error page, again indicating that the resource is no longer available.  In a high percentage of cases, these are resources that Illinois *did not* publish the metadata about.  We simply want to let the LR community know that the resource is, in our experience, no longer available.

     

    In the third scenario (the “inappropriate” one), Illinois wants to let the LR community know that this resource may not be appropriate for the education community as a whole.  For example, let’s suppose that NSDL submitted a resource which at one time was valid.  Now, however, the domain is owned by an online casino, and they’ve programmed their website to redirect any 404 errors to their home page.  Programmatically, the URL is technically still good, as it probably redirects with a 302 Found status, but the resource is no longer appropriate for the education community.  In such a case, I think the “disavow” paradata assertion might work well enough; it can be up to the consumer to decide if they want to include the resource in their system.  For this, I’d think the LR would want to see a standard recipe for this paradata assertion.  If everyone follows the standardized format of the assertion, it makes it easier for everyone to understand and handle as they see fit, instead of having one group do it with one verb, and another group do it with another verb.
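To make the idea of a standard recipe concrete, here is one possible shape for such an assertion. This is a sketch only: the field names, the "disavowed" verb, and the reason codes are hypothetical and are not taken from the LR paradata specification.

```python
import json

# Hypothetical reason codes drawn from the scenarios discussed in this thread.
ALLOWED_REASONS = {"not_found", "domain_for_sale", "inappropriate"}


def build_assertion(actor: str, verb: str, resource_url: str, reason: str) -> dict:
    """Build one assertion; every publisher would use the same keys and codes."""
    if reason not in ALLOWED_REASONS:
        raise ValueError("unknown reason: " + reason)
    return {"actor": actor, "verb": verb, "object": resource_url, "reason": reason}


assertion = build_assertion("Illinois", "disavowed",
                            "http://example.org/old-lesson", "inappropriate")
print(json.dumps(assertion, sort_keys=True))
```

With a fixed verb and a closed set of reason codes, a consumer could filter on `reason` and apply its own policy without guessing at each publisher's vocabulary.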

     

    There’s a fourth scenario that I haven’t even gotten into yet, one for which Illinois would likely not publish paradata back to the LR at all.  For example, we found lots of resources published by the American Association of Physics Teachers.  These are mostly research papers, and would definitely be valuable to certain segments of the education community; however, we might want to remove them because there is a cost to access them.  Illinois’ system is about open education resources, as well as free ones which might not be open, but when we encounter a site with nothing but resources for sale, we tend to exclude it from our index.  We would likely NOT publish paradata assertions about these back to the LR; we’d simply remove them from our system.

     

    Again, in a high percentage of cases, the paradata assertions we’d be publishing back would NOT be about resources that we published metadata for, as we could simply delete those easily enough with the replaces API already in place.

     

    Jerome Grimmer


     
