Fwd: Dryad and Utopia


Todd Vision

Feb 19, 2012, 2:31:59 PM
to drya...@googlegroups.com
Hello all,

Steve Pettifer of Utopia Documents (a v. cool interactive PDF renderer for scientific literature, see http://getutopia.com) has written a Dryad plugin and is looking for our feedback.  I've already responded to him that (hopefully) the DOI issue he had is due to the now-fixed DataCite glitch, and Ryan and I should speak with him to confirm how the API is being used.  And in the long run, we could probably just feed him an <iframe>.

But for now, what do people think of the way the Dryad package is displayed? (It would only be there for an article with data in Dryad obviously).

Some questions/issues that occur to me:
- Should it include links to individual files, just their names, or route users to the data package only? 
- If we expose individual files like that, we should try to ensure that all the contextual metadata is getting downloaded by users who click links to individual files.
- Are we counting hits to pages and files through the API in the view and download counts?

Any thoughts?

Todd

---------- Forwarded message ----------
From: Steve Pettifer <steve.p...@manchester.ac.uk>
Date: Sat, Feb 11, 2012 at 7:36 PM
Subject: Dryad and Utopia
To: Todd Vision <todd....@gmail.com>


Hi Todd

Rather later than promised, I've finally got round to writing a proper plugin to link Dryad to Utopia; a screenshot of my attempt is attached (the formatting is ugly and done in a bit of a rush, so please ignore that, and it'll improve quite quickly once I know I'm showing the right data from you).

Assuming this is something you'd still like to go ahead with, I wonder whether you could have a think about the words and content in the plugin screenshot, and make any suggestions for improvements; I'm conscious that it's important to encourage people to cite the data, so I've put that fairly prominently in the box, but that can easily be moved if needed. Also I'm unsure about the terminology I've used for 'packages' etc -- I think I've probably made some of that up!

I'd also like to ask some questions about the data DOIs; I notice that many of them won't resolve through crossref, but do resolve to sensible things when clicked on Dryad itself (I think I can see what's happening on the dryad site, but would like to be able to have a quick chat about the technical aspects of that to make sure I'm dealing with those properly rather than reverse engineering from the HTML I can see). So I think I'm misunderstanding something important there!

We're hoping to be able to make a release of UD including these features within a week or two, and I'm hoping you still think this would be a useful addition for us both. 

I hope all is well with you

Best wishes

Steve



Screen Shot 2012-02-11 at 18.16.27.png

Ryan Scherle

Feb 19, 2012, 5:13:00 PM
to Todd Vision, drya...@googlegroups.com
On Feb 19, 2012, at 2:31 PM, Todd Vision wrote:
Some questions/issues that occur to me:
- Should it include links to individual files, just their names, or route users to the data package only? 
- If we expose individual files like that, we should try to ensure that all the contextual metadata is getting downloaded by users who click links to individual files. 

This seems to be nearly the same case as the Elsevier integration, so I would prefer to have the same type of information displayed. For Elsevier, we decided that we will display the filenames, but all links will go to the data package page. See: http://wiki.datadryad.org/Elsevier_Integration

- Are we counting hits to pages and files through the API in the view and download counts?

Yes, all of the accesses go through the same mechanism.


--- Ryan

Hilmar Lapp

Feb 19, 2012, 6:22:00 PM
to Ryan Scherle, Todd Vision, drya...@googlegroups.com
I find it rather odd frankly if links don't go to the place they suggest they go. So while I'm OK with the idea of not linking to the bitstreams, the links should then be such that they don't suggest they do, and not be such that they suggest they go to different places when in fact they go to the same.

In the attached screenshot (which suggests that this is a very nice plugin, BTW), and as I now see also in the mockup for the Elsevier integration, the links for the files very clearly suggest they go to the files, either to the bitstreams or to a file-specific page (as one would have in MediaWiki for uploaded files, for example). If that's not where they go to, don't have those links there. 

(The alternative of displaying the per-file links in a way that would make it clear that they go to the package would probably result in rather redundant links - because in fact they are redundant, so I don't see a good reason to have them there if you don't want to link to the files.)

And BTW why was linking to the individual files considered undesirable?

-hilmar
 

-- 
===========================================================
: Hilmar Lapp  -:- Durham, NC -:- informatics.nescent.org :
===========================================================



Greenberg, Jane

Feb 20, 2012, 5:00:42 AM
to Ryan Scherle, Todd Vision, drya...@googlegroups.com

*context* is king, or at least that’s the party line :-).

 

dryad’s model is data-package centric, despite file representation near-independence.

theoretically, there may be no problems w/ utopia approach (i like it..);  but the pressing question is if users understand the context.  indeed, consistency across projects is key, and the Elsevier approach can serve as a good model.  (Hilmar conveys links are not working, but is this just the temp. DOI situation that has been resolved or more?).

 

best wishes, jane

--

Vision, Todd J

Feb 20, 2012, 7:26:31 AM
to Dryad Developers
Jane, exactly.

We are trying to avoid users downloading a datafile without the accompanying readme or other metadata that they need to understand its contents, without the link to the paper, and without documentation of where/when it was published on Dryad. So my feeling is that even an individual file download should ultimately be packaged in a larger bundle (even if not a data package as we define it), and that bundle should at least include a human-readable manifest along with the datafile. I take this also to follow from the recommendations of the NISO-NFAIS for supplementary materials:
http://www.niso.org/apps/group_public/document.php?document_id=7964&wg_abbrev=suppbusiness

Our practice doesn't align with this very well yet, and I'm not sure how best to square the need to provide API access to the bitstream while preserving context. Ideas?
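To make the bundle idea above concrete, here is a minimal sketch in Python. It is only an illustration, not anything Dryad implements: the function name, the manifest wording, and the example file name are all assumptions. It wraps a single data file in a zip together with its readme and a human-readable manifest that points back to the data package and the article.

```python
import io
import zipfile


def build_bundle(file_name, file_bytes, package_doi, article_citation, readme_text):
    """Return zip bytes holding one data file, its README, and a manifest.

    Hypothetical helper: the manifest layout below is an assumption,
    not a Dryad or NISO-NFAIS format.
    """
    manifest = "\n".join([
        "MANIFEST",
        f"Data file: {file_name}",
        f"Part of data package: {package_doi}",
        f"Associated article: {article_citation}",
        "Please cite the data package when reusing this file.",
        "",
    ])
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as bundle:
        # Context first, so it is the first thing a user sees when unpacking.
        bundle.writestr("MANIFEST.txt", manifest)
        bundle.writestr("README.txt", readme_text)
        bundle.writestr(file_name, file_bytes)
    return buffer.getvalue()


# Example with made-up content; only the DOI form is borrowed from Dryad.
bundle_bytes = build_bundle(
    "alignment.nex",
    b"#NEXUS\n",
    "doi:10.5061/dryad.8bp5pp44",
    "(article citation goes here)",
    "Describes the contents of alignment.nex.\n",
)
```

An API could hand out such a bundle in place of the bare bitstream, so that every download carries its own context.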

In the absence of an elegant, immediate solution for managing this tradeoff, isn't it preferable just to direct users to the landing page for the package?

Todd

Hilmar Lapp

Feb 20, 2012, 12:06:02 PM
to Greenberg, Jane, Ryan Scherle, Todd Vision, drya...@googlegroups.com

On Feb 20, 2012, at 5:00 AM, Greenberg, Jane wrote:

(Hilmar conveys links are not working, but is this just the temp. DOI situation that has been resolved or more?

Just for clarification, I didn't say links are not working. I said that I think it's a bad (because counter-intuitive) user experience if links that, by the way they are shown, suggest they lead to different data files in reality all lead to the same data package. That has nothing to do with DOI resolution.

-hilmar

Greenberg, Jane

Feb 21, 2012, 6:18:57 PM
to Vision, Todd J, Dryad Developers
See below, following **

-----Original Message-----
From: drya...@googlegroups.com [mailto:drya...@googlegroups.com] On Behalf Of Vision, Todd J
Sent: Monday, February 20, 2012 1:27 PM
To: Dryad Developers
Subject: Re: [dryad-dev] Fwd: Dryad and Utopia

Jane, exactly.

We are trying to avoid users downloading a datafile without the accompanying readme or other metadata that they need to understand its contents, without the link to the paper, and without documentation of where/when it was published on Dryad. So my feeling is that even an individual file download should ultimately be packaged in a larger bundle (even if not a data package as we define it), and that bundle should at least include a human-readable manifest along with the datafile. I take this also to follow from the recommendations of the NISO-NFAIS for supplementary materials:
http://www.niso.org/apps/group_public/document.php?document_id=7964&wg_abbrev=suppbusiness

Our practice doesn't align with this very well yet, and I'm not sure how best to square the need to provide API access to the bitstream while preserving context. Ideas?

**if we were to implement the Dryad DCAP 3.0, there is a property, "dcterms:isPartOf/Associated Dryad Data Package Identifier", that could provide this link, so the user could easily get the full load of contextual material. Although it is supposed to be the key to this sort of linking, we don't have evidence of it working, b/c it hasn't been implemented. Indeed, Dryad could follow our suggestion below, and it may provide a solution. My concern is that it may place too much of a burden on the user (searcher), and could cause frustration too. A user may say... what happened to the link for the dataset I found? The link I clicked just gives me the publication or a higher level of information, and I want the nugget. The best approach in my opinion would be to provide the "isPartOf" association, or perhaps the description for individual data files needs to be revisited.

Mark Diggory

Feb 21, 2012, 6:51:31 PM
to Greenberg, Jane, Vision, Todd J, Dryad Developers
Hello Jane,

This should already be available in the Dryad metadata.

http://datadryad.org/handle/10255/dryad.37949?show=full
dc:relation.ispartof = doi:10.5061/dryad.8bp5pp44

http://datadryad.org/handle/10255/dryad.37948?show=full
dc:relation.haspart = doi:10.5061/dryad.8bp5pp44/1
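As a rough illustration of how a client could use these two fields to get from a file to its package context, here is a small Python sketch. The in-memory `RECORDS` table and both helper functions are hypothetical, and keying the records by DOI is an assumption; only the field names and DOI values are taken from the two records above.

```python
# Hypothetical in-memory copy of the two records; a real client would
# fetch the metadata from the repository instead.
RECORDS = {
    "doi:10.5061/dryad.8bp5pp44": {
        "dc.relation.haspart": ["doi:10.5061/dryad.8bp5pp44/1"],
    },
    "doi:10.5061/dryad.8bp5pp44/1": {
        "dc.relation.ispartof": ["doi:10.5061/dryad.8bp5pp44"],
    },
}


def package_for(identifier, records=RECORDS):
    """Return the data package a record belongs to.

    A record with no ispartof relation is treated as a package itself.
    """
    parents = records.get(identifier, {}).get("dc.relation.ispartof", [])
    return parents[0] if parents else identifier


def files_in(package_id, records=RECORDS):
    """Return the data file identifiers listed by a package's haspart relation."""
    return list(records.get(package_id, {}).get("dc.relation.haspart", []))
```

With helpers like these, a link shown next to a file name could always be resolved to the package landing page via `package_for`, while `files_in` supplies the file list to display.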

====

To the Group,  

There is existing functionality for exporting via DisseminationPackagers in DSpace.

I would caution that once you head down the road of producing a custom package format, it is easy to get caught expending a great deal of time debating what the package should contain or what its formats should be. Note that Dryad is already creating BagIt packages for TreeBASE handshaking (http://wiki.datadryad.org/BagIt_Handshaking).
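For readers unfamiliar with BagIt, the sketch below shows roughly what a minimal bag looks like on disk: a bagit.txt declaration, a data/ payload directory, and a checksum manifest. It uses only the Python standard library and is not the DSpace packager or the code Dryad uses for the TreeBASE handshake; the function name and the example files are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def write_bag(bag_dir, payload):
    """Write a minimal BagIt-style bag; payload maps file names to bytes.

    Hypothetical helper for illustration only.
    """
    bag_dir = Path(bag_dir)
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)

    manifest_lines = []
    for name, content in sorted(payload.items()):
        (data_dir / name).write_bytes(content)
        digest = hashlib.md5(content).hexdigest()
        # Manifest entries pair a checksum with a path relative to the bag root.
        manifest_lines.append(f"{digest}  data/{name}")

    # Bag declaration: BagIt version and the encoding of the tag files.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag_dir / "manifest-md5.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )
    return bag_dir


# Example with made-up payload files.
bag = write_bag(
    Path(tempfile.mkdtemp()) / "example_bag",
    {"README.txt": b"Describes alignment.nex.\n", "alignment.nex": b"#NEXUS\n"},
)
```

Because the readme travels inside the payload next to the data file, a bag of this shape would also address the concern about files being downloaded without their context.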

To give you an example in action today: MIT supports exporting an entire OpenCourseWare (OCW) package as a dissemination format. This just leverages the Packager Framework in DSpace to produce a Zip package with an IMS-CP manifest and the contents of the Item (many HTML, CSS and JavaScript bitstreams). The same mechanism could produce a METS package or, in your case, a BagIt package.


However, if you wanted the entire DataPackage+DataFiles, packaging more than one Item would require a more involved customization of such a DisseminationPackager, though it could be feasible.

Best,
Mark

--
@mire Inc. 
Mark Diggory
2888 Loker Avenue East, Suite 305, Carlsbad, CA. 92010
Esperantolaan 4, Heverlee 3001, Belgium
http://www.atmire.com


Greenberg, Jane

Feb 21, 2012, 7:08:21 PM
to Mark Diggory, Vision, Todd J, Dryad Developers

Hello Mark,

 

Yes, great, what you list below is essentially the same as the Dryad AP 3.0; dcterms is viewed as a bit more compliant w/DCAM, but this doesn’t matter in this context.

 

So, now, maybe I’m not understanding Todd’s question (or he would like more context w/the file metadata?). The one thing I notice is that I can’t click on the dc:relation metadata to easily navigate up to the parent/package metadata or down to the children. In other words, the relation IDs are not hypertext in Dryad, or at least via my view in Firefox. Is this the case w/other folks, or just me?

 

best wishes, jane


Mark Diggory

Feb 21, 2012, 8:17:15 PM
to Greenberg, Jane, Vision, Todd J, Dryad Developers
On Tue, Feb 21, 2012 at 4:08 PM, Greenberg, Jane <ja...@email.unc.edu> wrote:

Hello Mark,

 

Yes, great, what you list below is essentially the same as the Dryad AP 3.0; dcterms is viewed as a bit more compliant w/DCAM, but this doesn’t matter in this context.


There is talk about improving DSpace to use dcterms rather than dc elements and the antiquated approach to qualification that was in play when DSpace was initially designed.  This would probably be along the lines of adding a dcterms namespace and assigning all the dcterms as elements and leaving the qualification empty, then creating a custom namespace for those "terms" that the DSpace community "invented". 

There is significant work that would need to happen to really improve the platform and provide a proper migration path for existing instances. Given the degree of customization in Dryad, you can see how quickly the complexity increases if you consider the number of custom metadata fields that have been added to all the existing DSpace instances. But I digress; I'm just expressing why the community hasn't just "switched" at this time.
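To illustrate the kind of mapping such a migration implies, here is a small Python sketch. It is not DSpace code: the `DCTERMS` set is only a handful of real DCMI terms chosen for the example, and the function name and the "local" schema name are hypothetical. Fields whose qualifier (or unqualified element) matches a DCMI term move to the dcterms namespace with an empty qualifier, and everything else goes to a custom namespace.

```python
# Illustrative subset of DCMI Metadata Terms; not a complete list.
DCTERMS = {"abstract", "hasPart", "isPartOf", "issued", "title"}
_BY_LOWERCASE = {term.lower(): term for term in DCTERMS}


def migrate_field(field, local_schema="local"):
    """Map a 'schema.element[.qualifier]' field to (schema, element, qualifier).

    Hypothetical helper; 'local' stands in for whatever custom namespace
    would hold the community-invented terms.
    """
    parts = field.split(".")
    schema, element = parts[0], parts[1]
    qualifier = parts[2] if len(parts) > 2 else None

    # A qualified field is matched on its qualifier, an unqualified one
    # on its element.
    candidate = (qualifier or element).lower()
    if schema == "dc" and candidate in _BY_LOWERCASE:
        # The dcterms term becomes the element and the qualifier is left empty.
        return ("dcterms", _BY_LOWERCASE[candidate], None)

    # Anything without a matching DCMI term keeps its shape in the custom namespace.
    return (local_schema, element, qualifier)
```

Every locally added field falls into the second branch, which gives a feel for why heavily customized instances make the migration path harder to define.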
 

So, now, maybe I’m not understanding Todd’s question (or he would like more context w/the file metadata?). The one thing I notice is that I can’t click on the dc:relation metadata to easily navigate up to the parent/package metadata or down to the children. In other words, the relation IDs are not hypertext in Dryad, or at least via my view in Firefox. Is this the case w/other folks, or just me?


The presentation of the list of DataFile items in the DataPackage view is based on these metadata fields for dc.relation.hasPart 


and the linking in the DataFile Summary View back to the DataPackage is based on dc.relation.isPartOf.


In production, these redirect through the doi proxy. However, in the next version, these have been improved/redesigned to utilize a local "resource" representation of the DSpace item as the following:

Data Package


and DataFile


This approach utilizes a new "External Identification" service we designed for DSpace that allows for the minting and resolution of external identifiers for DSpace resources during ingest so that additional identifier schemes (DOI in this case) can be supported easily in the platform.

Still, the related hasPart/isPartOf representation is captured in the contents of the page below the metadata table by embedding links and descriptive details for the related resource, not based on linking of the text within the table.

Cheers,
Mark


