Tim,
first of all: wow:-) A deep bow towards Boston (oops, sorry, Cambridge:-) for the work you have put into this!
I have made some minor (editorial) comments; I've attached an annotated copy of the PDF file. I hope you can read it properly. I have only one slightly more substantial comment (also in the text): on the top of page 11, you write:
• Creator Identifier(s): ORCiD ID(s)6 of the individual creator(s).
I have lots of sympathy for ORCiD, but I wonder whether we are in a position to put a stake in the ground and _require_ ORCiD. There are other possible identifier schemes that publishers are discussing (e.g., ISNI); others favour a distributed approach whereby authors (and individuals) use HTTP URIs to identify themselves (WebIDs). I think the jury is still out on this. Just referring to unique identification of authors, and using ORCiD as a good example, might be a better approach.
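To make the "scheme-neutral" idea concrete, here is a minimal sketch in Python, purely illustrative. The class name, the scheme labels ("orcid", "isni", "uri") and the example.org URI are hypothetical, not anything taken from the paper. It assumes that ORCID iDs, and ISNI as well, use the ISO 7064 MOD 11-2 check character, and it treats any http(s) URI as a possible Web ID.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


def mod11_2_check_char(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for a digit string."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


@dataclass(frozen=True)
class CreatorIdentifier:
    """A creator identifier that names its scheme instead of assuming one."""
    scheme: str  # hypothetical labels: "orcid", "isni", "uri"
    value: str

    def is_well_formed(self) -> bool:
        if self.scheme in ("orcid", "isni"):
            # Both are written as 16 characters, often grouped in fours.
            compact = self.value.replace("-", "").replace(" ", "").upper()
            if len(compact) != 16:
                return False
            if not all(c in "0123456789" for c in compact[:15]):
                return False
            return mod11_2_check_char(compact[:15]) == compact[15]
        if self.scheme == "uri":
            # Distributed approach: any http(s) URI may identify a person.
            parts = urlparse(self.value)
            return parts.scheme in ("http", "https") and bool(parts.netloc)
        return False


creators = [
    CreatorIdentifier("orcid", "0000-0003-0782-2704"),
    CreatorIdentifier("uri", "https://example.org/people/alice#me"),
]
for c in creators:
    print(c.scheme, c.value, c.is_well_formed())
```

The only point of the sketch is that the metadata field carries a (scheme, value) pair, so ORCiD is one accepted scheme among several rather than a requirement.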
As for the comment on the Data Access Methods section: I must admit I do not really understand it: 'Point 3 follows a linked data model that is library applicable but might not translate to other stakeholders.' The third option, i.e., using a <link> element in an HTML file, is not substantially different from points 1 or 2 in my view; all three aim at the same issue: finding an associated reference when accessing a specific link.
What is indeed not really clear from the text, now that I re-read it, is what the goal of this section is in this specific context. Is it
- I dereference the URI, I get a landing page in HTML, and I want to have an automatic way to access the metadata?
- I dereference the URI, I get a landing page in HTML, and I want to have an automatic way to access the _real_ data content itself?
My feeling is that we are talking about the former; if so, this should be clearly stated in the section: I may need a way to access, say, the JSON version of the metadata based on the URI. And for that, I need a way to access that data itself. I wonder whether, just by making the goals a bit clearer, we would not answer the comment. (Actually, switching the order between this section and the previous one, i.e., just talking about these mechanisms in relation to the 'content encoding of landing pages', may also help, possibly not even keeping it as a separate section.) That being said: I am not even 100% sure that this section is strictly necessary for the paper. It goes into a level of technical detail that may be unnecessary. I.e., another way of answering the comment is: nuke the section! :-)
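For the first reading (landing page in HTML, then automatic access to the metadata), the <link> option could look roughly like the sketch below. It is a minimal illustration under stated assumptions: the landing page advertises its metadata with rel="alternate" plus a media type, and the repository URL, the sample HTML, and the function names are all hypothetical. It parses a string only and makes no network request.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetadataLinkFinder(HTMLParser):
    """Collect (media type, href) pairs from <link rel="alternate"> elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if "alternate" in rels and attributes.get("type") and attributes.get("href"):
            self.links.append((attributes["type"], attributes["href"]))


def find_metadata_url(landing_url, html, wanted_type):
    """Return the absolute URL of the metadata in the wanted media type, if advertised."""
    finder = MetadataLinkFinder()
    finder.feed(html)
    for media_type, href in finder.links:
        if media_type.lower() == wanted_type:
            return urljoin(landing_url, href)
    return None


# Hypothetical landing page, standing in for what dereferencing the URI returns.
LANDING_URL = "https://repository.example.org/dataset/123"
SAMPLE_HTML = """<html><head><title>Dataset 123</title>
<link rel="alternate" type="application/json" href="/dataset/123.json">
<link rel="alternate" type="text/turtle" href="/dataset/123.ttl">
</head><body>Human-readable description of the dataset.</body></html>"""

print(find_metadata_url(LANDING_URL, SAMPLE_HTML, "application/json"))
```

Seen this way, the <link> element is just one more route from the landing page to "an associated reference", much like the other two options.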
If we decide to keep the text, some more comments came to my mind:
- Item #1: in fact, it does not necessarily require a 'webmaster', but may require special web server knowledge as well as privileges (in Apache one might be able to use the '.var' mechanism to set up content negotiation; if the user has the right to do so, that may be enough, without being a webmaster). I would rather say 'requires more than average web knowledge and possibly privileges'.
- Item #1: it may be worth emphasizing that this makes it possible to provide the same content in different formats, with the end user setting the priority. I.e., the same metadata may be there in Turtle, RDF/XML, and JSON, and the user decides which one to choose.
- Item #2: the same comments as item #1 above in terms of webmaster. People who have the right to provide, e.g., a PHP script on a web site may also implement this; no need for a webmaster position.
- Item #3: I think the example is unnecessary or, more exactly, either we have an example for all three bullet points or for none of the three.
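To illustrate the 'end user sets a priority' point for item #1, here is a minimal sketch of how a server could pick a representation from the HTTP Accept header. It is a simplified illustration, not how Apache implements it (there the server does this work for you once the negotiation is configured), and the function names and the list of available formats are assumptions.

```python
# Formats the (hypothetical) repository can serve for the same metadata.
AVAILABLE = ["text/turtle", "application/rdf+xml", "application/json"]


def parse_accept(header):
    """Return (media range, q) pairs from an Accept header value."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_range, q))
    return prefs


def matches(media_range, media_type):
    """True if a media range such as 'text/*' or '*/*' covers the media type."""
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return media_type.startswith(media_range[:-1])
    return media_range == media_type


def specificity(media_range):
    """Exact types outrank 'type/*', which outranks '*/*'."""
    if media_range == "*/*":
        return 0
    return 1 if media_range.endswith("/*") else 2


def choose_format(accept_header, available=AVAILABLE):
    """Pick the available media type the client prefers most, or None."""
    prefs = parse_accept(accept_header)
    best, best_q = None, 0.0
    for media_type in available:
        candidates = [(specificity(r), q) for r, q in prefs if matches(r, media_type)]
        if not candidates:
            continue
        q = max(candidates)[1]  # the most specific matching range decides q
        if q > best_q:  # on ties, the server's own order wins
            best, best_q = media_type, q
    return best


print(choose_format("text/turtle;q=0.5, application/json"))
```

A client that sends 'text/turtle;q=0.5, application/json' gets the JSON version, while the same URI serves Turtle to a client that asks only for Turtle.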
I am around tomorrow (Monday) and Tuesday if you need more help.
Thanks and a happy new year to you (and everybody else!)
Cheers
Ivan
> To unsubscribe from this group and stop receiving emails from it, send an email to idmeta+un...@force11.org.
> <Revisions letter accessibility PeerJ.doc>
> <achieving-human-machine-73.pdf>
>> Begin forwarded message:
>>
>> From: PeerJ <peer....@peerj.com>
>> Subject: Decision on your PeerJ submission: "Achieving human and machine accessibility of cited data in scholarly publications" (#2014:12:3509:0:0:REVIEW)
>> Reply-To: PeerJ <peer....@peerj.com>
>> To: Tim Clark <tim_...@harvard.edu>
>> Date: December 25, 2014 at 1:11:20 PM EST
>>
>> PeerJ
>> Thank you for your submission to PeerJ. I am writing to inform you that in my opinion as the Academic Editor for your article, your manuscript "Achieving human and machine accessibility of cited data in scholarly publications" (#2014:12:3509:0:0:REVIEW) requires a number of major revisions before we could accept it for publication.
>>
>> The comments supplied by the reviewers on this revision are pasted below. My comments are as follows:
>>
>> Editor's comments
>>
>> Regarding concerns expressed by reviewers 1 and 2, I will start by clarifying that I have been asked to consider this submission to be within scope for PeerJ, so questions of appropriateness have not been considered in this review.
>>
>> I am very supportive of the spirit and the goals of this work, and of the Force11 effort in general. However, I share Reviewer 3's concerns about the readability and clarity of this paper.
>>
>> As written, this submission does not provide enough context about the goals of the Force11 effort, the need for human and machine accessibility of data, and the specific mechanisms being discussed. As someone who has followed these efforts, and who is sympathetic to the goals, I found myself a bit befuddled by the content of the JDDCP principles (why not cite all 8?) and some of the acronyms in table 1 (NBN? N2T ARK?). I fear that readers who are less familiar with these topics would be thoroughly confused.
>>
>> Given that this paper is trying to argue for a set of practices that would involve a change of practice for many potentially recalcitrant investigators, I suggest the addition of some additional introductory material that would more clearly express the need for this sort of data description, the existing landscape, and the potential solutions. A strong and clear description that would convince readers that this sort of description is both possible and not unduly burdensome would be most effective for meeting the Force11 goals. I fear that the paper as is would befuddle readers and hinder realization of these goals.
>>
>> All reviewers provided useful feedback - I suggest accounting for their concerns. Defining acronyms and providing the complete JDDCP definitions would be particularly useful. I also identified two questions that I would like to see discussed:
>>
>> 1. Regarding machine accessibility, is REST the only possible approach? Some repositories might, for example, prefer SPARQL access to triple stores - would that not be considered accessible? Some discussion might help.
>>
>> 2. Regarding descriptions of software, is it reasonable to discuss URIs for software tools?
>>
>> Please be aware that we consider these revisions to be major, and your revised manuscript will probably have to be re-reviewed.
>>
>> If you are willing to undertake these changes, please submit your revised manuscript (with any rebuttal information*) to the journal within 60 days.
>>
>> * Resubmission checklist:
>>
>> When resubmitting, in addition to any revised files (e.g. a clean manuscript version, figures, tables, which you will add to the "Primary Files" upload section), please also provide the following two items:
>>
>> • A rebuttal Letter: A single document where you address all the Editor and reviewers' suggestions or requirements, point-by-point.
>> • A 'Tracked Changes' version of your manuscript: A document that shows the tracking of the revisions made to the manuscript. You can also choose to simply highlight or mark in bold the changes if you prefer.
>>
>> Accepted formats for the rebuttal letter and tracked changes document are: docx (preferred), doc, or PDF.
>>
>> As you previously uploaded a single manuscript file for your initial submission you will need to upload any primary high resolution image and table files separately if you have not already done so.
>>
>> Harry Hochheiser
>> Academic Editor for PeerJ
>>
>> Reviewer Comments
>>
>> Reviewer 1 (Anonymous)
>>
>> Basic reporting
>>
>> There are some confusing aspects to Table 1. Please clarify how the HTTP(s) and PURL URI identifier schemes meet JDDCP criteria if they ‘fail’ upon object removal; this does not appear to agree with Principle 6. How can these ‘achieve persistence’ if they may not persist?
>>
>> Further attention to clarity in the text, with an eye to removing possible ambiguities and redundancies, would make the article read better. I urge the authors to consider issues such as the following:
>> 1. Page 2, paragraph 2: use of parentheses
>> 2. Page 2, paragraph 3, sentence 2: what does ‘It’ refer to?
>> 3. Page 2, paragraph 3: introduce the acronym DCIG before using it later on the same page
>> 4. Throughout the manuscript: minor errors in spelling, punctuation, and sentence structure
>> 5. Page 3, paragraph 1, sentence 1: is ‘has’ the correct word here? Perhaps ‘reflects’ or ‘demonstrates’, etc?
>> 6. Page 3, paragraph 4: is ‘vend’ the correct word to use?
>> 7. Page 5: a bullet point related to #3 refers to “b”; do the authors mean “2”?
>> 8. Page 6, paragraph 3: do the authors mean ‘as a draft of NISO JATS version 1.1’, as ‘1.1d2’ denotes the second draft?
>>
>> Experimental design
>>
>> Many of these stated criteria for this area are not relevant to the article, which does not describe primary research in the Biological Sciences, Medical Sciences, or Health Sciences. However, the subject matter of the article (guidelines/proposed methods for improving access, citation, and deposition of data related to scholarly publications) is certainly applicable to Biological, Medical, and Health sciences. I leave it to the Academic Editor to determine whether the article is appropriate for this publication.
>>
>> Validity of the findings
>>
>> Please see my comments under Experimental Design above.
>>
>> Comments for the author
>>
>> The guidelines are generally clear and contain sufficient detail for implementation. The manuscript is well-organized in presenting them.
>>
>> Reviewer 2 (Tim Vines)
>>
>> Basic reporting
>>
>> This paper is a short note that provides practical guidance on the implementation of the standards for data deposition and discoverability formulated in the Joint Declaration of Data Citation Principles. The writing is generally clear, although a bit choppy in places. Please see the attached pdf for specific suggestions.
>>
>> I found it hard to identify the intended audience for the paper. I suspect that the technical details will make most sense to web designers or database engineers that are making decisions on the design and appearance of data entries and the scholarly work that cites those entries. However, the introduction seems to be aimed at a broader audience, perhaps those in publishing, libraries or research institutions that are encountering the Force11 initiative for the first time and need to be convinced of the need for standardizing data citation and hosting practices. Even if that’s not the intent, it would benefit the article to add a little more detail and explanation throughout, and to strenuously avoid acronyms and other terms that outsiders (like me) may find opaque. A more accessible article would likely reach a broader audience, which can only be a good thing.
>>
>> Experimental design
>>
>> There is no original primary research being presented here – the work of choosing these particular standards while rejecting others has clearly gone on beforehand, and the paper presents the conclusions of that work.
>>
>> Validity of the findings
>>
>> I am not well placed to comment on whether the guidance presented here is valid or the best possible practice, particularly because there is no detail on how the presented solution was decided upon.
>>
>> Comments for the author
>>
>> I'm supportive of this paper, as it's important to have a published version of record that others can point to when considering data citation issues. However, the paper is not obviously within the Biological, Medical or Health Sciences, and seems closest to Computer Science. Moreover, the paper does not present ‘research’ as such, as all the evaluation of various options (and the process behind that) is not presented. The article may therefore not be a good fit to PeerJ’s remit, and this is the reason behind my 'reject' recommendation. The final decision on suitability is, of course, up to the editor.
>>
>> Annotated manuscript
>>
>> The reviewer has provided feedback as annotations on the manuscript PDF.
>>
>> Reviewer 3 (Anonymous)
>>
>> Basic reporting
>>
>> Although this is intended as a brief piece to provide operational guidance, as it will be published as a journal article rather than a technical report, I suggest providing additional background and explanation throughout the paper. Consider that the relevant audience may be more than just repository managers who are highly proficient with technical jargon, but also other related stakeholders e.g., in managerial or other advisory positions. Additional explanation and clarification would make this paper more accessible and appropriate for the publication venue.
>>
>> Title:
>> The title is broad considering the specific focus of the article. The current title reflects the overall goal of the Force11 data citation principles, rather than the specific points addressed by this article.
>>
>> Introduction:
>> Provide some background on Force11 as not all readers may be intimately familiar with the organization. Why are they trustworthy? What is their mission? Why the JDDCP, given the existence of other guidelines for data citation? How does this relate to other guidelines for data citation and metadata? Consider that the reader might appreciate a listing of all 8 principles in the introduction for additional context.
>>
>> You state that the JDDCP “deliberately” does not provide implementation guidelines. Why not?
>>
>> Don’t leave me hanging: what are some other specific implementation issues that we can expect to be addressed in the future?
>>
>> Provide parenthetical acronym for Data Citation Implementation Group (DCIG) the first time it is used.
>>
>> What is Machine Accessibility?:
>> Only a very cursory definition/description of machine accessibility is provided, yet guidelines for machine accessibility are the main point of the article. Some additional description with attention to reasons for the importance of machine accessibility would be appreciated. Consider a brief description of RESTful Web services; although this is a standard for accessing functions, for others to use them they would need documentation. Is provision of documentation a best practice?
>>
>> Unique Identification:
>> What do you mean by “long term commitment to persistence”?
>>
>> If the criteria in Table 1 are important, why are they not introduced and discussed in the main body of the text? Is there a particular recommendation on any of these criteria?
>>
>> Landing pages:
>> “First, as ‘mandated’ in the JDDCP” – consider word choice here with ‘mandated’, may be too strong.
>>
>> Sentence order in the first paragraph here does not flow appropriately. After the “First” point the sentence explaining more about metadata should follow. After the “Second” the sentence explaining credential validation should appear.
>>
>> “Landing pages should combine human-readable and machine-readable information on a selection of the following items” – What selection should I choose from the list? Are these all optional items?
>>
>> “Explanatory or contextual information” – Should the documentation be part of the landing page or a separate document?
>>
>> “Dataset descriptions” – Need better definition of what “description” is in this context.
>> Regarding persistence and data availability, and the persistence of metadata beyond de-accessioning: should this be parallel to journal articles? Why is data different?
>>
>> Minimum acceptable information on landing pages:
>> Under 2, the 6th bullet, is there a particular ISO standard that can be referenced?
>>
>> Best practices for dataset description:
>> What kind of description are we talking about here exactly? Why is it safe to say that a standard that has only been very recently released is already widely used and settled? Do you anticipate the release of any additional domain specific standards?
>>
>> Data access methods:
>> Consider providing here some additional explanation. Points 1 and 2 are generically applicable. Point 3 follows a linked data model that is library applicable but might not translate to other stakeholders.
>>
>> Persistence guarantee:
>> Can you make the relationship to persistent identifiers more explicit? This section feels somewhat overreaching. Is this a little much for citation practices? Or is this really a trusted repository issue?
>>
>> Additional comments:
>> Are there any existing examples out there that already meet these criteria that you can share?
>>
>> References:
>> In an article on citation standards, it is imperative that the reference list be correctly formatted and provide all necessary information to easily retrieve the listed documents. Check the author guidelines and make certain that both in-text and full citations in the reference list are done appropriately. For example, provide URLs and access dates for technical reports. It sends a mixed message to promote higher standards for data citation than document citation.
>>
>> Should the JDDCP itself be included in the reference list?
>>
>> Experimental design
>>
>> Not applicable
>>
>> Validity of the findings
>>
>> Not applicable
>>
>> Comments for the author
>>
>> No comments - covered above.
>>
>> © 2014, PeerJ, Inc. PO Box 614 Corte Madera, CA 94976, USA
>>
>
----
Ivan Herman, W3C
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704