altmetrics standards events - your input requested


Christopher Leonard

Jul 3, 2012, 3:49:55 AM
to altme...@googlegroups.com
Dear all,

At the recent altmetrics meeting, one of the breakout groups looked at the issue of standards in the altmetrics world.

While I think we all agreed that a light touch is what this developing area needs, it would nevertheless be advantageous, as a first measure, to list all the 'events' that may occur when looking at alternative measures of impact, and at least standardize what we all mean by a 'Mendeley like' or a 'blog mention'.

I started a spreadsheet which has been pre-populated with some of those events listed in Jason and Heather's paper:

As well as a description of each event, it is also good to list the minimum set of metadata fields required to uniquely identify that particular event.
Take a look at the spreadsheet; it's more intuitive than I'm making it sound here:

You will notice many of the fields are still empty, awaiting experts like yourselves to fill them in.
Also, feel free to list additional events; I've already had a request to add link resolver usage events.


Chris

-- 
Dr C J Leonard
Editorial Director
QScience.com - A Member of Qatar Foundation
Tornado Tower, Floor 11, PO Box 5825
Doha, Qatar

skype: chrisle1972

Martin Fenner

Jul 3, 2012, 4:28:49 AM
to altme...@googlegroups.com
Chris,

Thanks for starting the spreadsheet. I will add information to it, but I want to make two comments here.

First, it would be great if the altmetrics community could agree on naming conventions. PLoS uses "source" for the service providing an API that can be called (Mendeley, Twitter, etc.) and "event" for a single thing relating to an article (or other "object" we collect metrics about). I like the term "provider" to describe what PLoS and other altmetrics services are doing. In other words:

A provider contacts a source to retrieve events about an object.

There are other ways to describe these relationships, but I think this is a start. 

Secondly, it is important to note that not all sources provide events, for one of two reasons: a) licensing and b) privacy. An example of the former is citation counts from Scopus, Web of Science and partly CrossRef; an example of the latter is usage data, where it might not be a good idea to provide the full requesting IP address for every usage event in an open API. How best to provide usage events is probably a discussion in itself.

I suggest that we add a second sheet to the spreadsheet listing sources that don't provide events. These sources provide counts, a timestamp, and optionally a landing page. PDF downloads, HTML pageviews, Scopus citations and Mendeley readers are examples.
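
To make the vocabulary concrete, here is a minimal sketch in Python of how a provider might model the two kinds of sources. The type names and fields are my own illustration, not PLoS's actual schema:

from dataclasses import dataclass
from typing import Optional

@dataclass
class Event:
    """A single thing a source reports about an object, e.g. one tweet."""
    source: str        # the service called, e.g. "twitter" or "mendeley"
    object_id: str     # the article or other object, e.g. a DOI
    event_id: str      # source-specific identifier for this event
    timestamp: str     # when the event happened

@dataclass
class CountSnapshot:
    """What a count-only source (Scopus citations, PDF downloads) returns."""
    source: str
    object_id: str
    count: int
    timestamp: str                      # when the count was retrieved
    landing_page: Optional[str] = None  # optional link back to the source

def retrieve_events(source: str, object_id: str) -> list[Event]:
    """'A provider contacts a source to retrieve events about an object.'"""
    ...  # placeholder for the actual API call to the source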

Best,

Martin

Mike Taylor

Jul 3, 2012, 9:31:58 AM
to altme...@googlegroups.com
I like the controlled vocabulary for describing events; thanks, Martin.

Regarding the second-sheet idea, we should continue to list all possible event providers (including WoS, Scopus, etc.). It is possible that we will see a change in these corporations' policies regarding this usage, so we shouldn't rule it out. It is also possible that we will be able to use this data when researching altmetrics, so we should include it, albeit marked by its limitations. Privacy, though, is a different issue.

Regarding the point about what an altmetrics event means: absolutely, we need this research. Clearly it is highly variant, both across disciplines (where we know behaviour is different) and temporally. I doubt that Pinterest is relevant yet, but you never know ;-) What is interesting is what "meaning" might translate to (a point well made by Kelli), and this requires some thought. I would be interested in modelling altmetric impact against citation count, both in terms of strength and degree of orthogonality: partly for its own sake, but also in an attempt to make robust predictions of citation. In terms of gaining wider traction in the scholarly community, being able to express alt-impact in relation to traditional citation in a normalized manner would make it more digestible and comprehensible.

(The other issue I have floating in my head relates to breaking down citations by type. There are plenty of projects that have working sentiment analysis code; I wonder if we should throw this work into the altmetrics mix (I guess this is more like extended metrics). But it's something we can consider.)

Mike

Fred Zimmerman

Jul 3, 2012, 9:51:51 AM
to altme...@googlegroups.com, Peter Jones
I added a line to represent the conceptual work that Pete Jones and I have been doing on enhanced science citations.  In a nutshell, we are looking at alt.metrics with a spin from the legal world, where Shepard's Citations has for decades provided citation signals that speak to the substance of the citing language -- e.g. a cited case is "abrogated", "distinguished from", "followed", and so on.  We are wondering whether this approach could be brought to the world of scientific citation, with general citation patterns such as "questions the methodology of X", "improves statistical methods", etc., paired with domain-specific patterns that reflect recurring themes in those articles.  Both LexisNexis and Westlaw also provide "headnotes", or concise topical summaries of the main issues discussed in the paper -- individual topical summaries, not a unitary abstract.

As a general comment, being new to this group, I find myself wondering whether it is a mistake to define the activity as alt.metrics -- what can we measure -- as opposed to alt.discovery -- what can we enable.

Fred Zimmerman
ISciences/Nimble Books

The posts at http://www.quora.com/science-signals provide some background.

Mike Taylor

Jul 3, 2012, 9:59:26 AM
to altme...@googlegroups.com, Peter Jones
Classification of citation type is precisely what I was referring to; generally it seems to be known as sentiment analysis. I have played with some code written by Agnes Sandor of Xerox Research that seemed to work extremely well at automatically classifying citing sentences into one of five or six classes of citation (builds upon, contradicts, supports, etc.). Her software is tuned to (if memory serves) the life sciences. If there is interest, I can get up to speed and circulate. David Shotton has also developed a very complex taxonomy of citation types.
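
For anyone unfamiliar with what such a classifier does, here is a purely illustrative sketch. It is not Agnes Sandor's code (which I have not seen), and real systems use trained NLP models rather than a hand-coded keyword lookup like this:

# Hypothetical cue phrases; a real classifier is trained, not hand-written.
CUE_PHRASES = {
    "builds_upon": ["building on", "extends the work of", "based on the approach"],
    "contradicts": ["in contrast to", "contrary to", "fails to replicate"],
    "supports":    ["consistent with", "confirms", "in agreement with"],
}

def classify_citing_sentence(sentence: str) -> str:
    """Assign a citing sentence to one of a handful of citation classes."""
    lowered = sentence.lower()
    for label, cues in CUE_PHRASES.items():
        if any(cue in lowered for cue in cues):
            return label
    return "neutral_mention"  # fallback when no cue phrase matches

print(classify_citing_sentence(
    "Building on Smith et al. (2010), we extend the model to two dimensions."))
# -> builds_upon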

Mike

Fred Zimmerman

Jul 3, 2012, 10:04:20 AM
to altme...@googlegroups.com, Peter Jones
Yes, I would certainly appreciate that info being circulated.

The experience at LexisNexis and Westlaw was that citation classification required a combination of algorithms and (hundreds of) human editors. We are wondering whether the advent of massive crowdsourcing and social media has changed the equation so that fewer paid humans are required. ;-)

Martin Fenner

Jul 3, 2012, 10:13:27 AM
to altme...@googlegroups.com
As Mike mentioned, David Shotton has developed the Citation Typing Ontology (CiTO): http://speroni.web.cs.unibo.it/cgi-bin/lode/req.py?req=http:/purl.org/spar/cito

Euan Adie and I have used a subset of CiTO in the CrowdoMeter project (http://crowdometer.org), crowdsourcing the semantics of about 500 tweets linking to scholarly papers. We used the following CiTO statements:

cito:agreesWith
Agrees with statements, ideas or conclusions presented in the cited paper. Positive sentiment, e.g. "great work!".

cito:discusses
Discusses statements, ideas or conclusions presented in the cited paper. Neutral or simply mentioning an article.

cito:disagreesWith
Disagrees with statements, ideas or conclusions presented in the cited paper. Negative sentiment, e.g. "ridiculous paper".

cito:sharesAuthorWith
Written by an author or publisher of the cited paper. The tweeter was involved with the article mentioned. Don't include RTs by others.

cito:usesMethodIn
Describes a method detailed in the cited paper. Also description of study subjects, e.g. "A study of 10,000 runners found".

cito:usesDataFrom
Uses data presented in the cited paper. This should usually be actual numbers, e.g. "smokers die 25 years sooner".

cito:usesConclusionsFrom
Describes conclusions presented in the cited paper, e.g. "humanity is getting less violent". 
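
As an illustration of what one of these statements looks like as machine-readable data, here is a minimal sketch using Python's rdflib; the tweet URL and DOI are made up:

from rdflib import Graph, Namespace, URIRef

CITO = Namespace("http://purl.org/spar/cito/")

g = Graph()
g.bind("cito", CITO)

tweet = URIRef("https://twitter.com/example/status/123456789")  # hypothetical
paper = URIRef("https://doi.org/10.1371/journal.pone.0000000")  # hypothetical

# Assert that the tweet agrees with the cited paper (positive sentiment).
g.add((tweet, CITO.agreesWith, paper))

print(g.serialize(format="turtle"))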

Best,

Martin

Fred Zimmerman

Jul 3, 2012, 12:44:15 PM
to altme...@googlegroups.com, Peter Jones
Full list of Shepard's signals: https://web.lexis.com/help/research/shepeditorialmappings.htm


Visual example of how to use Shepard's:


These are all embedded within research/discovery task flow.

-----------------------------------------------------
Subscribe to the Nimble Books Mailing List  http://eepurl.com/czS- for monthly updates

Martin Fenner

Jul 3, 2012, 1:13:10 PM
to altme...@googlegroups.com
Mike,

I've taken your suggestion and kept those events that are currently not openly available for licensing reasons. I've also simplified the table a bit by sorting the metadata into four standard metadata fields for altmetrics events:

event identifier
author identifier(s)
timestamp
content
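
To illustrate, here is a sketch of how a raw payload from one source might map onto those four fields. The Twitter field names assume the v1.1 API payload shape, and the function name is my own:

def normalize_tweet(raw: dict) -> dict:
    """Map a raw tweet payload (v1.1 API shape assumed) to the four fields."""
    return {
        "event_id": raw["id_str"],                   # event identifier
        "author_ids": [raw["user"]["screen_name"]],  # author identifier(s)
        "timestamp": raw["created_at"],              # timestamp
        "content": raw["text"],                      # content
    }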

Adding CiTO information would be marvelous.

Best,

Martin

Jason Priem

Jul 3, 2012, 4:52:43 PM
to altme...@googlegroups.com
I'm glad to see a discussion of typed citations happening here. It's one of the first questions I get when I discuss altmetrics.

Citation typing, or semantic citation, is a bit different from a lot of the other work being done under the "altmetrics" banner, in that there is a very long history of work in this area (unlike, say, mining Twitter citations), from a number of different directions.

Toulmin's model of argument, which establishes a taxonomy of link types (refutes, agrees, etc.), dates back to the late '50s and has, I think, been used a lot in legal education for analyzing decisions. Ted Nelson was working on typed hyperlinks in the '60s with Xanadu, and more recently the "argument interchange format" has been proposed to link networks of claims and counterclaims across the Web. The Hypothes.is project is working to do the kind of crowdsourced annotation of arguments that Fred mentions, and while it's IMHO kind of a long shot, it has plenty of potential and awesome implications.  Wikimedia's Wikicite project, another crowdsourced citation database, may or may not support typed citations, but it certainly could.

In the academic sphere, Simon Buckingham-Shum has been working on ontologies of scholarly argument for decades, and has created the ScholOnto ontology as well as many annotation tools. Lighter-weight attempts have included things like Tree Trellis and Project Archelogos, which used simpler ontologies, sacrificing descriptive power for simplicity and ease of use.

This latter feature is a key concern in these typing systems, because one of the biggest impediments to their use is what Buckingham-Shum calls the "capture bottleneck": the difficulty of converting (or convincing authors to convert) unstructured text into semantic markup. Indeed, this is a problem for the whole semantic-web enterprise (as Cory Doctorow notes in his wonderfully prescient 2001 essay "metacrap": http://www.well.com/~doctorow/metacrap.htm#2.2). Crowdsourcing may be one answer, but it's far from a magic bullet (it's a naturalistic bullet?); projects for crowdsourced annotation have certainly failed more often than they've succeeded.

Sorry if this rambles a bit, or if you're familiar with all these already. I think it's worth getting out there, because discussions of typed citations often seem to lack some of the historical and cross-domain contexts that they'd really benefit from; I don't think it's always appreciated that this is a Hard Problem.

That said, it's also clearly one that's worth working on, given the potential payoffs. Vannevar Bush imagined the Memex as being able to traverse and navigate webs of ideas in a much more nuanced way than we can with the current Web...it would be awesome if we could finally make it happen.

Jason Priem
UNC Royster Scholar
School of Information and Library Science
University of North Carolina at Chapel Hill

J Britt Holbrook

Jul 3, 2012, 6:02:07 PM
to altme...@googlegroups.com
Thanks for this, Jason. I find this invaluable in helping me to follow along.

Ian Mulvany

Jul 13, 2012, 4:42:41 AM
to altme...@googlegroups.com
Also of interest in terms of citation typing: the work that David Shotton has done has converged with work on Semantic Web Applications in Neuromedicine (SWAN), leading to the following proposal a few years ago to merge this typing ontology with an ontology for blogs: http://www.w3.org/TR/hcls-swansioc/

The SWAN ontology has been applied in the field of Alzheimer's research to create a browser for navigating hypotheses (http://hypothesis.alzforum.org/swan/do!getHome.action), based on papers that are marked up by their authors.

While I was at Mendeley we were awarded a grant to build out some infrastructure to support crowdsourced annotations that could work along these lines, but my understanding is that the project is not due to kick off until late 2012 or early 2013.

Not only is this a hard problem, it's a hard problem with many simultaneous vectors of attack. With my ALM hat on, I feel the appropriate approach for the ALM community is to build tools that can easily tap into the outputs of these kinds of projects and tie such signals to identifiers. I don't think this community should try to solve this issue itself, though I'm sure that many of the people in this community may end up doing so with a different hat on.

- Ian

Dario Taraborelli

Jul 13, 2012, 10:32:23 AM
to altme...@googlegroups.com
That was a grant proposal that I co-authored with Jason Hoyt and submitted as a co-PI to JISC while I was at Surrey. It didn't get funded at the time. I'm excited to hear it eventually got funded, but I was not aware it had been resubmitted; who should I ask to learn more?
