Why I wrote scirate and why I want to think about a broader vision

Dave Bacon

Oct 15, 2011, 10:46:13 PM

to Scirate

At the time, Digg was actually a pretty useful resource. There were
some Digg-like open-source projects out there, but they were going to
require a lot of effort to adapt to the arxiv. At conferences
everyone would talk about how it would be nice to have something like
Digg for the arxiv. At the time I also wanted to learn PHP, Ajax,
MySQL, and how a web 2.0 website worked.

But most importantly, Scirate was about solving one of my own
problems. Every day as a researcher I would wake up and read through
the abstracts of quant-ph. And truthfully a lot of it is garbage (no
claim to sainthood here; I contributed my own share over the decade).
Wouldn't it be nice if there was a way to filter out the junky stuff?
Hey: can't I use my friends to do this? Scirate was essentially a way
to crowdsource a task I performed every day :)

Interestingly, I wasn't really thinking about the scite as anything
more than a sort of "hey you might want to read this," and truthfully I
think I've gotten more from papers with only a few scites... I would
have missed these otherwise. So I never thought much of scirate as a
beauty contest or measure of quality.

So I guess what I'm thinking about now is what problem needs to be
solved. Would having a friend structure allow the crowdsourcing to
grow to more than just the quant-ph community? We know how to do the
crowdsourcing of our daily quant-ph reading; what other problem can
we solve?

Niel de Beaudrap

Oct 16, 2011, 6:52:32 AM

to Scirate

One of the things I think that a new version of scirate could do is
add high-level meta-data.

For instance: papers often fall into something like lineages, and have
family resemblances with other papers. These papers might be in the
bibliography, or not; and might be forward references; but in any case
would be papers judged to be close by in "paper space".
Being able to represent that a paper is a follow-up on the subject
matter of a second paper, and pursues a complementary question to a
third paper, etc. is something that could be done with a sort of
'voting' system on connections, which may or may not be made to scale
positively with proximity of the sciters' own papers in the article
digraph (i.e. their likely expertise).
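To make the 'voting on connections' idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration, not a description of scirate: the paper labels, the `1 / (1 + hops)` weighting, the 0.1 floor for voters with no papers in the graph, and treating citation edges as undirected when measuring proximity.

```python
from collections import defaultdict, deque

def build_neighbors(citations):
    """Turn (citing, cited) pairs into an undirected adjacency map."""
    neighbors = defaultdict(set)
    for citing, cited in citations:
        neighbors[citing].add(cited)
        neighbors[cited].add(citing)
    return neighbors

def hops(neighbors, sources, targets):
    """Fewest hops from any source paper to any target paper (BFS), or None."""
    targets = set(targets)
    seen = set(sources)
    queue = deque((paper, 0) for paper in seen)
    while queue:
        paper, dist = queue.popleft()
        if paper in targets:
            return dist
        for nxt in neighbors.get(paper, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def vote_weight(neighbors, voter_papers, link):
    """Assumed weighting: closer voters count more; unreachable voters get 0.1."""
    dist = hops(neighbors, voter_papers, link[:2])
    return 0.1 if dist is None else 1.0 / (1.0 + dist)

def tally(neighbors, votes, authored):
    """Sum weighted votes per (paper_a, paper_b, relation) link."""
    scores = defaultdict(float)
    for voter, link in votes:
        scores[link] += vote_weight(neighbors, authored.get(voter, []), link)
    return dict(scores)

# Hypothetical data: B and C both cite A, and D cites C.
neighbors = build_neighbors([("B", "A"), ("C", "A"), ("D", "C")])
authored = {"alice": ["B"], "bob": ["D"], "carol": []}
link = ("B", "C", "complementary")
votes = [("alice", link), ("bob", link), ("carol", link)]
scores = tally(neighbors, votes, authored)
```

In this toy data alice wrote one of the linked papers, bob's paper is one hop away, and carol has no papers in the graph, so their votes on the same link carry decreasing weight. Whether weighting by proximity is desirable at all is the open question raised above; setting every weight to 1 recovers a plain vote count.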

Sean Barrett

Oct 17, 2011, 11:59:05 AM

to Scirate

Hi Dave,

Thanks for kicking off these questions, and congrats on the original
scirate.

I found the site was useful for a few things - one was that you
could get the abstracts for yesterday's "new" papers easily (which the
arxiv itself doesn't seem to provide), so in that respect it was useful
for catching up on a week's papers. In addition I did occasionally see
things on there that I might have missed from directly skimming the
arxiv. Finally, when discussions did occasionally kick off, they were
(more often than not) highly insightful.

One thing I didn't like about the site was the "popularity contest"
aspect of it - even if this wasn't your aim, it may have been an
indirect side effect of the design. There are well-documented,
statistically significant "positional effects" on the arxiv listing on
subsequent citation (Paul Ginsparg has written about these), and one
effect of a site like scirate might be to amplify these effects: if a
few papers float to the top of the listings early on, they may be more
likely to get more "scites" and thereafter (if the site succeeds in
being a popular front-end for the arxiv), more citations. It's also
possible there are additional effects due to "groupthink," although I
don't know of any systematic evidence for this. One way of thinking
about these biases is that they are actually diminishing the value
(i.e. information content) of votes that you collect after the first
few scites have been given out and the initial ranking has been
established.

A compounding factor is that people may have vastly different
thresholds for when a paper is worth a scite. Aram has mentioned
elsewhere that he used scirate to compile a "reading list," for
himself, of things he thought looked interesting (presumably based on
the title, authors, and abstract), whereas others might be basing
their vote on having read the article. I fully agree with Aram on his
point that somehow this needs to be accounted for.

The papers I really want to know about are those in the "long tail" -
e.g. those that might overlap with my particular weird combination of
interests but that aren't obviously part of the mainstream of quantum
info. I'd also like to know about interesting papers from authors I
haven't heard of, either because they primarily work in a different
field, or because they're new in ours.

With the above in mind, here are some suggestions for how things could
be changed in scirate 2.0.

- Hide the votes/ranking for some period of time (e.g. 24-48 hours)
after each posting. This way you force your users to work a bit
harder on working out which papers to vote for, thereby increasing the
value of the votes.

- Consider not displaying the absolute number of votes at all, but
just rank the papers.

- Have a more nuanced way of collecting ranking information e.g. if
people just vote on a number of papers, that should count for less
than if they download the pdfs. (Ideally you would want to know how
many times, and for how long, people look at the paper, à la last.fm,
but that might be overly ambitious). If someone downloads a pdf and
subsequently upvotes/downvotes a paper, perhaps that could count
more.

- The comments were great, and if Web 2.0 should be encouraging
anything, it should be more open discussion of scientific results.
Therefore in the ranking system, maximum points could be given for
someone who leaves a substantive comment on a paper.

- Whatever algorithm you settle on, keep it secret (and maybe keep
tweaking it) in order to discourage people from gaming it.

- A better recommendation engine.
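As a rough illustration of how a few of the suggestions above might fit together (a hidden period, rank-only display, and heavier weight for votes backed by a download or a comment), here is a minimal Python sketch. The weights, the 48-hour window, the `Signal` record, and the paper ids are all made up for the example, and deciding whether a comment is "substantive" is assumed to happen elsewhere.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical weights: a bare vote counts least, a comment counts most.
WEIGHTS = {"vote": 1.0, "vote_after_download": 2.0, "comment": 3.0}
HIDE_FOR = timedelta(hours=48)  # assumed hidden period after posting

@dataclass
class Signal:
    """One user's engagement with one paper."""
    paper: str
    user: str
    voted: bool = False
    downloaded: bool = False
    commented: bool = False  # assumed to mean a substantive comment

def signal_score(s):
    """Score a signal by its strongest form of engagement only."""
    if s.commented:
        return WEIGHTS["comment"]
    if s.voted and s.downloaded:
        return WEIGHTS["vote_after_download"]
    if s.voted:
        return WEIGHTS["vote"]
    return 0.0

def ranking(signals, posted_at, now):
    """Return paper ids best first, with no totals exposed.

    Papers posted less than HIDE_FOR ago are left out entirely.
    """
    totals = {}
    for s in signals:
        totals[s.paper] = totals.get(s.paper, 0.0) + signal_score(s)
    visible = [p for p in totals if now - posted_at[p] >= HIDE_FOR]
    return sorted(visible, key=lambda p: (-totals[p], p))

# Made-up example data.
now = datetime(2011, 10, 17, 12, 0)
posted_at = {
    "paper-1": now - timedelta(hours=72),
    "paper-2": now - timedelta(hours=72),
    "paper-3": now - timedelta(hours=12),  # still inside the hidden period
}
signals = [
    Signal("paper-1", "u1", voted=True),
    Signal("paper-1", "u2", voted=True),
    Signal("paper-2", "u3", commented=True),
    Signal("paper-3", "u4", voted=True, downloaded=True),
]
ranked = ranking(signals, posted_at, now)
```

Here one commented paper outranks a paper with two bare votes, and the 12-hour-old paper does not appear yet. This sketch ignores downvotes and reading time, and a fixed public weighting like this would be exactly what the "keep it secret" suggestion argues against.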
