Sure. By the way - in that script example 'newrevisionid' should be
used instead of 'oldrevisionid'. Apparently this is the one that
uniquely identifies the edit and can be used to generate the diff,
like: 'newrevisionid': '327485098', 'editid': '17799', 'class':
'regular', 'totalannotators': '7'.
http://en.wikipedia.org/w/index.php?diff=prev&oldid=327485098
http://en.wikipedia.org/w/index.php?diff=327485098&oldid=327480713
http://en.wikipedia.org/wiki/User_talk:X-N2O
http://en.wikipedia.org/wiki/Special:Contributions/X-N2O
BTW - looks like another link spam/false negative?
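
For reference, a rough Python 3 sketch of the script with 'newrevisionid' swapped in. It assumes edits.csv has 'editid' and 'newrevisionid' columns and gold-annotations.csv has 'editid' and 'class' columns; the function name, the DIFF_URL constant and the in-memory sample rows are only there for illustration (the sample values are taken from the record and links above, the column order is a guess).

```python
import csv
import io

# URL pattern taken from the diff link above; the revision id is filled in.
DIFF_URL = "http://en.wikipedia.org/w/index.php?diff=prev&oldid={}"


def label_revisions(edits_file, gold_file):
    """Yield (newrevisionid, class, diff_url) for each gold annotation."""
    # Map each edit id to the revision id that uniquely identifies the edit.
    edits = csv.DictReader(edits_file)
    rev_by_edit = {e["editid"]: e["newrevisionid"] for e in edits}
    for g in csv.DictReader(gold_file):
        rev = rev_by_edit[g["editid"]]
        yield rev, g["class"], DIFF_URL.format(rev)


# In-memory stand-ins for edits.csv and gold-annotations.csv.
# The column layout here is an assumption, not the corpus's documented schema.
sample_edits = io.StringIO(
    "editid,oldrevisionid,newrevisionid\n"
    "17799,327480713,327485098\n"
)
sample_gold = io.StringIO(
    "editid,class,totalannotators\n"
    "17799,regular,7\n"
)

rows = list(label_revisions(sample_edits, sample_gold))
for rev, cls, url in rows:
    print(rev, cls, url)
```

With the real corpus files, pass open("edits.csv", newline="") and open("gold-annotations.csv", newline="") instead of the StringIO objects.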
-- Dmitry
On May 6, 1:01 am, Martin Potthast <martin.potth...@uni-weimar.de> wrote:
> Good contribution, Dmitry!
> This code snippet shows how simple it is to parse the PAN-WVC-10.
>
> If you don't mind, I'll put this in the readme of the final version.
>
> Best,
> Martin
>
> On Thu, May 6, 2010 at 4:18 AM, dmtr <dchich...@gmail.com> wrote:
> > import csv
> > edits = csv.DictReader(open("edits.csv"))
> > gold = csv.DictReader(open("gold-annotations.csv"))
> > d = dict([(e['editid'], e['oldrevisionid']) for e in edits])
>
> > for g in gold: print d[g['editid']], g['class']
>
>
> --
> Martin Potthast
> Bauhaus-Universität Weimar
> www.webis.de --- www.netspeak.cc