Here is the word alignment interface I used to collect word
alignment data:
http://alt-aligner.appspot.com
It takes five URL parameters:
srcsent, tgtsent : the source and target sentences, as space-separated
strings, UTF-8 and URL encoded.
idxonsrc : "true" or "false"; whether the words to label are in the
source sentence ("true") or the target sentence ("false").
idx : zero-based indices of the words you want users to label,
underscore separated. E.g. 0_1_3_5 means you want users to label the
first, second, fourth and sixth words.
assignmentId : used by MTurk; appended automatically.
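As a rough illustration of how these parameters fit together, here is a minimal Python sketch that builds a task URL. The function name build_aligner_url and its arguments are made up for this example, and the check that idx refers to the side selected by idxonsrc is my reading of the description above, not something confirmed by the tool itself. assignmentId is left out because MTurk appends it.

```python
from urllib.parse import urlencode

# Base address taken from the post; the trailing "/?" layout is an assumption.
BASE_URL = "http://alt-aligner.appspot.com/"


def build_aligner_url(src_words, tgt_words, label_indices, label_source=True):
    """Build a task URL from tokenized sentences and zero-based word indices.

    Hypothetical helper: the name and signature are not part of the tool.
    """
    # Assumption: idx indexes into whichever sentence idxonsrc selects.
    side = src_words if label_source else tgt_words
    for i in label_indices:
        if not 0 <= i < len(side):
            raise ValueError(f"index {i} is outside the sentence to label")

    params = {
        "srcsent": " ".join(src_words),   # space-separated words
        "tgtsent": " ".join(tgt_words),
        "idxonsrc": "true" if label_source else "false",
        "idx": "_".join(str(i) for i in label_indices),  # e.g. 0_1_3_5
    }
    # urlencode encodes text as UTF-8 and turns spaces into "+".
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_aligner_url(["这些", "被"], ["these", "are"], [0, 1]))
```

The standard-library urlencode takes care of both the UTF-8 encoding and the URL escaping, so the sentences can be passed in as ordinary strings.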
A sample URL:
http://alt-aligner.appspot.com/?srcsent=%E8%BF%99%E4%BA%9B+%E8%A2%AB+...
When submit is clicked, it posts back a result like this:
1-5 2-4 3-3 4-6 5-7 6-7 7-0 8-10 9-11 9-12 9-13 9-14 9-15 9-16 10-0 11-8
In each pair the first index is the source word and the second is the
target word. Both are 1-based; zero means the word is aligned to the
null word. Note that this differs from idx, which is zero-based.
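For anyone post-processing the results, here is a small Python sketch that parses the posted-back string into index pairs. The function names parse_alignment and to_zero_based are invented for this example, and using None to stand for the null word is just one possible convention.

```python
def parse_alignment(result):
    """Parse a string like "1-5 2-4 7-0" into (source, target) index pairs.

    Indices are kept exactly as posted: 1-based, with 0 meaning the null word.
    """
    links = []
    for token in result.split():
        src, tgt = token.split("-")
        links.append((int(src), int(tgt)))
    return links


def to_zero_based(links):
    """Shift 1-based indices to zero-based, mapping the null word (0) to None."""
    def shift(i):
        return i - 1 if i > 0 else None

    return [(shift(s), shift(t)) for s, t in links]


if __name__ == "__main__":
    posted = ("1-5 2-4 3-3 4-6 5-7 6-7 7-0 8-10 9-11 9-12 "
              "9-13 9-14 9-15 9-16 10-0 11-8")
    links = parse_alignment(posted)
    print(links)
    print(to_zero_based(links))
```

Converting to zero-based indices is optional, but it keeps the output on the same footing as the idx parameter.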
Hope it helps.
Qin
http://code.google.com/p/alt-aligner/
The whole Eclipse project is in the SVN trunk, and you can publish it
directly to appspot with the GWT Eclipse plugin.
On Feb 22, 10:00 am, Edward Gao <q...@cs.cmu.edu> wrote:
> A sample URL:
>
> http://alt-aligner.appspot.com/?srcsent=%E8%BF%99%E4%BA%9B+%E8%A2%AB+...