Re: Abridged summary of - 1 update in 1 topic

Aleksandar Petrovski

Jun 6, 2016, 2:58:43 AM
Hello, thank you for responding promptly.

1. wc counts 2 lines in file c, but if I delete the Њ character, it counts 1 line. Does that mean wc counts Њ as an LF?

2. Could you recommend a script for pre-processing the corpora?

3. If I run pialign using corpora a and b, I get output that is not what I expected. The files are attached.
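
One possible explanation for point 1, offered as a sketch rather than a diagnosis: Њ is code point U+040A, so in UTF-16 its two bytes are 0x04 and 0x0A, and 0x0A is the LF byte that `wc -l` counts. In UTF-8 the same character is 0xD0 0x8A and contains no LF byte. The example below assumes file c is stored as UTF-16, which this thread does not confirm; `count_lf_bytes` is a hypothetical helper that mimics the byte counting done by `wc -l`.

```python
def count_lf_bytes(data: bytes) -> int:
    """Count 0x0A bytes, which is what `wc -l` counts."""
    return data.count(b"\n")


# One line of text: the character U+040A followed by a line feed.
line = "\u040a\n"  # "Њ\n"

utf8 = line.encode("utf-8")        # bytes: D0 8A 0A
utf16 = line.encode("utf-16-le")   # bytes: 0A 04 0A 00 (assumed encoding of file c)

print(count_lf_bytes(utf8))   # 1
print(count_lf_bytes(utf16))  # 2
```

If this is the cause, converting the corpus to UTF-8 before counting lines or aligning should remove the extra "line".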


On Sun, Jun 5, 2016 at 12:17 AM, <> wrote:
Graham Neubig <>: Jun 04 10:04PM +0900

Your files have a different number of lines, as evidenced by the results of
"wc" below. pialign expects that both files have the same number of lines:
$ wc -l c d
2 c ...more
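
The same check as the `wc -l` comparison above can be scripted before running an aligner. This is a minimal sketch: `count_lines` and `line_counts_match` are hypothetical helpers, not part of pialign, and the temporary file names and contents are made up for the demonstration.

```python
import os
import tempfile


def count_lines(path):
    """Count LF bytes in a file, mirroring what `wc -l` reports."""
    with open(path, "rb") as f:
        return f.read().count(b"\n")


def line_counts_match(src_path, trg_path):
    """Return True if both sides of a parallel corpus have the same number of lines."""
    return count_lines(src_path) == count_lines(trg_path)


# Demonstration on two small temporary files (illustrative data only).
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.txt")
    trg = os.path.join(tmp, "trg.txt")
    with open(src, "wb") as f:
        f.write("one\ntwo\n".encode("utf-8"))
    with open(trg, "wb") as f:
        f.write("eden\n".encode("utf-8"))
    counts = (count_lines(src), count_lines(trg))
    match = line_counts_match(src, trg)

print(counts)  # (2, 1)
print(match)   # False
```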