Training Semafor - Alphabet Creation Step

kanan...@berkeley.edu

May 10, 2016, 4:46:41 PM

to semafor-users

Hi, I ran into a problem while trying to retrain SEMAFOR. I am using the version available here: https://github.com/Noahs-ARK/semafor/

When running 3_1_idCreateAlphabet.sh, I get a NumberFormatException for the input string "3:4".

I am using the naacl2012 splits; this string is one of the role span pairs in cv.train.sentences.frame.elements.
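For readers hitting the same exception, here is a minimal sketch of why a field like "3:4" fails integer parsing and how a span could be split before parsing. This is not SEMAFOR's code: the class name `SpanParseSketch`, the helper `parseSpan`, and the reading of "3:4" as a start:end token pair are all assumptions made for illustration.

```java
import java.util.Arrays;

public class SpanParseSketch {
    // Hypothetical helper: parses "3:4" into {3, 4}; a bare index "3" becomes {3, 3}.
    // Assumes the colon separates a start and an end token index.
    public static int[] parseSpan(String field) {
        String[] parts = field.split(":");
        int start = Integer.parseInt(parts[0]);
        int end = parts.length > 1 ? Integer.parseInt(parts[1]) : start;
        return new int[] {start, end};
    }

    public static void main(String[] args) {
        try {
            // Parsing the whole field as one integer fails because of the colon.
            Integer.parseInt("3:4");
        } catch (NumberFormatException e) {
            System.out.println("parseInt failed: " + e.getMessage());
        }
        // Splitting on the colon first yields two parseable integers.
        System.out.println(Arrays.toString(parseSpan("3:4")));
        System.out.println(Arrays.toString(parseSpan("3")));
    }
}
```

If the exception comes from code that expected a different column at that position, splitting the span would only hide the real problem, so it is worth checking which field is actually being read.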

Is this step assuming the data will be formatted differently? The training/data/README describes the data as it appears in the naacl2012 directory, but the colon seems to be causing problems here.

I tried editing cv.train.sentences.frame.elements to include only the first token rather than a span (so 3 instead of 3:4), just to see if it would run that way, but this produces a different error: IndexOutOfBoundsException: index (2) must be less than size (1).

Thanks in advance!

ju...@calabs.ca

Sep 14, 2016, 1:31:25 PM

to semafor-users, kanan...@berkeley.edu

Hi,

I'm running into the same problem. Did you find a solution for it?

Thank you in advance!

giancarl...@gmail.com

Oct 4, 2016, 10:50:21 PM

to semafor-users, kanan...@berkeley.edu

It looks like the method "processLine" throws out the first two fields in toks but then does not adjust the indices in the tokens.get(i) calls. So you could either remove the ".sublist(2,toks.length)" or adjust the indices in the tokens.get(i) calls.
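The index mismatch described above can be reproduced in isolation. The sketch below is not SEMAFOR's "processLine"; the class `SubListOffsetSketch`, its method names, and the sample field values are hypothetical, and it only assumes that two leading fields are dropped before the remaining ones are read by position.

```java
import java.util.Arrays;
import java.util.List;

public class SubListOffsetSketch {
    // Number of leading fields dropped, as described in the post above.
    static final int DROPPED = 2;

    // Mimics the described bug: drop the leading fields, then read with the ORIGINAL index.
    public static String readUnadjusted(List<String> fields, int originalIndex) {
        List<String> tokens = fields.subList(DROPPED, fields.size());
        return tokens.get(originalIndex);
    }

    // One possible fix: shift the index by the number of dropped fields.
    public static String readAdjusted(List<String> fields, int originalIndex) {
        List<String> tokens = fields.subList(DROPPED, fields.size());
        return tokens.get(originalIndex - DROPPED);
    }

    public static void main(String[] args) {
        // Hypothetical line with three fields; only one remains after dropping two.
        List<String> fields = Arrays.asList("f0", "f1", "f2");
        System.out.println(readAdjusted(fields, 2));
        try {
            readUnadjusted(fields, 2);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unadjusted read failed: " + e.getMessage());
        }
    }
}
```

The other option mentioned above, keeping the full list and leaving the indices alone, avoids the offset arithmetic entirely. Which one is right depends on what the rest of the method expects.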

Hope this helps!

nandi...@gmail.com

Mar 3, 2017, 7:23:44 PM

to semafor-users, kanan...@berkeley.edu

Hello,

Where did you find cv.train.sentences.frame.elements?
