Hello,
Could anyone kindly clarify the following regarding the usage of the NLTK Boxer interface? (NLTK v3.0.5; Python 2.7.4)
1) The interpretation results differ with minor changes in punctuation. For example,
the DRS for
intr = myBoxer.interpret_multi(['Every man runs.', 'He also walks.'])
is
<DRS ([e8,x6],[n_male(x6), (([x2,x3],[n_runs.(x2), of(x2,x3), n_man(x3)]) -> ([],[a_topic(x2)])), r_also(e8), agent(e8,x6), v_walks.(e8)])>
Now, if we simply remove the periods (full stops) from the sentences,
intr = myBoxer.interpret_multi(['Every man runs', 'He also walks'])
it changes to
<DRS ([e8,x6],[n_male(x6), (([x2],[n_man(x2)]) -> ([e4],[agent(e4,x2), v_run(e4)])), r_also(e8), agent(e8,x6), v_walk(e8)])>
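As we see above, the two interpretations differ. With the periods, the predicates keep the trailing period (n_runs., v_walks.) and "runs." comes out as a noun predicate. Without the periods, the verbs are lemmatised (v_run, v_walk) and "run" gets an event with an agent. Is this expected? Which form is preferred?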
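In case it helps to reproduce this, here is a minimal sketch of the comparison. The helper names strip_final_period and compare_punctuation are only illustrative (they are not part of NLTK), and the sketch assumes that myBoxer is an already-configured Boxer instance whose interpret_multi() accepts a list of sentences, as in the calls above.

```python
def strip_final_period(sentence):
    """Return the sentence without a trailing full stop."""
    sentence = sentence.strip()
    if sentence.endswith('.'):
        return sentence[:-1].rstrip()
    return sentence


def compare_punctuation(interpret_multi, sentences):
    """Interpret the same discourse with and without sentence-final periods.

    interpret_multi is any callable that takes a list of sentences,
    for example myBoxer.interpret_multi (assumed, see the note above).
    Returns both results plus a flag saying whether their printed
    forms are identical.
    """
    with_periods = interpret_multi(sentences)
    without_periods = interpret_multi(
        [strip_final_period(s) for s in sentences])
    # Compare the printed DRS strings rather than the objects themselves.
    same = str(with_periods) == str(without_periods)
    return with_periods, without_periods, same
```

With the object from my example, this would be called as compare_punctuation(myBoxer.interpret_multi, ['Every man runs.', 'He also walks.']).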
2) Which is the better choice from an accuracy perspective:
- to use Boxer's interpret() with all sentences in a single string, or
- to use Boxer's interpret_multi() with the sentences as a list?
Thanks,
Regards,
Madhu.