PREPOST


Brian MacWhinney

Sep 20, 2018, 5:37:51 PM
to ChiBolts
Dear ChiBolts,
     To further improve the accuracy of the POST (part-of-speech tagger) program for English and other languages, we have added a new facility called PREPOST.  This facility supplements the operation of POSTMORTEM.  Unlike POSTMORTEM, it runs right at the end of MOR, when ambiguities have not yet been resolved.  This provides additional flexibility for correcting errors in difficult choices such as that between the auxiliary and the copula in English.  Here is the new description of PREPOST from the manual:

The PREPOST program provides a second method for resolving ambiguities that are not fully handled by POST.  Because it runs before POST, PREPOST has full access to the ambiguous structures created by MOR.  This means that it can refer to ambiguities in full detail to create a more specific environment for disambiguation.  This is particularly useful in resolving the ambiguity between the copula and auxiliary in English, as illustrated in these PREPOST rules:

cop|*^aux|* neg|not adj|* => cop|* neg|not adj|*

cop|*^aux|* det:num|* => cop|* det:num|*

In the first rule, the copula is chosen in a sequence such as "is not fun".  In the second rule, the copula is chosen in a sequence such as "is five" in a sentence such as "Tim is five years old".  Care must be taken in formulating PREPOST rules to avoid overgeneralization to incorrect cases.
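To make the rule format concrete, here is a minimal Python sketch of how a rule of this shape could be applied to a sequence of MOR analyses. It is a toy model, not CLAN's implementation: the function names (`parse`, `item_matches`, `resolve`, `apply_rule`) are hypothetical, the tokens are simplified to `pos|stem` readings joined by `^`, and it assumes that readings are compared in the order written and that each item must match exactly.

```python
def parse(item):
    """Split 'cop|be^aux|be' into [('cop', 'be'), ('aux', 'be')]."""
    return [tuple(alt.split("|", 1)) for alt in item.split("^")]


def item_matches(pattern, token):
    """Assumed exact match: same readings in the same order; '*' matches any stem."""
    p, t = parse(pattern), parse(token)
    if len(p) != len(t):
        return False
    return all(pp == tp and ps in ("*", ts) for (pp, ps), (tp, ts) in zip(p, t))


def resolve(pattern, token):
    """Keep only the readings of token whose POS is named in the output pattern."""
    wanted = [pos for pos, _ in parse(pattern)]
    kept = [alt for alt in token.split("^") if alt.split("|", 1)[0] in wanted]
    return "^".join(kept)


def apply_rule(rule, tokens):
    """Apply one 'left => right' rule at every position where the left side matches."""
    lhs, rhs = (side.split() for side in rule.split("=>"))
    out = list(tokens)
    n = len(lhs)
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        if all(item_matches(p, t) for p, t in zip(lhs, window)):
            out[i:i + n] = [resolve(p, t) for p, t in zip(rhs, window)]
    return out


RULE_NEG_ADJ = "cop|*^aux|* neg|not adj|* => cop|* neg|not adj|*"

# Simplified, illustrative analyses for "it is not fun".
tokens = ["pro|it", "cop|be^aux|be", "neg|not", "adj|fun"]
print(apply_rule(RULE_NEG_ADJ, tokens))
# ['pro|it', 'cop|be', 'neg|not', 'adj|fun']
```

In this sketch the left side of the rule describes the still-ambiguous context, and the right side names which reading survives for each position.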

Unfortunately, there was an error in the PREPOST rules file on Monday.  This file is included in the ENG MOR grammar, so if you downloaded a copy of English MOR that day, you should get a new copy.  The error made MOR/POST fail completely, so it will be quite clear if you got the bad copy.  The copies on the web now have no problems.

To use the new PREPOST facility, you also need to get a new version of CLAN.

Best regards,

-- Brian MacWhinney

jmwa...@uchicago.edu

Sep 21, 2018, 11:58:14 AM
to chibolts
Thank you! I will look into this more in a moment, but I had a quick clarifying question. In your example with the copula/neg/adj, would the rule apply when the word in the adj position is also ambiguous (e.g. quick = adj/adv), or only when the adjective is unambiguous?

For example, would the rule apply to the string of words "is not quick", or would it be ignored because quick is adj|quick^adv|quick?

Thanks again

-- Jimmy Waller

Brian MacWhinney

Sep 21, 2018, 12:15:44 PM
to ChiBolts
Jimmy,
    Good question.  PREPOST rules require an exact match.  So, if something is ambiguous, you have to enter it as ambiguous in the rule.  If it is not ambiguous, it has to be entered as a single form in the rule.  Often, this means that you would create two rules, one for the ambiguous case and one for the unambiguous case.  There is a small number of typical ambiguities in English.  They are mostly adj vs. adv, cop vs. aux, adj vs. noun, and noun vs. verb.  However, a few words like "back" have many alternative readings.  So, sometimes these rules will cover virtually every case, and sometimes they cannot easily cover all of them.

--Brian
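
The exact-match requirement described in the reply above can be illustrated with a small Python sketch. It is only a model of the behavior as stated in this thread, not CLAN code: `readings` and `exact_match` are hypothetical helpers, the tokens are simplified `pos|stem` readings, and the sketch assumes readings are compared in the order written.

```python
def readings(item):
    """Split 'adj|quick^adv|quick' into [('adj', 'quick'), ('adv', 'quick')]."""
    return [tuple(alt.split("|", 1)) for alt in item.split("^")]


def exact_match(pattern, token):
    """A rule item matches only a token with exactly the same set of readings."""
    p, t = readings(pattern), readings(token)
    return len(p) == len(t) and all(
        pp == tp and ps in ("*", ts) for (pp, ps), (tp, ts) in zip(p, t)
    )


unambiguous = "adj|fun"
ambiguous = "adj|quick^adv|quick"

print(exact_match("adj|*", unambiguous))        # True
print(exact_match("adj|*", ambiguous))          # False
print(exact_match("adj|*^adv|*", ambiguous))    # True
print(exact_match("adj|*^adv|*", unambiguous))  # False
```

Under these assumptions, a rule ending in `adj|*` would cover "is not fun" but skip "is not quick", which is why a second rule with `adj|*^adv|*` in that position would be needed for the ambiguous case.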
