Import CFG .txt, no spaces between non terminals

Devin Johnson

Sep 30, 2020, 5:08:52 AM
to nltk-users
I want to define a CFG in a .txt file and read it into NLTK with `nltk.CFG.fromstring()`. The problem is that I want rules that don't put spaces between non-terminals in the output. For example, say I have this grammar:

    X -> TENS ONES
    TENS -> '二十' | '三十' | '四十' | '五十' | '六十' | '七十' | '八十' | '九十'
    ONES -> '一' | '二' | '三' | '四' | '五' | '六' | '七' | '八' | '九'

If I want the word "二十一", I cannot generate it, because `TENS ONES` inserts a space and produces "二十 一". If I instead write the rule as `X -> TENSONES`, `TENSONES` is treated as a single non-terminal rather than two, so there is no parse. Is there a way to use two non-terminals in a production in a .txt file without needing a space between them?
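
One possible workaround, as a minimal sketch: it assumes the space comes from joining the generated token list with `" ".join(...)` rather than from the grammar itself, since `nltk.parse.generate.generate` yields each sentence as a list of terminal tokens. The grammar is inlined here for self-containment (the same string could be read from the .txt file), and `tokenize` is a hypothetical helper, not part of NLTK, that splits an unspaced string back into terminals by greedy longest match so it can be parsed.

```python
import nltk
from nltk.parse.generate import generate

# Same grammar as above; in practice this text could come from the .txt file.
GRAMMAR_TEXT = """
X -> TENS ONES
TENS -> '二十' | '三十' | '四十' | '五十' | '六十' | '七十' | '八十' | '九十'
ONES -> '一' | '二' | '三' | '四' | '五' | '六' | '七' | '八' | '九'
"""

grammar = nltk.CFG.fromstring(GRAMMAR_TEXT)

# generate() yields lists of tokens, so joining with "" gives unspaced words.
words = ["".join(tokens) for tokens in generate(grammar)]

# Collect the grammar's terminals (plain strings), longest first.
terminals = sorted(
    {sym for prod in grammar.productions() for sym in prod.rhs()
     if isinstance(sym, str)},
    key=len,
    reverse=True,
)


def tokenize(text, terminals):
    """Hypothetical helper: split text into terminals by greedy longest match."""
    tokens = []
    pos = 0
    while pos < len(text):
        for term in terminals:
            if text.startswith(term, pos):
                tokens.append(term)
                pos += len(term)
                break
        else:
            raise ValueError(f"no terminal matches at position {pos}")
    return tokens


# Parsing needs a token list, so the unspaced word is split first.
parser = nltk.ChartParser(grammar)
trees = list(parser.parse(tokenize("二十一", terminals)))
```

Greedy longest match happens to work for this grammar, but it can pick the wrong split when one terminal is a prefix of another in a conflicting position; a character-level grammar would avoid that at the cost of more rules.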
