Corpus Reader for Bigger Files

Richard Parker

Mar 3, 2016, 3:28:51 PM
to nltk-users
I am trying to create a tagged corpus with TaggedCorpusReader
(from nltk.corpus.reader import TaggedCorpusReader).

It works, but it seems to treat each .pos file as a single sentence,
even though one file may contain multiple sentences. How can I get them as
separate sentences?

Please point out the mistake I am making.

Steven Bird

Mar 3, 2016, 3:43:09 PM
to nltk-users

I suggest that you take a look at other POS tagged corpora in NLTK that also have more than one sentence per file, such as the Brown Corpus.
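
To illustrate the kind of layout meant here, the following is a minimal sketch that does not use NLTK at all. It assumes a Brown-style file in which each sentence sits on its own line and each token is written as word/tag. The function name parse_tagged_text and the sample lines are made up for illustration and are not copied from the Brown Corpus.

```python
def parse_tagged_text(text, sep="/"):
    """Return one list of (word, tag) pairs per non-empty line.

    Assumes every token contains the separator at least once.
    """
    sentences = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines between sentences
        # rsplit on the last separator so a word containing "/" stays whole
        sentences.append([tuple(tok.rsplit(sep, 1)) for tok in tokens])
    return sentences


# Hypothetical sample: two sentences, one per line, with a blank line between.
sample = "The/at jury/nn said/vbd it/pps ./.\n\nIt/pps was/bedz late/jj ./.\n"

for sentence in parse_tagged_text(sample):
    print(sentence)
```

The point of the sketch is only that the line break is what marks the sentence boundary; if a whole file is written on one line, it comes out as one sentence.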


Richard Parker

Mar 5, 2016, 3:17:09 PM
to nltk-users
Dear Sir,

Thank you for your kind help; it seems to be working now. Initially I tried looking at raw() for brown, but as that did not help,
I went to nltk_data, opened corpora and then brown, and looked at the files directly. I put each sentence on its own line, as is done there, and it worked.
If there is a better solution, I would be glad to know.

Regards,
RP.
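
For anyone reading this later, here is a rough sketch of the fix described above, assuming NLTK 3 is installed. As far as I know, TaggedCorpusReader splits sentences on newlines by default, so a .pos file with one sentence per line should come back as separate sentences. The helper name read_tagged_sentences, the file name sample.pos, and the sample text are made up for illustration.

```python
import os
import tempfile

from nltk.corpus.reader import TaggedCorpusReader


def read_tagged_sentences(text, filename="sample.pos"):
    """Write text to a temporary .pos file and read it back with NLTK."""
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, filename), "w", encoding="utf-8") as f:
            f.write(text)
        # The second argument is a regular expression selecting the corpus files.
        reader = TaggedCorpusReader(root, r".*\.pos")
        # Materialise the lazy corpus view before the directory is removed.
        return [list(sent) for sent in reader.tagged_sents()]


# One sentence per line: each line is returned as its own sentence.
per_line = "The/DT dog/NN barks/VBZ\nIt/PRP runs/VBZ\n"
print(len(read_tagged_sentences(per_line)))

# Everything on one line: the whole file is returned as a single sentence.
one_line = "The/DT dog/NN barks/VBZ It/PRP runs/VBZ\n"
print(len(read_tagged_sentences(one_line)))
```

If rewriting the files is not practical, the reader's constructor also takes a sent_tokenizer argument, so a different sentence splitter could be passed in instead of changing the data. I have not tried that on this corpus.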