Doubts about use of raw() and other functions

ines faravelli

Apr 20, 2022, 3:22:08 PM
to nltk-users
Hello, I'm new to this.
Is it possible to use the raw(), sents(), and words() functions on texts that don't belong to an NLTK corpus?

thanks
Inés

Jordi Carrera

Apr 21, 2022, 12:26:47 PM
to nltk-users
Hey!

It sounds like you just need to initialize a `CorpusReader` object from a custom set of text files on your local drive.

You can do that by adapting this code:
```
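from nltk.corpus.reader import PlaintextCorpusReader

# Folder that contains your own .txt files
root = '/usr/local/corpora/folder_with_corpus_files/'
# The second argument is a regular expression matching the files to include
reader = PlaintextCorpusReader(root, r'.*\.txt')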
```

You should then be able to call the reader methods you need:
`reader.raw()`
`reader.words()`
`reader.sents()`
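
If it helps, here is a minimal end-to-end sketch of the same idea that you can run without any existing corpus folder. The file names and their contents are made up for illustration. It also swaps in `LineTokenizer` as the sentence tokenizer, which simply treats each line as one sentence, so the example does not depend on the Punkt model being downloaded; with the default tokenizer, `sents()` needs that model.

```python
import tempfile
from pathlib import Path

from nltk.corpus.reader import PlaintextCorpusReader
from nltk.tokenize import LineTokenizer

# Build a throwaway corpus folder (hypothetical file names and contents).
root = tempfile.mkdtemp()
Path(root, "first.txt").write_text("Hello world.\nThis is a test.\n", encoding="utf-8")
Path(root, "second.txt").write_text("Another file.\n", encoding="utf-8")

# Assumption for this sketch: one sentence per line, so LineTokenizer
# replaces the default Punkt-based sentence tokenizer.
reader = PlaintextCorpusReader(root, r".*\.txt", sent_tokenizer=LineTokenizer())

file_ids = reader.fileids()                           # files matched by the pattern
raw_text = reader.raw("first.txt")                    # whole file as a single string
words = list(reader.words("first.txt"))               # flat list of tokens
sents = [list(s) for s in reader.sents("first.txt")]  # list of token lists

print(file_ids)
print(words)
print(sents)
```

Calling `raw()`, `words()`, or `sents()` with no argument should cover every file the pattern matched, and passing a file name restricts the call to that file.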

ines faravelli

Apr 21, 2022, 5:33:11 PM
to nltk-users
Great! I'll try the code, thank you very much.