Questions about using raw() and other functions

ines faravelli

Apr 20, 2022, 3:22:08 PM
to nltk-users
Hello, I'm new to this.
Is it possible to use the raw(), sents(), and words() functions on texts that don't belong to an NLTK corpus?

Thanks,
Inés

Jordi Carrera

Apr 21, 2022, 12:26:47 PM
to nltk-users
Hey!

It sounds like you just need to initialize a `CorpusReader` object from a custom set of text files on your local drive.

You can do that by adapting this code:
```
from nltk.corpus.reader import PlaintextCorpusReader

# Folder containing your own plain-text files
root = '/usr/local/corpora/folder_with_corpus_files/'
# Raw-string regex: load every .txt file under root
reader = PlaintextCorpusReader(root, r'.*\.txt')
```

You should then be able to call whichever of the reader's methods you need:
`reader.words()`
`reader.sents()`
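
Here is a small end-to-end sketch you can run to check that `raw()`, `words()`, and `sents()` all work on your own files. The temporary folder, the file name `sample.txt`, and its contents are made up for illustration. One caveat: by default `sents()` relies on NLTK's Punkt sentence model, which has to be downloaded separately. To keep the example download-free, I'm assuming one sentence per line and passing `LineTokenizer` as the sentence tokenizer; that is a substitution for the demo, not the default behaviour.

```python
import tempfile
from pathlib import Path

from nltk.corpus.reader import PlaintextCorpusReader
from nltk.tokenize import LineTokenizer

# Hypothetical corpus: a temporary folder holding one small text file.
root = tempfile.mkdtemp()
Path(root, "sample.txt").write_text(
    "Hello world.\nThis is my own text.\n", encoding="utf8"
)

# Assumption for this demo: each line is one sentence, so LineTokenizer
# stands in for the default Punkt sentence tokenizer (no download needed).
reader = PlaintextCorpusReader(root, r".*\.txt", sent_tokenizer=LineTokenizer())

file_ids = reader.fileids()                            # files matched by the regex
raw_text = reader.raw("sample.txt")                    # file contents as one string
words = list(reader.words("sample.txt"))               # flat list of tokens
sents = [list(s) for s in reader.sents("sample.txt")]  # one token list per sentence

print(file_ids)
print(words)
print(sents)
```

If your files are ordinary running prose rather than one sentence per line, you would probably drop the `sent_tokenizer` argument and install the Punkt data instead.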

ines faravelli

Apr 21, 2022, 5:33:11 PM
to nltk-users
Great! I'll try the code. Thank you very much.