About Text Classification

Ajay Victor

Sep 30, 2017, 6:46:17 AM

to nltk-users

Hello everyone,
I'm working on a machine learning project and I'm new to machine learning as well. What I currently want is to classify whether some words belong to a category or not.

Let me be more specific: given some input words, I need to check whether those words belong to a language known as "Malayalam".

Example: enthayi ninakk sugamanno?

These are some Malayalam words written in the English alphabet. Given input like this, the program needs to check the trained data, and if any of the input words falls under the category 'Malayalam', it should report that the input is Malayalam.

What I've tried:

I tried to classify the input with a NaiveBayesClassifier, but it always gives a positive result, whatever the input is.

train = [
('aliya','Malayalam')]
cl = NaiveBayesClassifier(train)
print cl.classify('enthayi ninakk sugamanno')

But the print statement outputs 'Malayalam'.
Hope you'll help me out. :)
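
One likely reason for the behaviour described above is that the training list contains only the label 'Malayalam', so a classifier that picks the most probable label seen in training has nothing else to choose. The sketch below illustrates this with a small from-scratch naive Bayes in plain Python 3. It is not the NaiveBayesClassifier used in the post, and the function names, the 'Other' label, and the extra training sentences are assumptions made up for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count labels and per-label words from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary


def classify(model, text):
    """Return the label with the highest log-probability for the text."""
    label_counts, word_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        # Start from the log prior of the label.
        score = math.log(count / total_examples)
        label_total = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not zero out a label.
            numerator = word_counts[label][word] + 1
            denominator = label_total + len(vocabulary) + 1
            score += math.log(numerator / denominator)
        scores[label] = score
    return max(scores, key=scores.get)


# One label only: every input is forced to "Malayalam".
one_label = train_naive_bayes([("aliya", "Malayalam")])
only_result = classify(one_label, "how are you")

# Two labels: the classifier can now answer "not Malayalam".
# The "Other" label and its example sentence are hypothetical.
two_labels = train_naive_bayes([
    ("aliya enthayi ninakk sugamanno", "Malayalam"),
    ("hello how are you today", "Other"),
])
malayalam_result = classify(two_labels, "enthayi ninakk sugamanno")
other_result = classify(two_labels, "how are you")

print(only_result, malayalam_result, other_result)
```

With a second label in the training data, inputs that share no words with the Malayalam examples can fall into the other class. In practice each class would need far more examples than shown here.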

Denzil Correa

Sep 30, 2017, 6:51:51 AM

to nltk-...@googlegroups.com

What is your training data?

--Regards,
Denzil
http://correa.in

Ajay Victor

Sep 30, 2017, 11:02:17 AM

to nltk-users

Well, the training data consists of some Malayalam words, like this: train = [