Basic question about extracting info from package (classifier object)

Colin Goldberg

Jul 13, 2017, 4:54:30 PM
to python-weka-wrapper
Hi,

I know I'm a little out of my depth here, but I am trying to learn - and find out where to get info (it's not always obvious).

My current test is to run an (official) package - Auto-WEKA, which is described as automatically determining the best (?) classifier for a specified data file.

I successfully ran a test script that runs its algorithm and saves a model file. The script is as follows:

import weka.core.jvm as jvm
import weka.core.serialization as serialization
from weka.classifiers import Classifier
from weka.core.converters import Loader
from weka.core.dataset import Instances

# packages=True makes installed Weka packages (such as Auto-WEKA) available
jvm.start(packages=True)

# load the dataset and use the last attribute as the class
loader = Loader("weka.core.converters.ArffLoader")
data = loader.load_file('./weather.arff')
data.class_is_last()

# let Auto-WEKA search for a classifier configuration
classifier = Classifier("weka.classifiers.meta.AutoWEKAClassifier")
classifier.build_classifier(data)
print(classifier)

# save the model together with the dataset header
outfile = './autoweka.model'
serialization.write_all(outfile, [classifier, Instances.template_instances(data)])
jvm.stop()

In the process, the print statement after build_classifier produced the following output:

>>>Start of print classifier output

best classifier: weka.classifiers.functions.SMO

arguments: [-C, 0.5740001496936735, -N, 0, -K, weka.classifiers.functions.supportVector.PolyKernel -E 4.884175976813503]

attribute search: weka.attributeSelection.GreedyStepwise

attribute search arguments: [-B, -R]

attribute evaluation: weka.attributeSelection.CfsSubsetEval

attribute evaluation arguments: [-M]

metric: errorRate

estimated errorRate: 0.0

training time on evaluation dataset: 0.05 seconds


You can use the chosen classifier in your own code as follows:


AttributeSelection as = new AttributeSelection();

ASSearch asSearch = ASSearch.forName("weka.attributeSelection.GreedyStepwise", new String[]{"-B", "-R"});

as.setSearch(asSearch);

ASEvaluation asEval = ASEvaluation.forName("weka.attributeSelection.CfsSubsetEval", new String[]{"-M"});

as.setEvaluator(asEval);

as.SelectAttributes(instances);

instances = as.reduceDimensionality(instances);

Classifier classifier = AbstractClassifier.forName("weka.classifiers.functions.SMO", new String[]{"-C", "0.5740001496936735", "-N", "0", "-K", "weka.classifiers.functions.supportVector.PolyKernel -E 4.884175976813503"});

classifier.buildClassifier(instances);



Correctly Classified Instances          14              100      %

Incorrectly Classified Instances         0                0      %

Kappa statistic                          1     

Mean absolute error                      0     

Root mean squared error                  0     

Relative absolute error                  0      %

Root relative squared error              0      %

Total Number of Instances               14     


=== Confusion Matrix ===


 a b   <-- classified as

 9 0 | a = yes

 0 5 | b = no


=== Detailed Accuracy By Class ===


                 TP Rate  FP Rate  Precision  Recall   F-Measure  MCC      ROC Area  PRC Area  Class

                 1.000    0.000    1.000      1.000    1.000      1.000    1.000     1.000     yes

                 1.000    0.000    1.000      1.000    1.000      1.000    1.000     1.000     no

Weighted Avg.    1.000    0.000    1.000      1.000    1.000      1.000    1.000     1.000     

Temporary run directories:

/tmp/autoweka5492011793094219801/



For better performance, try giving Auto-WEKA more time.

Tried 734 configurations; to get good results reliably you may need to allow for trying thousands of configurations.


>>>End of print classifier output


My question is: How do I get the individual pieces of information from the classifier object - as an alternative to using 'print classifier'?


I know that I can extract classifier.classname (which gives 'weka.classifiers.meta.AutoWEKAClassifier'). But what about 'best classifier: weka.classifiers.functions.SMO', etc.? Do I need to refer to the Auto-WEKA documentation to find out? (It is oriented towards Java, so I am not sure how to translate it into Python statements.)


Are there common classifier attributes that I can use to extract detailed information that span all packages?


Any help is appreciated.


Colin Goldberg


Peter Reutemann

Jul 13, 2017, 6:16:50 PM
to python-weka-wrapper
No, there isn't. It is always classifier-specific.

You can use the "jwrapper" property to retrieve the underlying Java
object. Then you should be able to retrieve the best classifier setup
identified by autoweka.
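
If the relevant Java getters are not known, one possible fallback is to parse the summary text that str(classifier) returns. The following is only a minimal sketch, assuming the "key: value" layout shown in the printed output earlier in this thread. The helper names parse_autoweka_summary and split_arguments are hypothetical and are not part of python-weka-wrapper or Auto-WEKA, and the layout may differ between Auto-WEKA versions.

```python
def parse_autoweka_summary(text):
    """Collect the leading 'key: value' lines of the summary into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: the Java snippet and the evaluation tables start here,
        # so everything of interest comes before this line.
        if line.startswith("You can use the chosen classifier"):
            break
        key, sep, value = line.partition(": ")
        if sep and value:
            fields[key] = value
    return fields


def split_arguments(value):
    """Turn '[-B, -R]' into ['-B', '-R'].

    Naive comma split: an option value that itself contains a comma
    would be split incorrectly.
    """
    value = value.strip()
    if value.startswith("[") and value.endswith("]"):
        value = value[1:-1]
    return [part.strip() for part in value.split(",") if part.strip()]


# Sample text copied from the output quoted in the question.
sample = """best classifier: weka.classifiers.functions.SMO
arguments: [-C, 0.5740001496936735, -N, 0, -K, weka.classifiers.functions.supportVector.PolyKernel -E 4.884175976813503]
attribute search: weka.attributeSelection.GreedyStepwise
attribute search arguments: [-B, -R]
metric: errorRate
estimated errorRate: 0.0

You can use the chosen classifier in your own code as follows:
"""

info = parse_autoweka_summary(sample)
best = info["best classifier"]
best_args = split_arguments(info["arguments"])
search_args = split_arguments(info["attribute search arguments"])
print(best)
print(best_args)
print(search_args)
```

In the script from the question, the sample string would be replaced by str(classifier) after build_classifier has run and before jvm.stop() is called. Parsing text is brittle, so accessors on the Java object reached through jwrapper are preferable wherever they exist.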

NB: I never used autoweka.

Cheers, Peter
--
Peter Reutemann
Dept. of Computer Science
University of Waikato, NZ
+64 (7) 858-5174
http://www.cms.waikato.ac.nz/~fracpete/
http://www.data-mining.co.nz/