Basic question about extracting info from package (classifier object)

Colin Goldberg

Jul 13, 2017, 4:54:30 PM

to python-weka-wrapper

Hi,

I know I'm a little out of my depth here, but I am trying to learn - and find out where to get info (it's not always obvious).

My current test is to run an (official) package - Auto-WEKA, which is described as automatically determining the best (?) classifier for a specified data file.

I successfully ran a test that runs its algorithm and saves a model file. The script is as follows:

import weka.core.packages as packages
from weka.classifiers import Classifier
from weka.core.converters import Loader
from weka.core.dataset import Instances
import weka.core.serialization as serialization
import weka.core.jvm as jvm

# start the JVM with package support so that Auto-WEKA can be found
jvm.start(packages=True)

# load the dataset and use the last attribute as the class
loader = Loader("weka.core.converters.ArffLoader")
data = loader.load_file('./weather.arff')
data.class_is_last()

# let Auto-WEKA search for a classifier configuration
classifier = Classifier("weka.classifiers.meta.AutoWEKAClassifier")
classifier.build_classifier(data)
print classifier

# save the model together with the dataset header
outfile = './autoweka.model'
serialization.write_all(outfile, [classifier, Instances.template_instances(data)])

jvm.stop()
quit()

During the run, the 'print classifier' statement (the one right after build_classifier) produced the following output:

>>>Start of print classifier output

best classifier: weka.classifiers.functions.SMO

arguments: [-C, 0.5740001496936735, -N, 0, -K, weka.classifiers.functions.supportVector.PolyKernel -E 4.884175976813503]

attribute search: weka.attributeSelection.GreedyStepwise

attribute search arguments: [-B, -R]

attribute evaluation: weka.attributeSelection.CfsSubsetEval

attribute evaluation arguments: [-M]

metric: errorRate

estimated errorRate: 0.0

training time on evaluation dataset: 0.05 seconds

You can use the chosen classifier in your own code as follows:

AttributeSelection as = new AttributeSelection();

ASSearch asSearch = ASSearch.forName("weka.attributeSelection.GreedyStepwise", new String[]{"-B", "-R"});

as.setSearch(asSearch);

ASEvaluation asEval = ASEvaluation.forName("weka.attributeSelection.CfsSubsetEval", new String[]{"-M"});

as.setEvaluator(asEval);

as.SelectAttributes(instances);

instances = as.reduceDimensionality(instances);

Classifier classifier = AbstractClassifier.forName("weka.classifiers.functions.SMO", new String[]{"-C", "0.5740001496936735", "-N", "0", "-K", "weka.classifiers.functions.supportVector.PolyKernel -E 4.884175976813503"});

classifier.buildClassifier(instances);

Correctly Classified Instances          14              100      %
Incorrectly Classified Instances         0                0      %
Kappa statistic                          1
Mean absolute error                      0
Root mean squared error                  0
Relative absolute error                  0      %
Root relative squared error              0      %
Total Number of Instances               14

=== Confusion Matrix ===

 a b   <-- classified as
 9 0 | a = yes
 0 5 | b = no

=== Detailed Accuracy By Class ===

                 TP Rate  FP Rate  Precision  Recall   F-Measure  MCC      ROC Area  PRC Area  Class
                 1.000    0.000    1.000      1.000    1.000      1.000    1.000     1.000     yes
                 1.000    0.000    1.000      1.000    1.000      1.000    1.000     1.000     no
Weighted Avg.    1.000    0.000    1.000      1.000    1.000      1.000    1.000     1.000

Temporary run directories:

/tmp/autoweka5492011793094219801/

For better performance, try giving Auto-WEKA more time.

Tried 734 configurations; to get good results reliably you may need to allow for trying thousands of configurations.

>>>End of print classifier output

My question is: How do I get the individual pieces of information from the classifier object - as an alternative to using 'print classifier'?

I know that I can extract classifier.classname (which gives 'weka.classifiers.meta.AutoWEKAClassifier'). But what about 'best classifier: weka.classifiers.functions.SMO', etc.? Do I need to refer to the Auto-WEKA documentation to find out? (It is oriented towards Java, so I am not sure how to translate it into Python statements.)

Are there common classifier attributes that I can use to extract detailed information that span all packages?

Any help is appreciated.

Colin Goldberg

Peter Reutemann

Jul 13, 2017, 6:16:50 PM

to python-weka-wrapper

No, there aren't. It is always classifier-specific.

You can use the "jwrapper" property to retrieve the underlying Java
object. Then you should be able to retrieve the best classifier setup
identified by autoweka.

NB: I never used autoweka.
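
If digging through the Java object turns out to be awkward, one fallback is to parse the text that 'print classifier' produces, since str(classifier) returns that same text. The following is only a minimal sketch: it assumes the summary keeps the 'key: value' layout shown in the output above, and parse_autoweka_summary is a hypothetical helper, not part of python-weka-wrapper or Auto-WEKA. The sample text is a shortened copy of the output posted in this thread.

```python
# Hypothetical helper (not part of python-weka-wrapper or Auto-WEKA).
# Assumption: the summary starts with "key: value" lines and the Java usage
# hint begins with "You can use the chosen classifier".
def parse_autoweka_summary(text):
    """Collect the leading 'key: value' lines of the summary into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # stop before the Java snippet and the evaluation statistics
        if line.startswith("You can use the chosen classifier"):
            break
        key, sep, value = line.partition(": ")
        if sep:
            info[key] = value
    return info


# shortened copy of the output shown earlier in this thread
sample = """best classifier: weka.classifiers.functions.SMO
arguments: [-C, 0.5740001496936735, -N, 0, -K, weka.classifiers.functions.supportVector.PolyKernel -E 4.884175976813503]
attribute search: weka.attributeSelection.GreedyStepwise
attribute search arguments: [-B, -R]
attribute evaluation: weka.attributeSelection.CfsSubsetEval
attribute evaluation arguments: [-M]
metric: errorRate
estimated errorRate: 0.0
training time on evaluation dataset: 0.05 seconds
You can use the chosen classifier in your own code as follows:
AttributeSelection as = new AttributeSelection();
"""

summary = parse_autoweka_summary(sample)
print(summary["best classifier"])  # prints weka.classifiers.functions.SMO
```

With a built model, the call would be parse_autoweka_summary(str(classifier)). Every value comes back as a plain string, so numbers such as the estimated error rate still need converting with float(), and the parsing will break if a later Auto-WEKA release changes the wording of the summary.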

Cheers, Peter
--
Peter Reutemann
Dept. of Computer Science
University of Waikato, NZ
+64 (7) 858-5174
http://www.cms.waikato.ac.nz/~fracpete/
http://www.data-mining.co.nz/
