Re: Need help with implementing JRip for my master thesis!


Peter Reutemann

Jun 11, 2021, 12:51:51 AM
to python-weka-wrapper
> I need help using the JRip classifier in Python.
>
> For my master's thesis I am comparing different rule-based machine learning models in a stock selection framework. For all the other models there was a nice Python package that I could easily use, but I was not able to find one for the Ripper algorithm. Weka has its JRip variant, which I would like to use. However, I am looking at quarterly data ranging from 2005 to 2020, where I also consider different window sizes (all my different train windows and test datasets are stored in separate CSV files). Hence, doing it manually in Weka would take forever, and I simply want to loop over my different datasets in Python and retrieve an array of predictions.
>
> So far I have looked at the example code, but I cannot figure out how to get it working. If someone knows how, it would really mean a lot if you could help me, as it is the only thing I still need to do for my thesis in terms of results and my deadline is getting closer. If someone knows a different Python package for Ripper, or knows how to loop over different datasets in Weka itself, that would also help me a lot!

I've attached an example script with comments that should get you started.

Cheers, Peter
--
Peter Reutemann
Dept. of Computer Science
University of Waikato, NZ
+64 (7) 577-5304
http://www.cms.waikato.ac.nz/~fracpete/
http://www.data-mining.co.nz/
jrip.py
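
The attached jrip.py is not reproduced in this thread. As a rough sketch of the loop described in the question (this is not Peter's script), the following assumes python-weka-wrapper3 is installed, that train and test files follow a hypothetical naming scheme such as train_2005Q1.csv / test_2005Q1.csv, and that the class is the last column. The helper names paired_files and jrip_predictions are made up for illustration.

```python
from pathlib import Path


def paired_files(folder, train_prefix="train_", test_prefix="test_"):
    """Pair each train CSV with the test CSV sharing its suffix.

    The prefix-based naming scheme is an assumption for illustration.
    """
    pairs = []
    for train in sorted(Path(folder).glob(f"{train_prefix}*.csv")):
        test = train.with_name(test_prefix + train.name[len(train_prefix):])
        if test.exists():
            pairs.append((train, test))
    return pairs


def jrip_predictions(pairs):
    """Train JRip on each train file; return predicted labels per test file."""
    # Imported lazily so the pairing helper works without Weka installed.
    import weka.core.jvm as jvm
    from weka.core.converters import Loader
    from weka.classifiers import Classifier

    jvm.start()
    try:
        results = {}
        for train_file, test_file in pairs:
            # A fresh loader per file; both files are assumed to yield
            # identical headers (same attributes, same label order).
            train = Loader(classname="weka.core.converters.CSVLoader").load_file(str(train_file))
            test = Loader(classname="weka.core.converters.CSVLoader").load_file(str(test_file))
            train.class_is_last()
            test.class_is_last()

            cls = Classifier(classname="weka.classifiers.rules.JRip")
            cls.build_classifier(train)

            # classify_instance returns the index of the predicted label.
            labels = train.class_attribute.values
            results[test_file.name] = [
                labels[int(cls.classify_instance(inst))] for inst in test
            ]
        return results
    finally:
        jvm.stop()
```

Calling jrip_predictions(paired_files("data")) would then give one list of predicted labels per test file. The JVM can only be started once per Python process, so all datasets are handled inside a single start/stop.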

Sem Keegstra

Jun 11, 2021, 6:02:36 AM
to python-weka-wrapper
Hi Peter,

Thank you very much for your help. Looking at your example script made me realize that the mistake was not in my code but software related, since our scripts are very similar.
Because of this I have been able to locate the problem, and everything is working now!

Again, thank you for the quick reply to my question.
 
Kind regards,

Sem

On Friday, June 11, 2021 at 06:51:51 UTC+2, Peter Reutemann wrote:

Peter Reutemann

Jun 11, 2021, 6:09:26 AM
to python-we...@googlegroups.com
Good stuff! :-)

Cheers, Peter

Sem Keegstra

Jun 11, 2021, 8:01:28 AM
to python-weka-wrapper
Hi,

I have one final question regarding the structure in which python-weka-wrapper3 assigns the class attribute. As you can see in the picture that I added to this message, it does not assign the classes in a consistent manner. Currently, I am loading my train and test set using a CSVLoader and then use class_is_last() to set the last attribute equal to the class that I want to predict. 

Is it possible (using some sort of function) to make sure that it always sets @attribute {long,short}, so that I know it assigns the right classes? I believe this would also guarantee that when I call distribution_for_instance(inst), the first element corresponds to the long class and the second to the short class.

Thank you in advance,

Sem

On Friday, June 11, 2021 at 12:09:26 UTC+2, frac...@gmail.com wrote:
target problem.PNG
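
Independent of how the labels end up ordered, the probabilities can be matched to labels by reading the order from the dataset header instead of assuming it. A minimal sketch, assuming the class attribute of the loaded dataset exposes its labels through class_attribute.values and that the distribution follows that same order; the function names are hypothetical.

```python
def distribution_by_label(labels, distribution):
    """Pair each class label with its probability, in header order."""
    if len(labels) != len(distribution):
        raise ValueError("label count and distribution length differ")
    return dict(zip(labels, (float(p) for p in distribution)))


def class_probabilities(cls, data):
    """Return one {label: probability} dict per instance in `data`.

    `cls` is a trained classifier and `data` a dataset whose class
    attribute has already been set (e.g. via class_is_last()).
    """
    labels = data.class_attribute.values
    return [
        distribution_by_label(labels, cls.distribution_for_instance(inst))
        for inst in data
    ]
```

With this, class_probabilities(cls, test)[0]["long"] reads the probability of the long class whether the header says {long,short} or {short,long}. It does not fix a mismatch between the train and test headers, which still has to be checked separately.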

Peter Reutemann

Jun 11, 2021, 3:35:53 PM
to python-we...@googlegroups.com
To ensure that the datasets have the same structure (not guaranteed when using CSV!), use the equal_headers method. See my code example. That will check the index of the class attribute as well.

Cheers, Peter
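
The code example referred to above is not included in the thread. A rough sketch of such a check might look like the following. It assumes the JVM has already been started, that equal_headers returns None when the headers match and a message otherwise, and that Weka's CSVLoader accepts the -L option for listing the legal labels of a nominal attribute in a fixed order. Whether "last" is accepted as the attribute index in that spec is an assumption worth verifying; the column name can be used instead. load_pair and label_spec are hypothetical helper names.

```python
def label_spec(attribute, labels):
    """Build a CSVLoader nominal label spec such as 'last:long,short'."""
    return f"{attribute}:{','.join(labels)}"


def load_pair(train_file, test_file, attribute="last", labels=("long", "short")):
    """Load train/test CSVs with a fixed class label order and compare headers.

    Requires a running JVM (weka.core.jvm.start()) before it is called.
    """
    from weka.core.converters import Loader

    # Assumption: -L pins the label order instead of letting the loader
    # derive it from the data in each file.
    options = ["-L", label_spec(attribute, labels)]

    train = Loader(classname="weka.core.converters.CSVLoader",
                   options=options).load_file(str(train_file))
    test = Loader(classname="weka.core.converters.CSVLoader",
                  options=options).load_file(str(test_file))
    train.class_is_last()
    test.class_is_last()

    # Set the class index first so the comparison covers it as well.
    msg = train.equal_headers(test)
    if msg is not None:
        raise ValueError(f"Train/test headers differ: {msg}")
    return train, test
```

Raising on a mismatch stops the loop at the offending file pair rather than silently producing predictions against a differently ordered class attribute.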