Another strategy tested:
This is inspired by the current binary classification strategy, which discovers the threshold automatically.
Assume that all classes have the same number of items. Let's say that we have 1000 items and 10 classes, which means 100 items per class. The same reasoning can be applied if the classes are not balanced.
We have the value of an expression for all training data (1000 values in our case). We sort these values in ascending order.
If the items were perfectly separable, the first 100 items would belong to one class, the next 100 to another class, and so on.
We don't know where each class will fall (for instance, items belonging to class #7 could be in the first 100 positions, in the next 100 positions, ... or in the last 100 positions), so we have to make a decision here. For instance, we can say that a class is allocated to the range of positions where it has the most representatives: if class #7 is most represented in range 300-400, we decide that that slot belongs to class #7. All items falling within slot 300-400 that do not belong to class #7 are counted as incorrectly classified. The fitness is the total number of incorrectly classified items (over all classes).
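A minimal sketch of this fitness computation (the function name is hypothetical; for simplicity each slot just claims its majority class, leaving open how ties, or two slots claiming the same class, should be resolved):

```python
from collections import Counter

def slot_fitness(values, labels, num_classes):
    """Fitness = number of items that fall in a slot claimed by
    another class (lower is better).
    Assumes balanced classes: len(values) is a multiple of num_classes."""
    n = len(values)
    slot_size = n // num_classes
    # sort item indices by expression value, ascending
    order = sorted(range(n), key=lambda i: values[i])
    errors = 0
    for s in range(num_classes):
        slot = order[s * slot_size:(s + 1) * slot_size]
        counts = Counter(labels[i] for i in slot)
        # allocate the slot to its most represented class;
        # all other items in the slot count as misclassified
        _, winner_count = counts.most_common(1)[0]
        errors += slot_size - winner_count
    return errors
```

For a perfectly separable expression (each class occupies one contiguous range of sorted values) the fitness is 0; every item outside its class's slot adds 1.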
We have also tested this strategy on problems with 10 classes, and the results are not as good as those of the official multi-class strategy currently implemented in MEPX.
still experimenting,
mihai