Feature selection for single label classification


sonvx....@gmail.com

Feb 5, 2016, 9:15:53 AM
to dkpro-tc-users
Hello there,

I am trying to set up feature selection for single-label classification (e.g., with SMO), but it doesn't work so far.

Could anyone please show me the proper setting of dimFeatureSelection for single-label classification?

For your convenience, here is my current setting for multi-label feature selection:
// multi-label feature selection (Mulan specific options), reduces the feature set to 10
Map<String, Object> dimFeatureSelection = new HashMap<String, Object>();
dimFeatureSelection.put(DIM_LABEL_TRANSFORMATION_METHOD,
"BinaryRelevanceAttributeEvaluator");
dimFeatureSelection.put(DIM_ATTRIBUTE_EVALUATOR_ARGS,
asList(new String[] { InfoGainAttributeEval.class.getName() }));
dimFeatureSelection.put(DIM_NUM_LABELS_TO_KEEP, 10);
dimFeatureSelection.put(DIM_APPLY_FEATURE_SELECTION, true);

Kind regards,

Johannes Daxenberger

Feb 5, 2016, 12:08:11 PM
to sonvx....@gmail.com, dkpro-tc-users
Hi,

the following snippet applies InfoGain to reduce the feature set to the 20 "best" features (the value passed to -N) in a single-label Weka-based experiment:

def dimFeatureSelection = Dimension.createBundle("featureSelection", [
featureSearcher: [Ranker.class.name, "-N", "20"],
attributeEvaluator: [InfoGainAttributeEval.class.name],
applySelection: true
])
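
For readers unfamiliar with the criterion, here is a minimal sketch of what an information-gain score measures for a single nominal feature: H(label) - H(label | feature). This is not Weka's InfoGainAttributeEval code (which, among other things, also handles numeric attributes); the class and method names below are hypothetical and exist only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class, not part of Weka or DKPro TC.
final class InfoGainSketch {

    // Shannon entropy (in bits) of a distribution given as raw counts.
    static double entropy(Iterable<Integer> counts, int total) {
        double h = 0.0;
        for (int c : counts) {
            if (c > 0) {
                double p = (double) c / total;
                h -= p * Math.log(p) / Math.log(2);
            }
        }
        return h;
    }

    // Information gain of a nominal feature: H(label) - H(label | feature).
    static double infoGain(String[] feature, String[] label) {
        int n = label.length;
        Map<String, Integer> labelCounts = new HashMap<>();
        Map<String, Map<String, Integer>> byValue = new HashMap<>();
        for (int i = 0; i < n; i++) {
            labelCounts.merge(label[i], 1, Integer::sum);
            byValue.computeIfAbsent(feature[i], k -> new HashMap<>())
                   .merge(label[i], 1, Integer::sum);
        }
        // Weighted average of the label entropy within each feature value.
        double conditional = 0.0;
        for (Map<String, Integer> counts : byValue.values()) {
            int size = 0;
            for (int c : counts.values()) {
                size += c;
            }
            conditional += ((double) size / n) * entropy(counts.values(), size);
        }
        return entropy(labelCounts.values(), n) - conditional;
    }
}
```

A feature that splits two balanced classes perfectly scores 1 bit, while a feature that is independent of the label scores 0. A ranking-based searcher then orders the features by this score and keeps the top ones.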



See also: https://zoidberg.ukp.informatik.tu-darmstadt.de/jenkins/job/DKPro%20TC%20Documentation%20(GitHub)/de.tudarmstadt.ukp.dkpro.tc%24dkpro-tc-doc/doclinks/1/#Discriminators

Best,
Johannes





rainbo...@googlemail.com

Mar 9, 2016, 7:22:09 PM
to dkpro-tc-users, sonvx....@gmail.com
Heyho,

Thanks for the hint! Does this work for all single-label experiments?

I just tried to use your code in one of my experiments, but since I am not using Groovy, I had to translate it into regular Java, and I am not entirely sure whether I did that correctly. (It had no effect on my experiment, in any case.)

// Keep only the best features
Map<String, Object> dimFeatureSelection = new HashMap<String, Object>();
dimFeatureSelection.put(
    "featureSearcher",
    Arrays.asList(Ranker.class.getName(), "-N", "20")
);
dimFeatureSelection.put(
    "attributeEvaluator",
    Arrays.asList(InfoGainAttributeEval.class.getName())
);
dimFeatureSelection.put("applySelection", Boolean.TRUE);

Afterwards, I added the dimension as a bundle to the parameter space.

Johannes Daxenberger

Mar 10, 2016, 2:12:17 AM
to rainbo...@googlemail.com, dkpro-tc-users, sonvx....@gmail.com
Hi,

feature selection should work for all single-label experiments using Weka.

> I just tried to implement your code in one of my experiments, but since I am not using groovy,
> I had to translate it into regular Java code and I am not entirely sure, whether I did that correctly.


See e.g. ComplexConfigurationSingleDemo in the Java examples module, which implements this in pure Java.


> It had no effect on my experiment, in any case.

This can happen for a number of reasons, e.g. if the number of features used in your experiment is below 20. Also, if I remember correctly, exceptions thrown during feature selection are caught, so you should check exactly what happens, e.g. with the debugger.
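
To illustrate the first point, here is a minimal sketch of top-N selection over scored features. It is not Weka's Ranker implementation; the class, method, and feature names are hypothetical. It only shows why asking for the best 20 features changes nothing when fewer than 20 exist.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class, not part of Weka or DKPro TC.
final class TopNSketch {

    // Returns the names of the n highest-scoring features, best first.
    // If there are n or fewer features, every feature is returned.
    static List<String> selectTopN(Map<String, Double> scores, int n) {
        List<Map.Entry<String, Double>> ranked = new ArrayList<>(scores.entrySet());
        ranked.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Double> e : ranked.subList(0, Math.min(n, ranked.size()))) {
            kept.add(e.getKey());
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up feature names and scores, for illustration only.
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("tokenCount", 0.02);
        scores.put("hasEmoticon", 0.31);
        scores.put("avgWordLength", 0.11);
        System.out.println(selectTopN(scores, 20)); // [hasEmoticon, avgWordLength, tokenCount]
        System.out.println(selectTopN(scores, 2));  // [hasEmoticon, avgWordLength]
    }
}
```

With three features and N = 20 the feature set is only reordered, never reduced, so the classifier sees the same features as before. A reduction becomes visible only once N is smaller than the number of extracted features.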

Best,
Johannes





Marius Hamacher

Mar 10, 2016, 4:13:49 AM
to dkpro-tc-users, rainbo...@googlemail.com, sonvx....@gmail.com
Heyho,

thanks for the fast reply!

I set it up properly now. Still has no effect, probably due to my small list of features used (as you already mentioned). Will keep an eye on it in the future.

Best Regards
Marius