Apr 7, 2011, 5:33:18 PM
to Transfer Learning Challenge
If we plot a histogram of the transfer labels, we find that in some
datasets the classes overlap, while in others they do not.
AVICENNA: 5 classes in transfer labels, a lot of overlap (by overlap,
I mean a record is labeled with several classes at the same time)
HARRY: 4 classes in transfer labels, a small amount of class overlap
RITA: 4 classes in transfer labels, no overlap between classes
SYLVESTER: 2 classes in transfer labels, no overlap
TERRY: 4 classes in transfer labels, some overlap
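
For anyone who wants to reproduce this kind of count, here is a minimal sketch. It assumes the transfer labels have already been loaded as a list of rows, one row per record and one column per class, with a positive value marking membership. That layout, the function name `overlap_summary`, and the toy data are assumptions for illustration, not the challenge's file format or actual labels.

```python
def overlap_summary(labels):
    """Summarize class overlap in a label matrix.

    labels: list of rows, one per record; each row has one entry per class,
    and an entry > 0 means the record belongs to that class (assumed layout).
    """
    n_records = len(labels)
    n_classes = len(labels[0]) if labels else 0

    # Number of classes each record belongs to.
    per_record = [sum(1 for v in row if v > 0) for row in labels]

    # Histogram over classes: how many records carry each class label.
    class_counts = [
        sum(1 for row in labels if row[j] > 0) for j in range(n_classes)
    ]

    # A record "overlaps" when it carries more than one class label.
    multi = sum(1 for k in per_record if k > 1)

    return {
        "class_counts": class_counts,
        "multi_label_records": multi,
        "overlap_fraction": multi / n_records if n_records else 0.0,
    }


# Toy data only: 5 records, 4 classes, two records with two labels each.
toy = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
]
summary = overlap_summary(toy)
print(summary)
```

A dataset with no overlap, such as RITA or SYLVESTER as described above, would give `multi_label_records` equal to 0 under this definition.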

How do we explain the label overlaps in the transfer labels? Does
this happen in the final and validation datasets too?

Causality Workbench

Apr 8, 2011, 10:02:21 PM
to Transfer Learning Challenge
Some of the tasks are "multilabel", which means that a given pattern
can belong to several classes. This is quite common, for instance, in
text categorization (TERRY). For example, if a document talks about a
fundraiser using a sports event to raise money for a political event,
it could belong to the categories "sports", "finances", and
"politics". For AVICENNA, some scripts could belong to several
classes. I do not know Arabic, but the Latin alphabet has that too:
"I" (Capital I), "1" (digit 1), and "l" (lowercase L) can all be
written as a vertical bar. For HARRY, the labels are actions in video
clips. Clearly several actions can occur in the same video (e.g.
someone opening a car door and someone running; someone hugging
someone else and someone drinking, etc.)
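
To make the multilabel idea concrete, here is a small sketch of how such a document could be encoded as a 0/1 indicator vector, one entry per category. The category list and the helper name `to_indicator` are hypothetical and chosen for illustration; they are not the actual TERRY categories or the challenge's label encoding.

```python
# Hypothetical category list, for illustration only.
CATEGORIES = ["sports", "finances", "politics", "science"]


def to_indicator(tags, categories=CATEGORIES):
    """Return a 0/1 vector with a 1 for every category the item belongs to."""
    tagset = set(tags)
    return [1 if c in tagset else 0 for c in categories]


# The fundraiser document from the example belongs to three categories at once.
fundraiser = to_indicator(["sports", "finances", "politics"])

# A single-label document has exactly one 1 in its vector.
single = to_indicator(["science"])

print(fundraiser, sum(fundraiser))
print(single, sum(single))
```

In a multilabel task, rows like `fundraiser` with more than one positive entry are expected, which is what shows up as "overlap" in a per-class histogram.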

