I did watch the video. It is helpful conceptually, but not practically.
We tried PCA and clustering in phase 1. Clustering can produce a pretty
good learning curve on the valid set. We ran k-means directly on the
valid set and the final set, without even using the devel set. There
are two factors
which are critical to clustering: the number of clusters and the
distance metric. We chose both based on feedback from the valid set and
used the same settings for the final set. The risk is that if the final
set has a different number of classes from the valid set, our approach
may fail badly.
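
To make the setup concrete, here is a minimal sketch in plain Python. It
is an illustration under assumptions, not our actual code: `kmeans`,
`choose_k`, and the `feedback` callback are hypothetical names, and
`feedback` stands in for whatever score comes back for a clustering of
the valid set.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, distance=sq_dist, n_iter=50, seed=0):
    """Plain Lloyd-style k-means on a list of numeric vectors.

    Note: the mean update below is the exact k-means step only for
    squared Euclidean distance; passing another distance is a heuristic
    in this sketch.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct points as initial centers
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k), key=lambda c: distance(p, centers[c]))
            for p in points
        ]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers


def choose_k(valid_points, candidate_ks, feedback):
    """Return the k whose clustering of the valid set scores best.

    `feedback` is a hypothetical callback: it takes the cluster labels
    for the valid set and returns a number (higher is better).
    """
    best_k, best_score = None, float("-inf")
    for k in candidate_ks:
        labels, _ = kmeans(valid_points, k)
        score = feedback(labels)
        if score > best_score:
            best_k, best_score = k, score
    return best_k
```

The chosen value would then be reused unchanged on the final set, e.g.
`kmeans(final_points, choose_k(valid_points, ks, feedback))`, which is
exactly where the risk comes from: nothing in the final set is
consulted when k is picked.
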
Does each column represent the same variable in devel, valid and final?
Does the devel set contain exemplars of valid and final?
I cannot train a model using labels from devel and apply it to valid
and final, since they contain examples from disjoint sets of classes.
If I stay with the clustering approach, I have not seen an easy way to
use the transferred labels. I am posting my puzzles here in the hope
that they stimulate more discussion.
> > transferred labels be helpful here?