LDA and IVectors question

Tiago Freitas Pereira

May 24, 2018, 5:16:17 AM
to bob-...@googlegroups.com
Hi guys,

I'm on a quest here.
I was going through our iVector implementation, with all its linear bells and whistles (whitening, WCCN, LDA and PLDA), and I stumbled upon this:

https://gitlab.idiap.ch/bob/bob.bio.gmm/blob/0ef9210cd84c1e92b15347a953fc7d29b595a0c1/bob/bio/gmm/algorithm/IVector.py#L79

As you can see, strip_to_rank is set to False.
This makes bob.learn.linear (via LAPACK) NOT truncate to the rank of the projection matrix, so it returns all eigenvalues/eigenvectors, even the ones that are supposed to be zero in some special cases.

For instance, this is particularly critical when the number of classes is lower than the dimensionality of the features; in that case we can end up with imprecise projected iVectors.
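To illustrate why (a NumPy sketch of the underlying linear algebra, not the Bob code): with C classes, the LDA between-class scatter matrix has rank at most C - 1, so any eigenvalues beyond that rank are numerically zero and the corresponding eigenvectors are essentially arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, dim = 10, 50  # fewer classes than feature dimensions

# Toy data: 20 samples per class in a 50-dimensional feature space.
data = [rng.normal(loc=rng.normal(size=dim), scale=1.0, size=(20, dim))
        for _ in range(n_classes)]

# Between-class scatter S_b: sum of outer products of class-mean deviations.
global_mean = np.vstack(data).mean(axis=0)
S_b = sum(len(X) * np.outer(X.mean(axis=0) - global_mean,
                            X.mean(axis=0) - global_mean) for X in data)

# Only n_classes - 1 directions carry discriminative information.
print(np.linalg.matrix_rank(S_b))  # 9 (= n_classes - 1)
```

Any "eigenvector" reported beyond those 9 directions belongs to the (numerically) zero eigenvalues, which is exactly the part LAPACK cannot estimate reliably.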

The default in bob.learn.linear is strip_to_rank=True, but it was explicitly set to False here: https://gitlab.idiap.ch/bob/bob.bio.gmm/blob/0ef9210cd84c1e92b15347a953fc7d29b595a0c1/bob/bio/gmm/algorithm/IVector.py#L79

The question is: Why?

I'm using this channel to ask because I can reach more people this way, and I may get an answer from former Idiapers :-)

Thanks

--
Tiago

André Anjos

May 24, 2018, 11:21:06 AM
to bob-...@googlegroups.com
If I remember correctly, we used to do this on the user side for LDA training. Maybe the default is not good.

Best, André

--
-- You received this message because you are subscribed to the Google Groups bob-devel group. To post to this group, send email to bob-...@googlegroups.com. To unsubscribe from this group, send email to bob-devel+...@googlegroups.com. For more options, visit this group at https://groups.google.com/d/forum/bob-devel or directly the project website at http://idiap.github.com/bob/
---
You received this message because you are subscribed to the Google Groups "bob-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bob-devel+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--
Dr. André Anjos
Idiap Research Institute
Centre du Parc - rue Marconi 19
CH-1920 Martigny, Suisse
Phone: +41277217763
http://andreanjos.org

Manuel Günther

May 24, 2018, 1:10:08 PM
to bob-devel
Tiago,

I am not sure why we have strip_to_rank set to False. You are right that eigenvectors for zero eigenvalues are also returned. If you have a look at https://gitlab.idiap.ch/bob/bob.bio.gmm/blob/0ef9210cd84c1e92b15347a953fc7d29b595a0c1/bob/bio/gmm/algorithm/IVector.py#L116 you will find that the LDA projection matrix is reduced afterwards (the default for lda_dim is 50), so you do not actually use all eigenvectors.

In a special case where we used plain LDA, we found that the eigenvectors corresponding to zero eigenvalues were still useful. Hence, even when our LDA returned only 35 non-zero eigenvalues, it was still better to work with 50 eigenvectors. In the LDA implementation, you can see that strip_to_rank might also be set to False, depending on the parameters: https://gitlab.idiap.ch/bob/bob.bio.base/blob/37c03d67183442c0c20221ecab92d959248be8fa/bob/bio/base/algorithm/LDA.py#L178

So, I guess the best option would be to change the default of False in the IVector implementation to strip_to_rank=lda_dim>0 (or something similar) to avoid awkward situations when lda_dim is set to 0. Otherwise, the implementation looks correct to me.

Manuel

Tiago Freitas Pereira

May 30, 2018, 3:32:10 AM
to bob-...@googlegroups.com
Hey Manuel,

Sorry for the delay in this one, I was busy with other stuff.

So: it is conceptually wrong to keep dimensions above the rank of the projection matrix.
The estimates of the eigenvalues/eigenvectors above the rank are simply not reliable, and we shouldn't propagate that into bob.bio.gmm.

> In a special case, where we used plain LDA, we found that the eigenvectors corresponding to zero eigenvalues were still useful. Hence, if our LDA would return 35 non-zero eigenvalues, it was still better to work with 50 eigenvectors

This could mean anything, since this is conceptually wrong.


I will push an MR fixing this issue in bob.bio.gmm.
Furthermore, I will print a warning if rank(W) < lda_dim and set lda_dim to the rank in this case.
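A minimal sketch of that warning-and-clamp logic (the function name, argument names and logger here are hypothetical illustrations, not the actual bob.bio.gmm patch):

```python
import logging

import numpy as np

logger = logging.getLogger(__name__)


def clip_lda_dim(W, lda_dim):
    """Hypothetical helper: warn and clamp lda_dim when it exceeds rank(W)."""
    rank = np.linalg.matrix_rank(W)
    if rank < lda_dim:
        logger.warning(
            "Requested lda_dim=%d is larger than rank(W)=%d; "
            "clipping lda_dim to the rank.", lda_dim, rank)
        return rank
    return lda_dim
```

For example, clip_lda_dim(np.eye(5), 10) would return 5 (with a warning), while clip_lda_dim(np.eye(5), 3) returns 3 unchanged.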

Cheers

--
Tiago

Manuel Günther

May 30, 2018, 11:58:28 AM
to bob-devel
Tiago,

> This could mean anything, since this is conceptually wrong.

I totally agree. It is also conceptually wrong to use LBP as image preprocessing. Nevertheless, we have it in our list of image preprocessors, and it is the best preprocessing for some algorithms such as PCA or LDA.

> I will push a MR fixing this issue in bob.bio.gmm.
> Furthermore, I will print a warning if rank(W) < lda_dim and set lda_dim to the rank in this case.

Printing a warning seems OK, but forcing lda_dim to the rank is wrong in my eyes.
Maybe it would be better to make strip_to_rank (https://gitlab.idiap.ch/bob/bob.bio.gmm/blob/0ef9210cd84c1e92b15347a953fc7d29b595a0c1/bob/bio/gmm/algorithm/IVector.py#L79) a parameter of the algorithm. I am OK with changing the default to True, and with adding some documentation that setting it to False might result in unreliable eigenvectors.

I also think we should change the default value of lda_dim to lda_dim=None. In combination with strip_to_rank=True, we should get the optimal default behavior.
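Put together, the proposed defaults would behave roughly like this sketch of LDA training (using SciPy's generalized symmetric eigensolver; the function name is hypothetical, and this is an illustration, not the bob.learn.linear implementation):

```python
import numpy as np
from scipy.linalg import eigh


def lda_eigen(S_w, S_b, lda_dim=None, strip_to_rank=True):
    """Sketch: solve S_b v = l * S_w v, strip eigenpairs beyond rank(S_b),
    then optionally truncate to an explicitly requested lda_dim."""
    vals, vecs = eigh(S_b, S_w)            # ascending eigenvalues
    order = np.argsort(vals)[::-1]         # reorder: largest first
    vals, vecs = vals[order], vecs[:, order]
    if strip_to_rank:
        keep = np.linalg.matrix_rank(S_b)  # at most n_classes - 1
        vals, vecs = vals[:keep], vecs[:, :keep]
    if lda_dim is not None:                # explicit user request wins
        vals, vecs = vals[:lda_dim], vecs[:, :lda_dim]
    return vals, vecs
```

With lda_dim=None and strip_to_rank=True, the caller gets exactly the reliable eigenpairs and nothing more; e.g. for a rank-1 between-class scatter in 3 dimensions, only one eigenpair comes back.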

Manuel

Tiago Freitas Pereira

May 31, 2018, 7:57:45 AM
to bob-...@googlegroups.com
> I totally agree. It is also conceptually wrong to use LBP as image preprocessing. Nevertheless, we have it in our list of image preprocessors, and it is the best preprocessing for some algorithms such as PCA or LDA.

Ok, just for the sake of curiosity (if you have the time to answer): why is it conceptually wrong to use LBP as image preprocessing?


> Printing a warning seems to be OK, but forcing the lda_dim to the rank is wrong in my eyes.
> Maybe it would be better to make the strip_to_rank https://gitlab.idiap.ch/bob/bob.bio.gmm/blob/0ef9210cd84c1e92b15347a953fc7d29b595a0c1/bob/bio/gmm/algorithm/IVector.py#L79 a parameter to the algorithm. I am OK with changing the default to True. Add some documentation that setting it to False might result in unreliable eigenvectors.
>
> I think, we also should change the default value of lda_dim to lda_dim=None. In combination with strip_to_rank=True we should get the optimal default behavior.



--
Tiago