Kernel Nearest-Neighbor Algorithm

55 views
Skip to first unread message

krne...@gmail.com

unread,
Jan 28, 2014, 9:06:21 AM1/28/14
to accor...@googlegroups.com
Hello César,
Is it possible to simply combine existing Accord.net components with the KDtree class to put together a Kernel Nearest-Neighbor Algorithm [1]? I suspect it is more involved that that but am not (yet :-) intimate enough with the framework and all the ways it can be combined.

Thanks in advance,
Keith

[1] Substitution of a kernel distance metric for the original one in Hilbert space, and the corresponding algorithm is called kernel nearest-neighbor algorithm.
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.125.3253

César

unread,
Jan 29, 2014, 4:54:25 AM1/29/14
to accor...@googlegroups.com
Hi Keith!

Hmmm from the linked paper, it seems all it would be necessary would to replace the Euclidean distance with a kernel distance instead. I didn't try it yet, but perhaps it should work. One possible way to do it, for example, would be:

    Gaussian kernel = new Gaussian(0.6);


   
Func<double[], double[], double> distance = (x, y) => kernel.Distance(x, y);


   
double[][] inputs =
   
{
       
new double[] { -5, -2, -1 },
       
new double[] { -5, -5, -6 },


       
new double[] {  2,  1,  1 },
       
new double[] {  1,  1,  2 },
       
new double[] {  1,  2,  2 },
       
new double[] {  3,  1,  2 },


       
new double[] { 11,  5,  4 },
       
new double[] { 15,  5,  6 },
       
new double[] { 10,  5,  6 },
   
};


   
int[] outputs =
   
{
       
0, 0,
       
1, 1, 1, 1,
       
2, 2, 2
   
};


   
int k = 4;
   
int classes = 3;


   
KNearestNeighbors target = new KNearestNeighbors(k, classes, inputs, outputs, distance);


Hope it helps!

Best regards,
Cesar

krne...@gmail.com

unread,
Jan 30, 2014, 5:09:18 AM1/30/14
to accor...@googlegroups.com
Hi César thanks for the quick reply.

Passing kernel distance method in as a delegate was the first thing I tried (using KDTree directly) before posting the question :-). I found the Gaussian distance function almost always returns NaNs due to calling Math.Log on a negative with most of the data I am passing it:

public double Distance(double[] x, double[] y)
...
// TODO: Verify the use of log1p instead
return (1.0 / -gamma) * Math.Log(1.0 - 0.5 * norm);

Which led me to believe I need to do something more involved with the framework than using the KDTree with a kernel distance function and filling it with my data directly?

Also I noted that the KDTree will still return a list of nodes despite the distance function returning a lot of NaNs, which may catch the unwary out. Would it be better for KDTree to throw an exception if distance functions return NaNs/Infinity?

César

unread,
Feb 6, 2014, 9:46:53 AM2/6/14
to accor...@googlegroups.com
Hi Keith,

The issue might actually be in the definition of the Gaussian kernel distance measure. Please, do you think you could try again, but this time specifying the following formula as the distance measure?

    Gaussian kernel = new Gaussian(0.6);



   
Func<double[], double[], double> distance = (x, y) => kernel.Function(x, x) + kernel.Function(y, y) - 2 * kernel.Function(x, y);



   
double[][] inputs =
   
{
       
new double[] { -5, -2, -1 },
       
new double[] { -5, -5, -6 },


       
new double[] {  2,  1,  1 },
       
new double[] {  1,  1,  2 },
       
new double[] {  1,  2,  2 },
       
new double[] {  3,  1,  2 },


       
new double[] { 11,  5,  4 },
       
new double[] { 15,  5,  6 },
       
new double[] { 10,  5,  6 },
   
};


   
int[] outputs =
   
{
       
0, 0,
       
1, 1, 1, 1,
       
2, 2, 2
   
};


   
int k = 4;
   
int classes = 3;


   
KNearestNeighbors target = new KNearestNeighbors(k, classes, inputs, outputs, distance);


Best regards,
Cesar

krne...@gmail.com

unread,
Feb 7, 2014, 1:00:08 PM2/7/14
to accor...@googlegroups.com

Great that worked Cesar, thanks very much!

Does this mean that the existing Gaussian.Distance function is incorrect and should be replaced with the equivalent of that formula (1 + 1 - 2 * Gaussian.Function(x, y))? Or is the existing Distance function useful in some specific circumstances where the "1.0 - 0.5 * norm" is not negative?

Thanks again,
Keith

César

unread,
Feb 7, 2014, 4:36:27 PM2/7/14
to accor...@googlegroups.com
Hi Keith!

I am pretty sure it was correct under a particular context I was working on a long time ago, but perhaps the implemented version is not the general notion of kernel distance that would make more sense in most general applications. I am still investigating, but most likely I will make this new version the default one in the next releases.

Best regards,
Cesar

Mauricio Cirelli

unread,
Feb 7, 2014, 4:41:18 PM2/7/14
to accor...@googlegroups.com
Hi Cesar,

I would like to know if other kernel functions should work to calculate distance between vectors in a Kernel KNN Algorithm. I am experimenting some different functions in a dataset and it would be very interesting to evaluate.

Thanks!

César

unread,
Feb 7, 2014, 6:04:35 PM2/7/14
to accor...@googlegroups.com
Hi Mauricio,

From what I am learning they should, as long as you define the distance as:

Func<double[], double[], double> distance =

 
(x, y) => Math.Sqrt(kernel.Function(x, x) + kernel.Function(y, y) - 2 * kernel.Function(x, y));

As it is given on equation 9 of [1]. The framework distances were set differently to work on another context, but I am planning to change this behavior in the next release. Hope it helps!

Regards,
Cesar

[1] Schölkopf, B. (2000), The Kernel Trick for Distances., in Todd K. Leen; Thomas G. Dietterich & Volker Tresp, ed., 'NIPS' , MIT Press, , pp. 301-307 . http://www.kyb.mpg.de/fileadmin/user_upload/files/publications/attachments/scholkopf00kernel_3781%5b0%5d.pdf

Mauricio Cirelli

unread,
Feb 10, 2014, 8:24:26 AM2/10/14
to accor...@googlegroups.com
It definetely helps, Cesar! Thank you!

krne...@gmail.com

unread,
Feb 11, 2014, 8:27:08 AM2/11/14
to accor...@googlegroups.com
Defiantly a big help César, thanks very much for looking into this. I look forward to testing the new update when it is ready.
Reply all
Reply to author
Forward
0 new messages