Calculate vec-mat Cosine distance


Alon Agmon

Jun 17, 2022, 12:27:34 PM
to gonum-dev
Hello, 
I'm trying to find a more efficient way to calculate the cosine distance between a vector and each of the vectors (rows) in a matrix.
My current implementation simply loops over the matrix rows and computes the distance to the vector one row at a time, as shown below. I'm wondering whether this can be done more efficiently by vectorizing the calculation (I did something similar in Python, but I'm not sure how to go about it here).

Thanks

// ItemsMatrix is a [][]float64
// matrixVec is a []float64 (one row of ItemsMatrix)
// itemVec is a []float64
for i, matrixVec := range ItemsMatrix {
    vectorX := mat.NewDense(vecSize, 1, matrixVec)
    vectorY := mat.NewDense(vecSize, 1, itemVec)
    result := Distance(vectorX, vectorY)
    cosineScores[i] = result
}

// from GoLearn
// Distance returns the cosine distance 1 - cos(x, y) between two column vectors.
func Distance(vectorX *mat.Dense, vectorY *mat.Dense) float64 {
    dotXY := Dot(vectorX, vectorY)
    lengthX := math.Sqrt(Dot(vectorX, vectorX))
    lengthY := math.Sqrt(Dot(vectorY, vectorY))

    cos := dotXY / (lengthX * lengthY)
    return 1 - cos
}

// Dot returns the dot product as the sum of the element-wise product.
func Dot(vectorX *mat.Dense, vectorY *mat.Dense) float64 {
    subVector := new(mat.Dense)
    subVector.MulElem(vectorX, vectorY)
    result := mat.Sum(subVector)
    return result
}

Vladimír Chalupecký

Jul 13, 2022, 1:00:35 PM
to gonum-dev
Hi,

If representing the input data as slices is enough for you, then you don't need the 'mat' package for this small calculation and you can use 'floats' instead:

package main

import (
    "fmt"

    "gonum.org/v1/gonum/floats"
)

func main() {
    // Vectors are stored in the rows of the "matrix" A.
    a := [][]float64{
        {2, 2, -1},
        {-1, 3, 1},
        {0, 1, 3},
    }
    x := []float64{3, 1, 3}
    normx := floats.Norm(x, 2)
    for _, ai := range a {
        normai := floats.Norm(ai, 2)
        cosDist := 1 - floats.Dot(ai, x)/normai/normx
        fmt.Println(cosDist)
    }
}


You could "vectorize" the code a bit more by using the matrix-vector product A*x instead of 'floats.Dot', but for that you'd have to change how A is stored and switch to using the 'mat' package throughout. At this point, without further requirements or information, I don't think it's worth it.