About multiplying a matrix with a vector: Why is cblas_?gemm faster than cblas_?gemv?

mintaka

Feb 14, 2016, 2:04:08 PM
to OpenBLAS-users
I'm trying to multiply a matrix (`N x p`) by a vector (`p x 1`), where `N = 1000000` and `p = 600`. I found that `cblas_?gemm` is about 2 to 3 times faster than `cblas_?gemv`. Shouldn't `cblas_?gemv` always be faster than `cblas_?gemm` here?

Thanks in advance.

Below is my code:

#include <opencv2/core/core.hpp>
#include <iostream>
#include <cblas.h>
#include <cstdlib>
#include <ctime>

using namespace std;
using namespace cv;

int main(int argc, char* argv[]) {
    clock_t start_time;
    RNG &rng = theRNG();
    float alpha = 1.f, beta = 0.f;

    int m = 1000000, p = 600, n = 1;

    // A is m x p, B is p x n (a column vector), C is m x n.
    Mat_<float> A(m, p), B(p, n);
    Mat C = Mat::zeros(m, n, CV_32F);
    rng.fill(A, RNG::NORMAL, 1, 100);
    rng.fill(B, RNG::NORMAL, 2, 50);

    float *src1, *src2, *src3;
    src1 = (float *) A.data;
    src2 = (float *) B.data;
    src3 = (float *) C.data;

    // C = alpha * A * B + beta * C via GEMM (clock() measures process CPU time).
    start_time = clock();
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, m, n, p,
                alpha, src1, p, src2, n, beta, src3, n);
    float duration = (clock() - start_time) / (CLOCKS_PER_SEC / 1000.0);
    cout << "cblas_sgemm: " << duration << " ms" << endl;

    // The same product via GEMV, treating B as a vector with stride 1.
    start_time = clock();
    cblas_sgemv(CblasRowMajor, CblasNoTrans, m, p, alpha, src1, p, src2, 1, beta, src3, 1);
    duration = (clock() - start_time) / (CLOCKS_PER_SEC / 1000.0);
    cout << "cblas_sgemv: " << duration << " ms" << endl;

    return 0;
}
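
For reference, with `n = 1` the two calls above compute the same product, `y = alpha * A * x + beta * y`, so only their speed should differ. The following is a minimal sketch in plain C++, not OpenBLAS code: `ref_sgemv`, `ref_sgemm` and `gemm_matches_gemv` are hypothetical helpers written only to illustrate the row-major index arithmetic, using the same leading dimensions as the calls above (`lda = p`, `ldb = n`, `ldc = n`).

```cpp
#include <cassert>
#include <vector>

// Hypothetical reference loops (not OpenBLAS): row-major, no transposition.

// y = alpha * A * x + beta * y, where A is m x p with leading dimension lda.
void ref_sgemv(int m, int p, float alpha, const float* A, int lda,
               const float* x, float beta, float* y) {
    for (int i = 0; i < m; ++i) {
        float acc = 0.f;
        for (int k = 0; k < p; ++k) acc += A[i * lda + k] * x[k];
        y[i] = alpha * acc + beta * y[i];
    }
}

// C = alpha * A * B + beta * C, where A is m x p, B is p x n, C is m x n.
void ref_sgemm(int m, int n, int p, float alpha, const float* A, int lda,
               const float* B, int ldb, float beta, float* C, int ldc) {
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            float acc = 0.f;
            for (int k = 0; k < p; ++k) acc += A[i * lda + k] * B[k * ldb + j];
            C[i * ldc + j] = alpha * acc + beta * C[i * ldc + j];
        }
    }
}

// Returns true when both loops give identical results for a small m x p case.
bool gemm_matches_gemv(int m, int p) {
    assert(m > 0 && p > 0);
    const int n = 1;  // a single column, so B is just the vector x
    std::vector<float> A(m * p), x(p), y_gemv(m, 0.f), y_gemm(m, 0.f);
    for (int i = 0; i < m * p; ++i) A[i] = 0.5f * (i % 7) - 1.0f;
    for (int k = 0; k < p; ++k) x[k] = 0.25f * (k % 5) + 1.0f;
    ref_sgemv(m, p, 1.f, A.data(), p, x.data(), 0.f, y_gemv.data());
    ref_sgemm(m, n, p, 1.f, A.data(), p, x.data(), n, 0.f, y_gemm.data(), n);
    for (int i = 0; i < m; ++i) {
        if (y_gemv[i] != y_gemm[i]) return false;
    }
    return true;
}
```

With `n = 1` the inner `j` loop runs once and `B[k * ldb + j]` reduces to `x[k]`, so both loops accumulate in the same order and produce the same values.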

Zhang Xianyi

Feb 14, 2016, 9:43:58 PM
to mintaka, OpenBLAS-users
What's your CPU and OS?

Xianyi

mintaka

Feb 14, 2016, 10:48:40 PM
to OpenBLAS-users, lxd...@gmail.com
Forgot to mention that :p

Ubuntu 14.04 64-bit with Intel® Core™ i7-4770 CPU @ 3.40GHz × 8
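
A side note on measuring this on a CPU with 8 hardware threads: on Linux, `clock()` reports processor time consumed by the whole process, so if the BLAS build is multithreaded the number can exceed the elapsed real time. A minimal sketch, assuming C++11 is available, that records both figures for one call; `Timing` and `time_once` are hypothetical names introduced here for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>

struct Timing {
    double wall_ms;  // elapsed real time
    double cpu_ms;   // processor time used by the process, as reported by clock()
};

// Runs `work` once and reports both wall-clock and CPU time in milliseconds.
template <typename F>
Timing time_once(F&& work) {
    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();
    work();
    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    Timing t;
    t.wall_ms =
        std::chrono::duration<double, std::milli>(wall_end - wall_start).count();
    t.cpu_ms =
        1000.0 * static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    return t;
}
```

It could wrap either call from the program above, for example `time_once([&] { cblas_sgemv(CblasRowMajor, CblasNoTrans, m, p, alpha, src1, p, src2, 1, beta, src3, 1); })`, and comparing `wall_ms` for the two routines gives the elapsed-time view of the same measurement.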

Zhang Xianyi

Feb 15, 2016, 4:21:37 PM
to mintaka, OpenBLAS-users
I can reproduce this on my Haswell machine. It looks like a performance bug in sgemv.

mintaka

Feb 15, 2016, 4:49:03 PM
to OpenBLAS-users, lxd...@gmail.com
Hi Xianyi, thanks for confirming that. Any plan to fix it in the near future?