Concurrent DGEMMS


aran nokan

Jun 7, 2021, 6:12:30 PM
to MAGMA User

I am trying to run two DGEMM kernels concurrently, using two queues. However, they do not overlap: the second DGEMM starts only after the first one has finished.

I think I have enough free resources for this operation.

Should I pass "--default-stream per-thread" somewhere, or define a macro? Or is there an example in MAGMA for this purpose?

Best regards,

Ahmad Abdelfattah

Jun 8, 2021, 4:19:00 AM
to aran nokan, MAGMA User
Can you please post a code example of what you are trying to achieve? It would also help if you mentioned the range of sizes for each DGEMM.

If you are calling the MAGMA wrapper for cuBLAS (magma_dgemm), then it is possible that you don’t have enough resources to launch the two DGEMMs concurrently. cuBLAS tries to fill up the GPU even if the sizes are relatively small. 

Another suggestion is to experiment with the sizes (e.g., make them really small) to see whether you get any overlap in the profiler trace.
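
To make the resource argument concrete, here is a rough host-side sketch of the arithmetic. It is not how cuBLAS actually selects kernels: the helper names, the 128 x 128 tile size, and the one-block-per-SM figure are all assumptions chosen for illustration. cuBLAS picks its kernels and tile shapes heuristically and may split the work further, so the real block count can be higher than this estimate.

```cpp
#include <cassert>

// Hypothetical helper (not part of MAGMA or cuBLAS): number of thread blocks
// a tiled GEMM kernel would launch for an m x n output matrix if each block
// computes one tile_m x tile_n tile of C.
long gemm_blocks(long m, long n, long tile_m, long tile_n) {
    assert(tile_m > 0 && tile_n > 0);
    long tiles_m = (m + tile_m - 1) / tile_m;  // ceil(m / tile_m)
    long tiles_n = (n + tile_n - 1) / tile_n;  // ceil(n / tile_n)
    return tiles_m * tiles_n;
}

// Hypothetical helper: true if one kernel's blocks do not fill every block
// slot on the GPU, so a kernel on a second stream could start alongside it.
// num_sms and blocks_per_sm are device- and kernel-dependent assumptions.
bool leaves_room(long blocks, long num_sms, long blocks_per_sm) {
    return blocks < num_sms * blocks_per_sm;
}
```

For example, with an assumed 80 SMs and one resident block per SM, `gemm_blocks(256, 256, 128, 128)` gives 4 blocks, which leaves most of the GPU idle, while `gemm_blocks(4096, 4096, 128, 128)` gives 1024 blocks, so the first DGEMM alone occupies every SM and the second one has to wait. Whether overlap really occurs should still be checked in the profiler.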

