I want to use efficient matrix multiplication in my project, and fortunately I found that blis_gemm meets my requirements well.
However, due to a hardware limit (libblis.a is too big to use directly), I want to port as little code as I can.
So my question is: is there an easy approach to finish the port quickly? Thanks in advance.
Thanks and Regards,
YiDing
--
You received this message because you are subscribed to the Google Groups "blis-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to blis-discuss+unsubscribe@googlegroups.com.
To post to this group, send email to blis-d...@googlegroups.com.
Visit this group at https://groups.google.com/group/blis-discuss.
For more options, visit https://groups.google.com/d/optout.
If the linker suggestion from Devin doesn't do what you want, you can tediously figure out all the dependencies of blis_gemm, paste that code into a single file, and see how small the compiler can make it. This is, of course, a completely unsustainable hack that you only want to do once a year or so.
The other question is how close to optimal performance you need, and for what problem sizes. If you are trying to multiply matrices of 100x100 to 500x500 on an embedded processor and would be happy with 75% of peak, then you could use BLIS and the associated academic material as a guide and write your own super-lean kernel. See https://github.com/flame/how-to-optimize-gemm/wiki.
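To make the "super-lean kernel" idea concrete, here is a minimal sketch in C, loosely in the spirit of the register-blocking steps in the how-to-optimize-gemm wiki. It is not BLIS code: the names lean_dgemm, micro_kernel_4x4, and ELEM, the 4x4 block size, and the column-major layout are all assumptions made for illustration, and it leaves out the packing, cache blocking, and vector intrinsics that a tuned kernel would likely need.

```c
#include <stddef.h>

/* Column-major access: element (i, j) of a matrix with leading dimension ld. */
#define ELEM(M, ld, i, j) ((M)[(size_t)(j) * (size_t)(ld) + (size_t)(i)])

/* Register-block sizes. 4x4 is an illustrative assumption, not a tuned value. */
enum { MR = 4, NR = 4 };

/* Micro-kernel: C[0:MR, 0:NR] += A[0:MR, 0:k] * B[0:k, 0:NR].
 * The accumulators live in a small local array so the compiler can keep
 * them in registers for the whole k loop. */
static void micro_kernel_4x4(int k,
                             const double *a, int lda,
                             const double *b, int ldb,
                             double *c, int ldc)
{
    double acc[MR][NR] = {{0.0}};

    for (int p = 0; p < k; ++p) {
        for (int j = 0; j < NR; ++j) {
            double bpj = ELEM(b, ldb, p, j);
            for (int i = 0; i < MR; ++i)
                acc[i][j] += ELEM(a, lda, i, p) * bpj;
        }
    }
    for (int j = 0; j < NR; ++j)
        for (int i = 0; i < MR; ++i)
            ELEM(c, ldc, i, j) += acc[i][j];
}

/* C (m x n) += A (m x k) * B (k x n), all column-major.
 * Hypothetical entry point; there is no alpha/beta scaling or transposition. */
void lean_dgemm(int m, int n, int k,
                const double *a, int lda,
                const double *b, int ldb,
                double *c, int ldc)
{
    int m_blk = m - m % MR;  /* rows covered by full MR x NR blocks */
    int n_blk = n - n % NR;  /* columns covered by full MR x NR blocks */

    /* Full blocks go through the micro-kernel. */
    for (int j = 0; j < n_blk; j += NR)
        for (int i = 0; i < m_blk; i += MR)
            micro_kernel_4x4(k, &ELEM(a, lda, i, 0), lda,
                             &ELEM(b, ldb, 0, j), ldb,
                             &ELEM(c, ldc, i, j), ldc);

    /* Leftover rows and columns fall back to plain scalar loops. */
    for (int j = 0; j < n; ++j) {
        for (int i = (j < n_blk ? m_blk : 0); i < m; ++i) {
            double sum = 0.0;
            for (int p = 0; p < k; ++p)
                sum += ELEM(a, lda, i, p) * ELEM(b, ldb, p, j);
            ELEM(c, ldc, i, j) += sum;
        }
    }
}
```

A call such as lean_dgemm(m, n, k, A, m, B, k, C, m) accumulates A*B into C. Whether something this simple gets anywhere near 75% of peak depends on the processor and compiler, so it is best treated as a starting point to measure and refine.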
Jeff
On Thu, Jan 12, 2017 at 9:51 AM, Devin Matthews <devinam...@gmail.com> wrote:
If you are linking BLIS statically, then I think the linker should delete all of the symbols that aren't actually used (at least with the right flags). If it is still too big after that, then I think you are out of luck since BLIS has a significantly smaller footprint than e.g. OpenBLAS or MKL. What hardware is this?
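As an illustration of what "the right flags" could mean, the sketch below assumes a GCC or Clang toolchain with GNU ld. The file name mini_blas.c, the archive libmini.a, and both functions are hypothetical stand-ins for a library that exports more routines than the application calls; the build commands are given in the leading comment.

```c
/* mini_blas.c: hypothetical stand-in for a library with unused routines.
 *
 * Build sketch, assuming GCC or Clang with GNU ld:
 *   cc -Os -ffunction-sections -fdata-sections -c mini_blas.c
 *   ar rcs libmini.a mini_blas.o
 *   cc -Os app.c libmini.a -Wl,--gc-sections -o app
 * (Apple's linker uses -Wl,-dead_strip instead of --gc-sections.)
 */
#include <stddef.h>

/* Referenced by the application, so it stays in the final binary. */
double dot_used(size_t n, const double *x, const double *y)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}

/* Never referenced by the application. With -ffunction-sections it gets
 * its own section, which --gc-sections is then free to discard. */
void scale_unused(size_t n, double alpha, double *x)
{
    for (size_t i = 0; i < n; ++i)
        x[i] *= alpha;
}
```

Comparing the output of `size app` with and without the flags shows how much was dropped. Note that a static link already pulls in only the archive members that resolve a referenced symbol; the section flags refine that to per-function granularity, and for libblis.a they would only help if the library's own objects were compiled with them.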
Devin Matthews
On 1/12/17 4:12 AM, hfbl...@gmail.com wrote:
Hi, BLIS masters:
I want to use efficient matrix multiplication in my project, and fortunately I found that blis_gemm meets my requirements well.
However, due to a hardware limit (libblis.a is too big to use directly), I want to port as little code as I can.
So my question is: is there an easy approach to finish the port quickly? Thanks in advance.
Thanks and Regards,
YiDing