Problems using library

Hugo Delgado

Nov 3, 2012, 9:15:25 PM
to cl...@googlegroups.com
Hi, I'm testing your library on two different cards, a Quadro FX 3800 and a GTX 680. Running the benchmark tests for scan, sort, and sort_kv gives me the following results on each card:

OpenCL Platform : NVIDIA CUDA
OpenCL Device   : Quadro FX 3800


--------------- Scan : GPU scan
Performance for data-set size[1024] time (ms): 0.239
Performance for data-set size[4096] time (ms): 0.238
Performance for data-set size[8192] time (ms): 0.285
Performance for data-set size[16384] time (ms): 0.378
Performance for data-set size[32768] time (ms): 0.56
Performance for data-set size[65536] time (ms): 0.928
Performance for data-set size[131072] time (ms): 1.665
Performance for data-set size[262144] time (ms): 3.138
Performance for data-set size[524288] time (ms): 6.219
Performance for data-set size[1048576] time (ms): 11.957
--------------- GPU : Key : Satish radix sort
Performance for data-set size[1024] time (ms): 0.7802 KPS[1312483]
Performance for data-set size[4096] time (ms): 0.8133 KPS[5036272]
Performance for data-set size[8192] time (ms): 0.9089 KPS[9013093]
Performance for data-set size[16384] time (ms): 1.14915 KPS[14257496]
Performance for data-set size[32768] time (ms): 1.83015 KPS[17904544]
Performance for data-set size[65536] time (ms): 2.9178 KPS[22460758]
Performance for data-set size[131072] time (ms): 5.2972 KPS[24743638]
Performance for data-set size[262144] time (ms): 10.2883 KPS[25479938]
Performance for data-set size[524288] time (ms): 22.5626 KPS[23236988]
Performance for data-set size[1048576] time (ms): 44.8432 KPS[23383168]
--------------- GPU : Key : Bitonic sort
Performance for data-set size[1024] time (ms): 0.3498 KPS[2927386]
Performance for data-set size[4096] time (ms): 0.47515 KPS[8620436]
Performance for data-set size[8192] time (ms): 0.57405 KPS[14270535]
Performance for data-set size[16384] time (ms): 0.73525 KPS[22283576]
Performance for data-set size[32768] time (ms): 1.0371 KPS[31595794]
Performance for data-set size[65536] time (ms): 1.74885 KPS[37473768]
Performance for data-set size[131072] time (ms): 3.16675 KPS[41390068]
Performance for data-set size[262144] time (ms): 6.1113 KPS[42894964]
Performance for data-set size[524288] time (ms): 12.356 KPS[42431856]
Performance for data-set size[1048576] time (ms): 25.4608 KPS[41184016]
--------------- GPU : Key-Value : Bitonic sort
Performance for data-set size[1024] time (ms): 0.40805 KPS[2509496]
Performance for data-set size[4096] time (ms): 0.56515 KPS[7247632]
Performance for data-set size[8192] time (ms): 0.70525 KPS[11615740]
Performance for data-set size[16384] time (ms): 0.9635 KPS[17004668]
Performance for data-set size[32768] time (ms): 1.6511 KPS[19846162]
Performance for data-set size[65536] time (ms): 2.84725 KPS[23017298]
Performance for data-set size[131072] time (ms): 5.5524 KPS[23606370]
Performance for data-set size[262144] time (ms): 11.0944 KPS[23628496]
Performance for data-set size[524288] time (ms): 23.0779 KPS[22718232]
Performance for data-set size[1048576] time (ms): 48.459 KPS[21638438]

******
OpenCL Platform : NVIDIA CUDA
OpenCL Device   : GeForce GTX 680

--------------- Scan : GPU scan
Performance for data-set size[1024] time (ms): 0.551
Algorithm FAILED : Scan
.. all scans failed until the data-set size reached 1048576
Performance for data-set size[1048576] time (ms): 3.52
--------------- GPU : Key : Satish radix sort
Performance for data-set size[1024] time (ms): 0.52085 KPS[1966017]
Performance for data-set size[4096] time (ms): 0.87865 KPS[4661697]
Algorithm FAILED : LoopId[0] Radix sort
.. only worked on data sets of size 1024, 4096 and 1048576
Performance for data-set size[1048576] time (ms): 9.5992 KPS[109235776]
--------------- GPU : Key : Bitonic sort
Performance for data-set size[1024] time (ms): 0.26455 KPS[3870724]
Performance for data-set size[4096] time (ms): 0.66065 KPS[6199955]
Performance for data-set size[8192] time (ms): 0.56665 KPS[14456895]
Performance for data-set size[16384] time (ms): 0.4216 KPS[38861480]
Performance for data-set size[32768] time (ms): 0.48585 KPS[67444688]
Performance for data-set size[65536] time (ms): 0.62995 KPS[104033648]
Performance for data-set size[131072] time (ms): 0.9029 KPS[145167808]
Performance for data-set size[262144] time (ms): 1.53655 KPS[170605584]
Performance for data-set size[524288] time (ms): 2.72285 KPS[192551168]
Performance for data-set size[1048576] time (ms): 5.17375 KPS[202672320]
--------------- GPU : Key-Value : Bitonic sort
Performance for data-set size[1024] time (ms): 0.25245 KPS[4056248]
Performance for data-set size[4096] time (ms): 0.32875 KPS[12459315]
Performance for data-set size[8192] time (ms): 0.37625 KPS[21772760]
Performance for data-set size[16384] time (ms): 0.42595 KPS[38464608]
Performance for data-set size[32768] time (ms): 0.55895 KPS[58624212]
Performance for data-set size[65536] time (ms): 0.8074 KPS[81169192]
Performance for data-set size[131072] time (ms): 1.3996 KPS[93649624]
Performance for data-set size[262144] time (ms): 2.49 KPS[105278712]
Performance for data-set size[524288] time (ms): 4.7239 KPS[110986264]
Performance for data-set size[1048576] time (ms): 9.43535 KPS[111132712]

On the GTX 680, the scan and the radix sort failed in some cases and only worked for the data-set sizes noted above. Everything worked fine on the Quadro FX 3800. The count algorithm failed on both cards. Can you give me some insight into why this is happening?

Thanks

kr...@polarlights.net

Nov 4, 2012, 8:57:25 AM
to cl...@googlegroups.com
Some algorithms require a specific data-set size.
You can change these parameters in the test cases.

BTW, if you need to handle a data set of any size, simply create a larger data set padded with fake values (0 or INFINITE, for example).
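
A minimal sketch of that padding idea, in plain Python rather than the library's own API. The helper name `pad_to_multiple`, the block size of 8, and the use of the maximum 32-bit value as the "INFINITE" sentinel are all assumptions for illustration. The size each algorithm actually requires has to be taken from the test cases. Python's built-in `sorted` stands in for the GPU sort here.

```python
UINT32_MAX = 0xFFFFFFFF  # assumed "INFINITE" sentinel for 32-bit unsigned keys


def pad_to_multiple(values, multiple, fill):
    """Return a copy of `values` padded with `fill` up to the next multiple of `multiple`."""
    remainder = len(values) % multiple
    if remainder == 0:
        return list(values)
    return list(values) + [fill] * (multiple - remainder)


# Sorting: pad with the largest key so the fake entries end up at the tail.
keys = [42, 7, 19, 3, 88]
padded_keys = pad_to_multiple(keys, 8, UINT32_MAX)  # hypothetical required multiple of 8
sorted_padded = sorted(padded_keys)                 # stand-in for the GPU sort
sorted_keys = sorted_padded[:len(keys)]             # drop the fake entries again

# Scan (prefix sum): pad with 0 so the fake entries do not change the real sums.
scan_input = pad_to_multiple([1, 2, 3], 4, 0)
```

After the sort, only the first `len(keys)` entries are real data. For a key-value sort, a real key equal to the sentinel could not be told apart from padding, so the sentinel should be a value that does not occur in the input.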

Regards
