Carlos T
Nov 19, 2009, 10:09:58 AM
to CUDPP
Hi all,
Running the following barebones sorting code
#include <stdio.h>
#include <cudpp/cudpp.h>
#include <cuda_runtime.h>
#include <cutil_inline.h>

typedef unsigned int uint;
#define N 12

uint keys[N]   = {111, 37, 430, 433, 431, 357, 6190, 6193, 6191, 6117, 6837, 6911};
uint values[N] = {37, 111, 433, 430, 357, 431, 6193, 6190, 6117, 6191, 6911, 6837};

int main() {
    cudaSetDevice(0);

    // Device buffers for the keys and their associated values
    uint* keys_dev = 0;
    uint* vals_dev = 0;
    cutilSafeCall(cudaMalloc((void**)&keys_dev, sizeof(uint) * N));
    cutilSafeCall(cudaMalloc((void**)&vals_dev, sizeof(uint) * N));

    // Radix sort on unsigned int key-value pairs
    CUDPPConfiguration sortConfig;
    sortConfig.algorithm = CUDPP_SORT_RADIX;
    sortConfig.datatype = CUDPP_UINT;
    sortConfig.op = CUDPP_ADD;
    sortConfig.options = CUDPP_OPTION_KEY_VALUE_PAIRS;

    CUDPPHandle sortPlan;
    cudppPlan(&sortPlan, sortConfig, 100 /* num elements */, 1 /* num rows */, 100 /* pitch */);

    printf("Before\n");
    for (uint i = 0; i < N; i++) {
        printf("(%u,\t%u)\n", keys[i], values[i]);
    }

    cutilSafeCall(cudaMemcpy(keys_dev, keys, sizeof(uint) * N, cudaMemcpyHostToDevice));
    cutilSafeCall(cudaMemcpy(vals_dev, values, sizeof(uint) * N, cudaMemcpyHostToDevice));

    // Sort N pairs in place, using all 32 bits of each key
    cudppSort(sortPlan, keys_dev, vals_dev, 32, N);

    cutilSafeCall(cudaMemcpy(keys, keys_dev, sizeof(uint) * N, cudaMemcpyDeviceToHost));
    cutilSafeCall(cudaMemcpy(values, vals_dev, sizeof(uint) * N, cudaMemcpyDeviceToHost));

    printf("After\n");
    for (uint i = 0; i < N; i++) {
        printf("(%u,\t%u)\n", keys[i], values[i]);
    }
}
produces incorrect output in emulation mode, with some pairs duplicated and others missing:
Before
(111, 37)
(37, 111)
(430, 433)
(433, 430)
(431, 357)
(357, 431)
(6190, 6193)
(6193, 6190)
(6191, 6117)
(6117, 6191)
(6837, 6911)
(6911, 6837)
After
(37, 111)
(111, 37)
(357, 431)
(357, 431)
(357, 431)
(430, 433)
(6117, 6191)
(6117, 6191)
(6117, 6191)
(6190, 6193)
(6837, 6911)
(6911, 6837)
whereas the output is correct when run on the device.
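For what it's worth, a host-side check along these lines would flag this kind of corruption automatically instead of relying on eyeballing the printout. It is only a sketch and not part of the repro: the name isValidKeyValueSort is made up for illustration, it uses nothing from CUDPP, and it assumes the caller keeps a copy of the original keys and values before sorting.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

typedef unsigned int uint;
typedef std::pair<uint, uint> KeyValue;

// Hypothetical helper (not a CUDPP function): returns true only if the
// output pairs are ordered by key AND are a permutation of the input pairs.
bool isValidKeyValueSort(const uint* keysIn, const uint* valsIn,
                         const uint* keysOut, const uint* valsOut,
                         std::size_t n)
{
    assert(keysIn && valsIn && keysOut && valsOut);

    std::vector<KeyValue> expected;
    std::vector<KeyValue> actual;
    for (std::size_t i = 0; i < n; ++i) {
        expected.push_back(KeyValue(keysIn[i], valsIn[i]));
        actual.push_back(KeyValue(keysOut[i], valsOut[i]));
    }

    // 1. Keys must be in non-decreasing order in the output.
    for (std::size_t i = 1; i < n; ++i) {
        if (actual[i - 1].first > actual[i].first) {
            return false;
        }
    }

    // 2. No pair may be lost or duplicated: after sorting both lists
    //    the same way, they must match element for element.
    std::sort(expected.begin(), expected.end());
    std::sort(actual.begin(), actual.end());
    return expected == actual;
}
```

Called with the original arrays and the arrays copied back after cudppSort, this should return false for the emulation-mode output listed above (the keys are in order, but pairs such as (431, 357) are gone and (357, 431) appears three times) and true for a correct sort.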
I'm using the CUDA 3.0 beta1 toolkit (with the bundled CUDPP) on both MacOS
and Ubuntu 9.04 (32-bit). Any thoughts?
Thanks