Hi Gary,
Glad you got it working to your satisfaction. RadixSort is definitely linked in whether you need it or not, along with all the other algorithms.
Unfortunately, once we add support for Fermi code, the code size will only grow, so we need to address these problems. We can hopefully investigate all of them, perhaps for the release after next. Another option is to split CUDPP into separate libraries, one per algorithm.
Note that there is another library, Thrust, which is a template library, so only the code you actually use gets compiled. However, Thrust is designed for development efficiency and generality, not runtime efficiency, so its algorithm implementations are often not as fast as CUDPP's.
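To illustrate what "only the code you use is compiled" means for a template library, here is a minimal sketch in plain C++. It is not Thrust or CUDPP code: the `minilib` namespace and its two functions are hypothetical stand-ins, and the sort simply wraps `std::sort`.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical header-only "library"; every name here is illustrative only.
namespace minilib {

// Instantiated for int below, so code for the int version is generated.
template <typename T>
void sort_ascending(std::vector<T>& v) {
    std::sort(v.begin(), v.end());
}

// Never called in this file, so it is never instantiated and the
// compiler emits no code for it.
template <typename T>
T sum_all(const std::vector<T>& v) {
    T total = T();
    for (const T& x : v) total += x;
    return total;
}

}  // namespace minilib

// Caller that uses only the sort; sum_all adds nothing to the binary.
std::vector<int> sorted_copy(std::vector<int> v) {
    minilib::sort_ascending(v);
    return v;
}
```

A precompiled library like CUDPP, by contrast, ships every algorithm already compiled, which is why RadixSort ends up in your binary whether you call it or not.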
Mark