Hi Mike,
I agree the answer should be NO. If this becomes a big problem for users, we should fix it inside the library rather than requiring users to hack around it. Are you interested in this because you are really tight on memory, or are you just looking to tie up loose ends? If the latter, I would move on to bigger problems. :)
If you are really short on memory, you could technically do the allocation of internal data yourself. For example, you could create a scan plan normally using cudppPlan(), but then create the compact plan yourself by accessing the internal CUDPPCompactPlan structure and assigning its m_scanPlan pointer the pointer from the CUDPPScanPlan created with cudppPlan(). But I would NOT recommend this, because if you make subtle mistakes, they could bite you later. You should also be REALLY careful about this if you are using multiple GPUs and CUDA contexts. CUDPP 1.1 is not safe for use in this situation. The next release will be, but even then you will not be able to share plans across CUDA contexts.
One thing we're leaving open as a possibility for the future is to give CUDPP its own internal heap manager. The user could specify a maximum heap size, which would give them some control over the internal memory used by the library, and CUDPP would then reuse internal temporary allocations across plans when possible.
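To make the heap manager idea a bit more concrete, here is a rough sketch of what a capped pool of temporary buffers might look like. None of this is CUDPP code: ScratchPool, acquire() and release() are made-up names for illustration, and std::malloc/std::free stand in for cudaMalloc/cudaFree so the sketch runs on the host.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical sketch, NOT part of CUDPP: a pool of temporary buffers with a
// user-specified cap on the total bytes it may allocate. std::malloc and
// std::free are stand-ins for cudaMalloc and cudaFree.
class ScratchPool {
public:
    explicit ScratchPool(std::size_t maxBytes)
        : m_maxBytes(maxBytes), m_totalBytes(0) {}

    ~ScratchPool() {
        for (std::size_t i = 0; i < m_blocks.size(); ++i)
            std::free(m_blocks[i].ptr);
    }

    // The pool owns raw allocations, so copying it would lead to double frees.
    ScratchPool(const ScratchPool&) = delete;
    ScratchPool& operator=(const ScratchPool&) = delete;

    // Returns a buffer of at least `bytes` bytes. Reuses the smallest idle
    // block that fits; otherwise allocates a new block if the cap allows.
    // Returns nullptr if the request cannot be satisfied within the cap.
    void* acquire(std::size_t bytes) {
        const std::size_t none = m_blocks.size();
        std::size_t best = none;
        for (std::size_t i = 0; i < m_blocks.size(); ++i) {
            const Block& b = m_blocks[i];
            if (!b.inUse && b.size >= bytes &&
                (best == none || b.size < m_blocks[best].size))
                best = i;
        }
        if (best != none) {
            m_blocks[best].inUse = true;
            return m_blocks[best].ptr;
        }
        if (bytes > m_maxBytes - m_totalBytes)
            return nullptr;  // would exceed the user-specified cap
        void* p = std::malloc(bytes);
        if (p == nullptr)
            return nullptr;
        Block nb = { p, bytes, true };
        m_blocks.push_back(nb);
        m_totalBytes += bytes;
        return p;
    }

    // Marks a buffer as idle so a later acquire() can reuse it.
    // The memory itself is only freed when the pool is destroyed.
    void release(void* p) {
        for (std::size_t i = 0; i < m_blocks.size(); ++i)
            if (m_blocks[i].ptr == p)
                m_blocks[i].inUse = false;
    }

    std::size_t totalBytes() const { return m_totalBytes; }

private:
    struct Block { void* ptr; std::size_t size; bool inUse; };
    std::vector<Block> m_blocks;
    std::size_t m_maxBytes;
    std::size_t m_totalBytes;
};
```

In this sketch, a plan would call acquire() for its temporary storage when it runs and release() when it finishes, so two plans that never run at the same time end up sharing one buffer. A real version would also need to deal with CUDA contexts (presumably one pool per context) and with fragmentation, which this ignores.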
Mark