Correct Way To Copy Between Processes?

Kevin T.

Dec 16, 2022, 3:52:38 PM
to UPC++
I'm attempting to grok upc++, and am having trouble getting an rget to work without segfaulting.

I've made a "dist_object<StorageBuffer>", where "StorageBuffer" contains a vector of complex doubles.  I then want to copy directly from one rank's vector to another.

My approach is an RPC to the "from" rank 0 that returns a global pointer and a size, where rank 0 constructs the global pointer as "to_global_ptr(local_ptr_0)" and "local_ptr_0" is a pointer local to rank 0.  The "to" rank 1 then attempts an "rget(global_ptr_0, local_ptr_1, num_elements)", where "global_ptr_0" and "num_elements" are the result of the RPC, and "local_ptr_1" is a regular pointer local to rank 1.

The problem is that this immediately throws a gasnet segfault, both on a test platform where all ranks are on the same node, and on a cluster where they're not.

Is the procedure I've described above something that should work?  I'm not sure if I've misunderstood upc++, misconfigured upc++/gasnet/*, or am otherwise being dumb.

Dan Bonachea

Dec 16, 2022, 6:09:21 PM
to Kevin T., UPC++
Hi Kevin -

Without seeing your actual complete code, I'm stuck guessing what the problem might be. Do you have a minimal reproducible example you can share?

In lieu of that, I can point you to this list of general recommendations for debugging UPC++ programs, where the first and most important guideline is to compile in debug codemode during development. This will hopefully give you a useful assertion message explaining what has gone wrong before you reach the segfault...

-D



Kevin T.

Dec 16, 2022, 7:35:11 PM
to UPC++
Hi Dan,

I think I figured out where I was going wrong.  I was assuming that a distributed object (in this case composed of toy "StorageBuffer"s that each contained a vector) would be fully encapsulated in shared memory, but that assumption was incorrect.

I added a "new_array" call external to the "StorageBuffer" distributed object, replaced the vector inside the "StorageBuffer" with a pointer to the "new_array" allocation, and things are working as expected now.

So I think I was just confused about what is and is not shareable between processes.  It was initially a little surprising to me that a distributed object could have "not shareable" parts, but after some thought I think that makes sense - I'm not sure that upc++ would have the ability to intercept all of the allocates or other memory shuffling those parts might be doing.

-Kevin

Kevin T.

Dec 16, 2022, 8:17:12 PM
to UPC++
For the record/completeness, this is a MWE representative of the terrible things I was doing initially:

#include <vector>
#include <complex>
#include <upcxx/upcxx.hpp>


class StorageBuffer
{
    public:

    std::vector<std::complex<double>> buffer = std::vector<std::complex<double>>(5);

    StorageBuffer(){};

    //-------------------------------------------------------------------------
    void readBuffer(std::complex<double>* &firstElement,
                    std::size_t &numElements) {

        firstElement = &(buffer[0]);
        numElements = buffer.size();
    }
};


class DistributedStorageBuffer
{
    public:

    using upcxxDistBuffer = upcxx::dist_object<StorageBuffer>;
    using readResultType = std::pair<upcxx::global_ptr<std::complex<double>>, std::size_t>;

    upcxxDistBuffer distBuffer;

    DistributedStorageBuffer() : distBuffer(StorageBuffer()) {};

    //-------------------------------------------------------------------------
    upcxx::future<readResultType> readBuffer(const upcxx::intrank_t fromRank) {

        return upcxx::rpc(fromRank,
                          [] (upcxxDistBuffer &localBuffer)
                          {
                            std::complex<double>* firstElement;
                            std::size_t numElements;

                            localBuffer->readBuffer(firstElement, numElements);

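                            // BUG: firstElement points into a std::vector in private memory,
                            // not into the shared segment, so to_global_ptr() is invalid here.
                            return readResultType(upcxx::to_global_ptr(firstElement), numElements);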

                          }, distBuffer);
    };

    //-------------------------------------------------------------------------  
    upcxx::future<> copyBuffer(upcxx::global_ptr<std::complex<double>> &remoteBuffer,
                               const std::size_t &numElements) {

        bool wantSegfault = true;

        if (wantSegfault) {
            return upcxx::rget(remoteBuffer, &(distBuffer->buffer[0]), numElements);
        } else {
            return upcxx::make_future();
        }
    }
};


int main(int argc, char *argv[])
{
    upcxx::init();

    using readResultType = std::pair<upcxx::global_ptr<std::complex<double>>, std::size_t>;

    DistributedStorageBuffer buffer;

    if (upcxx::rank_me() == 1) {
        readResultType readResult = buffer.readBuffer(upcxx::intrank_t(0)).wait();
       
        buffer.copyBuffer(readResult.first, readResult.second).wait();
    }

    upcxx::barrier();
    upcxx::finalize();

    return 0;
}

Dan Bonachea

Dec 16, 2022, 8:26:05 PM
to Kevin T., UPC++
Hi Kevin - 

Your explanation makes sense, and I apologize for the confusion. 

The upcxx::dist_object<T> instances comprising a distributed object need not involve any storage in the shared memory segment. In fact, in many common use cases for dist_object<T> these objects live on the program stack, meaning both the dist_object and the embedded T object reside entirely in private memory (and attempting to call upcxx::to_global_ptr() on the address of such an object should yield an assertion failure in debug codemode). Another common use case is dist_object<global_ptr<T>>, where the dist_object and global pointer themselves usually live in private memory, but the contained global pointer references an object in the shared heap. It might be helpful to think of distributed objects as a collective abstraction that is primarily useful for RPC and dist_object::fetch(), and is largely orthogonal to object storage in the shared segment, which is mainly used for RMA operations (rput/rget/copy) or local_team bypass.

FWIW we also provide a upcxx::try_global_ptr(lp) function which operates analogously to upcxx::to_global_ptr(lp), except that it guarantees that passing an lp referencing private memory will yield a null global pointer, even in opt codemode, rather than undefined behavior. So that variant might be a good choice anywhere you're not 100% certain the input points into the shared segment.

Cheers,
-D
