UPC++ is a C++ library, so your favorite C++ profiling tool might just work.
That being said, many profiling tools assume a single process, so they may not work as well for multi-process UPC++ runs.
As an example, you can use gprof (which is mostly a single-process tool, but is more or less universally available), although you'll need some minor tweaks to get usable output because of the way GASNet tears down the job at exit time. See the `#if USE_GPROF` blocks in bench/put_flood.cpp.
Another tool worth mentioning is HPCToolkit, although I'm unsure of their current level of support for UPC++/GASNet jobs. I highly recommend asking them.
I don't have any other concrete tool suggestions at the moment. As library developers we generally optimize bottom-up instead of top-down, so we usually use hand-rolled timers instead of all-encompassing profiling tools.
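To give a rough idea of what I mean by a hand-rolled timer, here is a minimal sketch in plain standard C++. The names `RegionTimer`, `busy_work` and `time_region` are made up for illustration and are not part of UPC++ or GASNet; the workload is just a placeholder for whatever region you actually care about.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <cstdio>

// Hypothetical helper (not a UPC++ API): accumulates wall-clock time over
// repeated start()/stop() pairs bracketing one region of interest.
class RegionTimer {
  using clock = std::chrono::steady_clock;  // monotonic, safe for intervals
  clock::time_point begin_{};
  std::chrono::nanoseconds total_{0};
  std::uint64_t calls_ = 0;
  bool running_ = false;

public:
  void start() {
    assert(!running_);      // catch nested or unbalanced start() calls
    running_ = true;
    begin_ = clock::now();  // read the clock last to keep overhead outside
  }

  void stop() {
    const auto end = clock::now();  // read the clock first, same reason
    assert(running_);
    total_ += std::chrono::duration_cast<std::chrono::nanoseconds>(end - begin_);
    ++calls_;
    running_ = false;
  }

  double seconds() const {
    return std::chrono::duration<double>(total_).count();
  }
  std::uint64_t calls() const { return calls_; }

  // Prints one line per process; the caller supplies its own rank.
  void report(const char* label, int rank) const {
    std::printf("[rank %d] %s: %.6f s over %llu calls\n", rank, label,
                seconds(), static_cast<unsigned long long>(calls_));
  }
};

// Placeholder workload standing in for the code being measured.
double busy_work(int n) {
  volatile double acc = 0.0;  // volatile so the loop is not optimized away
  for (int i = 0; i < n; ++i) acc = acc + i * 0.5;
  return acc;
}

// Times `iters` invocations of the placeholder workload.
RegionTimer time_region(int iters) {
  RegionTimer t;
  for (int i = 0; i < iters; ++i) {
    t.start();
    busy_work(100000);
    t.stop();
  }
  return t;
}
```

In a real UPC++ program you would probably call `upcxx::barrier()` before the timed region so all ranks start together, and pass `upcxx::rank_me()` to `report()`, e.g. `time_region(10).report("busy_work", upcxx::rank_me());`. I left those calls out so the sketch compiles without UPC++.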
Hope this helps.
-D