Hi Alexey,
On Sun, Nov 17, 2013 at 02:28:47PM +0400, Alexey Samsonov wrote:
> > It would be also great to have an annotation mechanism for specifying
> > racy memory regions, e.g.:
> >
> > __tsan_add_racy_region(void *addr, uptr size);
> > __tsan_remove_racy_region(void *addr);
> >
> > Some scientific applications are racy on purpose,
>
> (comment from a peanut gallery)
> That's what frustrates us and what we're trying to fight with.
>
> http://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> The correct way to express such pattern is to use atomic operations.
Exactly. One can annotate local memory accesses with atomics. But what
about remote memory accesses?
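
For the local case, a minimal sketch of what "annotating with atomics" could look like in C11 follows. The variable and function names (progress, publish_progress, read_progress) are made up for illustration; only the <stdatomic.h> calls are standard. Relaxed ordering is assumed to be sufficient here, i.e. the value is not used to publish other data.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared counter that threads access without any locking. */
static _Atomic uint64_t progress;

/* Writer side: a relaxed store makes the access race-free by definition,
 * but imposes no ordering on surrounding memory operations. */
void publish_progress(uint64_t value) {
    atomic_store_explicit(&progress, value, memory_order_relaxed);
}

/* Reader side: may observe a stale value, but never a torn one. */
uint64_t read_progress(void) {
    return atomic_load_explicit(&progress, memory_order_relaxed);
}
```

This works because each access covers a single machine word in local memory, which is the part that does not carry over to bulk remote writes.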
What should the atomic version of, e.g., gaspi_write() [1] do? Should it
write the whole block atomically? Should it know the structure of the
data being sent and write each field atomically? Is any of that
efficiently implementable on the existing interconnect hardware? A quick
scan of the InfiniBand specification says: "No". And a study of existing
PGAS APIs indirectly confirms this: none of them has atomic bulk
operations.
One can say then: "You should synchronize". But synchronization means
more overhead, longer execution, higher power consumption, and,
therefore, higher costs. (I guess you at Google know this best of all.)
I see two ways out: 1) adding a racy_gaspi_write() that works exactly
like the usual one, and ignoring the races it causes; 2) marking memory
regions as racy and ignoring races within them (which provides somewhat
finer control).
And in the end one has to hope that the observed, formally undefined,
behavior will not change in the future.
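
To make option 2 more concrete, here is a rough sketch of the bookkeeping a detector would need behind the proposed annotations. This is not how ThreadSanitizer is implemented, and none of these functions exist in it: add_racy_region(), remove_racy_region() and is_in_racy_region() are hypothetical names, the fixed-size table is an arbitrary simplification, and the code is not thread-safe.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RACY_REGIONS 64 /* arbitrary limit for this sketch */

typedef struct {
    uintptr_t begin; /* first byte of the region */
    size_t size;     /* length in bytes */
} racy_region;

static racy_region regions[MAX_RACY_REGIONS];
static size_t region_count;

/* Hypothetical counterpart of the proposed __tsan_add_racy_region(). */
bool add_racy_region(void *addr, size_t size) {
    if (size == 0 || region_count == MAX_RACY_REGIONS)
        return false;
    regions[region_count].begin = (uintptr_t)addr;
    regions[region_count].size = size;
    region_count++;
    return true;
}

/* Hypothetical counterpart of the proposed __tsan_remove_racy_region().
 * Removes the region that starts at addr, if one was registered. */
bool remove_racy_region(void *addr) {
    for (size_t i = 0; i < region_count; i++) {
        if (regions[i].begin == (uintptr_t)addr) {
            /* Fill the hole with the last entry; order does not matter. */
            regions[i] = regions[--region_count];
            return true;
        }
    }
    return false;
}

/* The check a detector would make before reporting a race on addr. */
bool is_in_racy_region(const void *addr) {
    uintptr_t a = (uintptr_t)addr;
    for (size_t i = 0; i < region_count; i++) {
        /* Unsigned subtraction wraps for a < begin, so one comparison
         * covers both bounds. */
        if (a - regions[i].begin < regions[i].size)
            return true;
    }
    return false;
}
```

A communication library would presumably register a segment once when it is created and remove it when it is freed, so the linear scan only matters on the reporting path.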
However, any other ideas are very welcome.
[1] http://www.gaspi.de/fileadmin/GASPI/pdf/GASPI-1.0.1.pdf
--
Yegor Derevenets