typedef int32_t rc_t;
/* returns zero only if count drops to zero */
int rc_dec(rc_t *_this) {
    rc_t l;
    int rel_mb = 0;
    do {
        l = (*_this);
        /* release barrier if count > 1 */
        if (l > 1 && ! rel_mb) {
            membar #LoadStore|#StoreStore;
            /* exec the barrier only one time! */
            rel_mb = 1;
        }
    } while (! CAS(_this, l, l - 1));
    /* acquire barrier if count dropped to zero */
    if (l == 1) {
        membar #StoreLoad|#StoreStore;
    }
    return (l == 1) ? 0 : 1;
}
IIRC, this is the correct way to do things wrt ref-count decrements?
Since CAS counts as both a load and a store for membar purposes, you can
use #LoadLoad in the acquire barrier instead of the more expensive
#StoreLoad. Most hardware doesn't let a store become visible ahead of an
earlier conditional branch, so you can probably omit the #StoreStore
part if there's a performance advantage.
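
For comparison, here is a rough sketch of the same decrement written with C11 <stdatomic.h> fences in place of raw membar instructions. This is only an illustration under assumptions, not the poster's code: the parameter name self, the local names old and released, and the debug assert are mine, and it assumes a release fence stands in for membar #LoadStore|#StoreStore and an acquire fence for membar #LoadLoad|#LoadStore. The compiler then picks whatever barrier the target actually needs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef _Atomic int32_t rc_t;

/* Returns zero only if this call dropped the count to zero,
   same convention as the rc_dec posted above. */
int rc_dec(rc_t *self) {
    int32_t old = atomic_load_explicit(self, memory_order_relaxed);
    int released = 0; /* hypothetical name; plays the role of rel_mb */
    do {
        assert(old > 0); /* debug check: decrementing a dead count is a bug */
        /* Release fence if other references remain; run it only once. */
        if (old > 1 && !released) {
            atomic_thread_fence(memory_order_release);
            released = 1;
        }
        /* On failure the CAS reloads the current count into old. */
    } while (!atomic_compare_exchange_weak_explicit(
                 self, &old, old - 1,
                 memory_order_relaxed, memory_order_relaxed));
    /* Acquire fence if this call dropped the count to zero, so that
       teardown is ordered after the other threads' decrements. */
    if (old == 1) {
        atomic_thread_fence(memory_order_acquire);
    }
    return (old == 1) ? 0 : 1;
}
```

A caller would typically destroy the object when rc_dec returns 0. If the conditional fences are not worth the trouble, a single atomic_fetch_sub_explicit with memory_order_release followed by the same acquire fence on zero is the more common way to write this.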