Iirc, back when hyperthreading first came out, there was (and may still
be) an issue with two concurrent threads heavily working on their
respective, adjacent 64-byte structures. They can interfere with one
another wrt LOCK'ed RMW instructions, i.e. false sharing. Therefore, it
is recommended to align _and_ pad things on a 128-byte boundary.
Something like the following structures, just some pseudo-code:
_____________________
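// raw storage for the full 128 bytes
struct cache_buf
{
    char buf[128];
};
// one 64-byte half
struct cache_half
{
    char buf[64];
};
// two adjacent 64-byte halves
struct cache_line
{
    cache_half low;
    cache_half high;
};
// either view, always 128 bytes in size
union cache_line_buf
{
    cache_line line;
    cache_buf buf;
};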
_____________________
A cache_line_buf needs to be aligned in memory on a sizeof(cache_buf)
boundary, i.e. 128 bytes. It's already padded, since the union is
exactly 128 bytes in size.
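
Fwiw, here is a minimal sketch of how the alignment part could be spelled in C++11 and later, so the compiler handles it for static and automatic storage. The k_pad constant and the is_aligned and demo_alignment helpers are names I made up for illustration, and 128 is just the assumption from above (two 64-byte halves), not something queried from the hardware:

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: pad/align to 128 bytes, i.e. two adjacent 64-byte halves.
constexpr std::size_t k_pad = 128;

struct cache_half { char buf[k_pad / 2]; };
struct cache_line { cache_half low; cache_half high; };
struct cache_buf  { char buf[k_pad]; };

// alignas makes the compiler place every object of this type on a
// k_pad boundary; the size is already a multiple of the alignment.
union alignas(k_pad) cache_line_buf
{
    cache_line line;
    cache_buf buf;
};

static_assert(sizeof(cache_line_buf) == k_pad, "must be exactly padded");
static_assert(alignof(cache_line_buf) == k_pad, "must be aligned to the pad size");

// Hypothetical helper: true if p sits on an a-byte boundary.
inline bool is_aligned(void const* p, std::size_t a)
{
    return (reinterpret_cast<std::uintptr_t>(p) % a) == 0;
}

// Hypothetical check: adjacent array elements never share a 128-byte block.
inline bool demo_alignment()
{
    static cache_line_buf per_thread[4];
    for (std::size_t i = 0; i < 4; ++i)
    {
        if (! is_aligned(&per_thread[i], k_pad)) return false;
    }
    return true;
}
```

With this, each thread can own one element of such an array without touching a neighbor's 128-byte block. Heap allocations are a separate matter: plain new only honors the over-alignment from C++17 on. C++17 also has std::hardware_destructive_interference_size in <new>, which might be a better source for the constant where the compiler provides it.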