np, my pleasure.
>
>>> MSVC likes it and everything is padded and aligned. Iirc, std::vector
>>> should honor alignas, right?
>>>
>> I don't think it is the vector specifically that honors alignas; I
>> rather think the data structure of type ct_page, whether allocated by
>> vector or in any other legal way, can get aligned according to its
>> alignment specifier (on my system, it's implementation specific as
>> alignof(std::max_align_t) is 16 so the alignments of 64 are "extended"
>> -- but happen to be supported).
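To make the alignas point concrete, here is a minimal sketch. It assumes
C++17 or later, where std::allocator obtains over-aligned storage through
the aligned operator new. The type `page` is a hypothetical stand-in for
ct_page (whose real definition is not shown in this thread), and
`all_aligned` is a helper name made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for ct_page: one 64-byte block with an
// "extended" alignment (larger than alignof(std::max_align_t) here).
struct alignas(64) page {
    unsigned char bytes[64];
};

static_assert(alignof(page) == 64, "alignas is part of the type");
static_assert(sizeof(page) % alignof(page) == 0, "size is padded to the alignment");

// Returns true if every element of v sits on an alignof(page) boundary.
inline bool all_aligned(const std::vector<page>& v) {
    for (const page& p : v) {
        if (reinterpret_cast<std::uintptr_t>(&p) % alignof(page) != 0) {
            return false;
        }
    }
    return true;
}
```

The alignment comes from the element type, not from the vector: a call
such as all_aligned(std::vector<page>(100)) should hold on any
implementation that supports this extended alignment, and the same check
would apply to a `new page` or a static page object.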
>
> Remember those early hyperthreaded Pentiums? There was something called
> the aliasing problem that would make two hyperthreaded threads falsely
> share cache lines and would destroy performance? Iirc, the l2 cache
> lines were 128 bytes split into two 64 byte regions. The workaround was
> to offset the stacks of each thread using alloca.
I remember there was a problem with those simulated threads not having
cache of their own (don't recall whether it was L2 or L1) but don't
remember what exactly. The advice I remember was to not use
hyperthreading mode :-).
>
>
>> On a side note, see if your code benefits from using
>> std::hardware_constructive_interference_size instead of hardcoded 64
>> for the cacheline_size.
>
> It will definitely benefit. I am wondering if
> std::hardware_constructive_interference_size is always guaranteed to be
> the l2 cache line size?
Not in general, the Holy Standard definition is expectedly L-agnostic :-).
GCC in particular implements both as the L1 cache line size (at least
for the architectures I care about):
Quote:
‘destructive-interference-size’
‘constructive-interference-size’
The values for the C++17 variables
‘std::hardware_destructive_interference_size’ and
‘std::hardware_constructive_interference_size’. The
destructive interference size is the minimum recommended
offset between two independent concurrently-accessed objects;
the constructive interference size is the maximum recommended
size of contiguous memory accessed together. Typically both
will be the size of an L1 cache line for the target, in bytes.
For a generic target covering a range of L1 cache line sizes,
typically the constructive interference size will be the small
end of the range and the destructive size will be the large
end.
The destructive interference size is intended to be used for
layout, and thus has ABI impact. The default value is not
expected to be stable, and on some targets varies with
‘-mtune’, so use of this variable in a context where ABI
stability is important, such as the public interface of a
library, is strongly discouraged; if it is used in that
context, users can stabilize the value using this option.
The constructive interference size is less sensitive, as it is
typically only used in a ‘static_assert’ to make sure that a
type fits within a cache line.
End-of-quote
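A rough sketch of how the two constants are typically used, following the
split described in the quote: the destructive size for layout, the
constructive size in a static_assert. The names `destructive_size`,
`constructive_size`, `padded_counter` and `hot_pair` are made up for the
illustration, and the fallback of 64 is only an assumption for standard
libraries that do not provide the constants.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>  // std::hardware_*_interference_size (C++17)

#ifdef __cpp_lib_hardware_interference_size
inline constexpr std::size_t destructive_size =
    std::hardware_destructive_interference_size;
inline constexpr std::size_t constructive_size =
    std::hardware_constructive_interference_size;
#else
// Assumption: fall back to 64 when the library lacks the constants.
inline constexpr std::size_t destructive_size = 64;
inline constexpr std::size_t constructive_size = 64;
#endif

// Layout use: each counter gets its own destructive-interference
// region, so neighbours in an array should not falsely share a line.
struct alignas(destructive_size) padded_counter {
    std::atomic<unsigned long> value{0};
};
static_assert(sizeof(padded_counter) % destructive_size == 0,
              "array elements stay one full region apart");

// Check-only use: data that is read together should fit in one line.
struct hot_pair {
    std::uint32_t key;
    std::uint32_t value;
};
static_assert(sizeof(hot_pair) <= constructive_size,
              "hot_pair should fit within one cache line");
```

Note that padded_counter is exactly the kind of use the quote warns
about: its size and alignment change with the constant, so I would keep
such a type out of a library's public interface or pin the value with the
option quoted above.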
HTH
-Pavel