On 20.10.18 at 00:35, Alf P. Steinbach wrote:
> On 19.10.2018 23:19, Öö Tiib wrote:
>> On Friday, 19 October 2018 23:42:43 UTC+3, Christian Gollwitzer wrote:
>>> Hi,
>>>
>>> I've got a strange error message while developing a program for OpenCL
>>> when compiling on gcc 4.8.5
>>>
>>> clfdk.cpp:322:23: error: cannot bind packed field ‘geom.fdk_geom::Pmat’
>>> to ‘cl_float4 (&)[4]’
>>> mat4copy(P, geom.Pmat);
>>> gcc 4.8 seems to have a problem with the reference to the element of the
>>> packed struct, which seems strange to me, since the other compilers all
>>> accept it (and it works). I'm especially astonished because the template
>>> mat4copy has exactly the purpose of copying data between different
>>> containers of 4 cl_float4s, like std::vector, std::array, plain array...
>>> but the compiler refuses to instantiate the template.
>>
>> Sounds like that bug/problem:
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=36566
>
> As I read the comments on that it's a problem, not a bug.
>
> Because in general a member of a packed struct need not be properly
> aligned for the member type, and then using a reference to that member
> might crash on some architectures, or, where the OS intervenes to handle
> the trap, cause Real Slow Execution on others.
>
> Since the member of interest is at the start of the structure, and since
> the standards require no padding before that member, a practical solution
> might be to reinterpret_cast and `std::launder` a pointer to the
> complete packed struct, assuming that it in turn isn't part of a packed
> struct.
>
> An alternative practical solution that should work in general is to
> just overload the function that unpacks, with a variant that accepts the
> whole struct as argument.
>
> I would not go as far as to copy bytes.
My "practical solution" for now is a C-style cast that casts away the
"packed" attribute. What bugs me most is that this basically reduces the
applicability of a template. The whole reason to have the template was
to copy, e.g., the contents of a std::vector<cl_double4> of length 4
into a std::array<cl_float4, 4>. There are at least three different but
functionally equivalent data types in this program.
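
The real mat4copy is not shown in this thread, so the following is only a minimal sketch of what such a container-agnostic copy might look like. The types float4 and double4 are hypothetical stand-ins for cl_float4 and cl_double4 (assuming the components are reachable through an array member named s), and the argument order (destination first) is likewise an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for cl_float4 / cl_double4. Assumption: the four
// components are reachable through an array member named s.
struct float4  { float  s[4]; };
struct double4 { double s[4]; };

// Copy 4 rows of 4 components between any two containers indexable with
// operator[], converting the component type where the two sides differ.
template <typename Dst, typename Src>
void mat4copy(Dst& dst, const Src& src) {
    for (std::size_t row = 0; row < 4; ++row) {
        for (std::size_t col = 0; col < 4; ++col) {
            using Component =
                typename std::remove_reference<decltype(dst[row].s[col])>::type;
            dst[row].s[col] = static_cast<Component>(src[row].s[col]);
        }
    }
}

// Usage sketch: std::vector<double4> -> std::array<float4, 4> -> plain array.
inline std::array<float4, 4> demo_vector_to_array() {
    std::vector<double4> src(4);
    for (std::size_t row = 0; row < 4; ++row)
        for (std::size_t col = 0; col < 4; ++col)
            src[row].s[col] = static_cast<double>(row * 4 + col);

    std::array<float4, 4> dst{};
    mat4copy(dst, src);

    float4 plain[4];            // Dst is deduced as float4[4] here
    mat4copy(plain, dst);
    assert(plain[2].s[3] == dst[2].s[3]);
    return dst;
}
```

With a plain array as an argument, the parameter type becomes a reference to an array of four elements, which matches the shape of the `cl_float4 (&)[4]` in the error message above.
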
If I wrote it out by hand, it would work. If I used a macro, it would
work. If I could overload operator= from the "outside" for a
std::array, it would work, but it must be a member (why?). If I drop the
packed attribute, it compiles but breaks with invalid memory accesses
because the two compilers align the members differently. Only because
of a pedantic compiler am I forced to jump through these hoops.
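
For comparison, the overload suggested in the quoted text (a variant that accepts the whole struct) might be sketched as below. Only the names fdk_geom and Pmat come from the error message; the struct layout, the leading tag member, the float4 stand-in for cl_float4 and the copy direction are all assumptions made for illustration. The point is that each component is read by value through the packed struct, so no reference to the packed member is ever formed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for cl_float4 (assumption: components in member s).
struct float4 { float s[4]; };

// Hypothetical packed struct. The leading char is made up so that Pmat is
// not naturally aligned. __attribute__((packed)) is the GCC/Clang spelling.
struct __attribute__((packed)) fdk_geom {
    char   tag;
    float4 Pmat[4];
};
static_assert(offsetof(fdk_geom, Pmat) == 1, "packed: no padding before Pmat");

// Overload taking the whole struct. Binding a reference to fdk_geom itself
// is fine; the packed member is only accessed component by component.
template <typename Dst>
void mat4copy(Dst& dst, const fdk_geom& geom) {
    for (std::size_t row = 0; row < 4; ++row)
        for (std::size_t col = 0; col < 4; ++col)
            dst[row].s[col] = geom.Pmat[row].s[col];  // read by value
}

// Usage sketch: fill the packed struct, then copy it into a plain array.
inline float demo_packed_copy() {
    fdk_geom geom{};
    for (std::size_t row = 0; row < 4; ++row)
        for (std::size_t col = 0; col < 4; ++col)
            geom.Pmat[row].s[col] = static_cast<float>(row * 4 + col);

    float4 P[4];
    mat4copy(P, geom);
    assert(P[0].s[1] == 1.0f);
    return P[3].s[2];
}
```

The call site then reads mat4copy(P, geom) instead of mat4copy(P, geom.Pmat), at the cost of one extra overload per packed struct type.
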
Christian