memcpy(), padding, and initialization

1,045 views
Skip to first unread message

Caleb Raitto

unread,
Jul 14, 2020, 7:05:41 PM7/14/20
to c...@chromium.org, khusha...@chromium.org
I'm working on some code that serializes cc:PaintOp instances to update a digest [0] for a study [1] -- the idea is the same op should produce the same hash digest. 

One issue I ran into is padding -- many of the simpler cc::PaintOp subclasses serialize via memcpy() [2], but some subclasses include padding introduced by the compiler to make the sizeof() word-aligned. 

I noticed that this padding is initialized to something like 0xaaaaaa on my out/Default build, but to 0x0 on ASAN / TSAN builds (debugging notes in crrev.com/c/2288872). This results in test failures, as the calculated digest is different than expected on ASAN / TSAN.

There are 2 issues I want to better understand:

1. First, I wanted to make sure that memcpy() over uninitialized padding doesn't introduce undefined behavior. Some of the discussion in [3] makes me think that there are enough carve-outs in the standard such that this is OK (memcpy() uses char types to copy the padding, so the copied padding has indeterminate values, but we don't encounter undefined behavior). Is this understanding correct?

2. Second -- would we ever see variation in the padding's values on official Chrome builds (the value needs to be the same on Windows, Mac, Android, etc.), or just on ASAN / TSAN builds? 

We can declare a char[] for the padding in each PaintOp subclass, use a static_assert(sizeof(foo) == blah) to ensure no additional padding is added, and zero the padding (but only for digest calculation, since the code is hot), but it would be good to know if such changes are necessary or not.

-Caleb




Peter Kasting

unread,
Jul 14, 2020, 9:34:59 PM7/14/20
to Caleb Raitto, cxx, Khushal Sagar
Terminology-wise, the C++ standard calls these "padding bits", and they are the portions of the "object representation" of the object that are not part of its "value representation".  Re: your (1), I believe that memcpy reads from the "object representation" and thus using memcpy in this way is not UB.

Re: your (2), we should only assume what the standard guarantees us. From http://eel.is/c++draft//dcl.init#6.2 , zero-initializing an object zero-initializes its padding bits.  AFAICT, there's nothing else that is guaranteed to set these bits.  So we're only guaranteed anything if these values are zero-initialized.  But...

From http://eel.is/c++draft//dcl.init#8 , zero-initialization doesn't happen during value-initialization of a class with a user-provided or deleted default constructor.  PaintOp's default constructor is (implicitly) deleted, so this applies.  So, I don't think the standard ever guarantees you the value of these padding bits.

Therefore, I think if you want to do something like you propose, you do indeed need to convert the padding bits to be part of the value representation.  Some ways to do this:
  • You could use structure-packing to effectively remove the padding, but this could have noticeable performance effects depending on your code.
  • You could insert char[] as you suggest.  If you do, it becomes part of the value representation of the object.  If you make PaintOp's constructor zero-init this region, you'll get well-defined behavior not only where you're looking for it but everywhere else too.  I would think zero-initing this during construction ought to be fast since those bytes are in the dcache at that point anyway.
  • You could also just write zeroes during digest calculation, but I would suggest explicitly benchmarking scenarios you care about against doing it during construction to make sure that it's actually a perf win, since it's more complicated/surprising.
PK

Primiano Tucci

unread,
Jul 15, 2020, 5:00:18 AM7/15/20
to Peter Kasting, Caleb Raitto, cxx, Khushal Sagar
If the object you want to serialize has copy constructors (implicit or not) you could just let the compiler deal with it, something like

template <typename T>
size_t SimpleSerialize(const PaintOp* op, void* memory, size_t size) {
  if (sizeof(T) > size)
    return 0;
  memset(memory, 0, size);  // Clear the padding bits that won't be copied
  new (memory) T(*op);
  return sizeof(T);
}

This will work as long as the struct doesn't have unique pointers or other uncopyable fields.
If It does.... does it make sense to digest-compare for idempotence objects that contain pointers? I guess depends on the lifetime of those objects vs time span of the idempotence check

--
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/cxx/CAAHOzFDrUrpuB-6ZRfyVu6j1%2B3k0Qnd%2B-y3D1YgfdS%2Bz-cavVQ%40mail.gmail.com.

Jeremy Roman

unread,
Jul 15, 2020, 10:27:04 AM7/15/20
to Primiano Tucci, Peter Kasting, Caleb Raitto, cxx, Khushal Sagar
At the risk of stating the obvious...you could just make your digest calculation logic be the salient fields for each op fed into a proper hash-mixing/digesting function, rather than operating on the in-memory representation. Trying to figure out where the compiler will put padding and account for it seems very brittle.

Primiano Tucci

unread,
Jul 15, 2020, 11:37:54 AM7/15/20
to Jeremy Roman, Peter Kasting, Caleb Raitto, cxx, Khushal Sagar
I think the real problem in this specific case is that they want to compute the hash of a very high number of subclasses and writing the code for each of them is too cumbersome.
By looking more, I am not sure how that would work, regardless of the padding problem. The base class (PaintOp) has a whole bunch of pointers. 

K. Moon

unread,
Jul 15, 2020, 12:11:24 PM7/15/20
to Primiano Tucci, Jeremy Roman, Peter Kasting, Caleb Raitto, cxx, Khushal Sagar
Yes, that sounds problematic regardless of approach. The robust solution would be some sort of IDL, I suppose...

Caleb Raitto

unread,
Jul 15, 2020, 2:31:15 PM7/15/20
to K. Moon, Primiano Tucci, Jeremy Roman, Peter Kasting, cxx, Khushal Sagar
prim...@chromium.org -- where are the pointers in the base cc::PaintOp class? I only see 2 data members (in a bitfield of total size 32 bytes):  |type| and |skip|. My compiler also tells me that sizeof(cc::PaintOp) is 4.

While some subclasses may hold pointers, I think these don't qualify for the "simple serialization" memcpy()-based code path -- they have their own serializers. Serializing pointers would break the original use case of passing PaintOps to the GPU process. (I don't own / have in-depth knowledge of this code, so khusha...@chromium.org can verify).

-Caleb

Primiano Tucci

unread,
Jul 15, 2020, 3:04:19 PM7/15/20
to Caleb Raitto, K. Moon, Jeremy Roman, Peter Kasting, cxx, Khushal Sagar
Oooh nvm I was looking at those  sk_sp<SkColorSpace> color_space = nullptr; but didn't realize that they are members of a nested class (PaintOp::SerializeOptions), not PaintOp, I missed that.
Still, don't you have that in the derived classes you want to deal with?
e.g.  PaintOpWithFlags : public PaintOp has a PaintFlags field which has several sk_sp<SkPathEffect> path_effect_;

Caleb Raitto

unread,
Jul 15, 2020, 3:25:36 PM7/15/20
to Primiano Tucci, K. Moon, Jeremy Roman, Peter Kasting, cxx, Khushal Sagar
Right, but I don't think the cc::PaintOpWithFlags uses SimpleSerialize<T>(), which is memcpy() serialization. 

I can't see how the original purpose of this code, sending memory to the GPU process, could work if we were serializing pointers directly.

-Caleb

Khushal Sagar

unread,
Jul 15, 2020, 6:23:56 PM7/15/20
to Caleb Raitto, Primiano Tucci, K. Moon, Jeremy Roman, Peter Kasting, cxx, Khushal Sagar
On Wed, Jul 15, 2020 at 12:25 PM Caleb Raitto <cara...@chromium.org> wrote:
Right, but I don't think the cc::PaintOpWithFlags uses SimpleSerialize<T>(), which is memcpy() serialization. 

I can't see how the original purpose of this code, sending memory to the GPU process, could work if we were serializing pointers directly.

-Caleb

On Wed, Jul 15, 2020 at 3:04 PM Primiano Tucci <prim...@chromium.org> wrote:
Oooh nvm I was looking at those  sk_sp<SkColorSpace> color_space = nullptr; but didn't realize that they are members of a nested class (PaintOp::SerializeOptions), not PaintOp, I missed that.
Still, don't you have that in the derived classes you want to deal with?

CC already has serialization code to flatten these data structures for the purpose of sending them from renderer/browser to the GPU process (here) which are used by PaintOp sub-classes that hold references (like DrawRectOp). The memcpy method is only used by PaintOp sub-classes which are trivially copyable.

Most of the serialization logic works out of the box for the identifiability use-case except for a few quirks like this. Your suggestion above to deal with the padding in SimpleSerialize sounds like a neat way to handle this.

Maksim Ivanov

unread,
Jul 16, 2020, 8:55:32 AM7/16/20
to Khushal Sagar, Caleb Raitto, Primiano Tucci, K. Moon, Jeremy Roman, Peter Kasting, cxx
Is the trick with the placement-new in SimpleSerialize() fully correct? Specifically, does anything prevent the compiler from implementing the implicit copy constructor, called from the placement new, as a single memcpy()-like operation that copies the bytes of all fields including the padding bytes between them?


--
You received this message because you are subscribed to the Google Groups "cxx" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cxx+uns...@chromium.org.

Caleb Raitto

unread,
Jul 17, 2020, 4:31:27 PM7/17/20
to Maksim Ivanov, Khushal Sagar, Primiano Tucci, K. Moon, Jeremy Roman, Peter Kasting, cxx
Yep, that appears to be the case -- somehow, the padding bits values vary on default vs ASAN/TSAN builds.

However, the "char padding[n] = {};" approach seems to work: crrev.com/c/2302798. Is there a better / more terse way, or is this as good as it gets? 

-Caleb

Caleb Raitto

unread,
Jul 17, 2020, 4:36:17 PM7/17/20
to Maksim Ivanov, Khushal Sagar, Primiano Tucci, K. Moon, Jeremy Roman, Peter Kasting, cxx
To clarify, this approach:

  memset(memory, 0, size);  // Clear the padding bits that won't be copied
  new (memory) T(*op);

didn't seem to work, as the padding bits varied on default vs ASAN/TSAN builds.

-Caleb

Daniel Cheng

unread,
Jul 17, 2020, 5:53:45 PM7/17/20
to Caleb Raitto, Maksim Ivanov, Khushal Sagar, Primiano Tucci, K. Moon, Jeremy Roman, Peter Kasting, cxx
It doesn't seem great to read from the padding bits? It's kind of an implementation detail of C++, and the result seems fragile. Looking at the linked CL, manually adding padding doesn't seem much more maintainable than writing hashers for each subclass: we still have to make sure the padding bytes are correctly distributed in each subclass.

In fact, we actually ban memcpy()ing structs wholesale in IPC serialization/deserialization for this reason.

Daniel

Maksim Ivanov

unread,
Jul 17, 2020, 8:25:20 PM7/17/20
to Daniel Cheng, Caleb Raitto, Khushal Sagar, Primiano Tucci, K. Moon, Jeremy Roman, Peter Kasting, cxx
FWIW, if we had C++17, then I think std::has_unique_object_representations could be used for automatically checking that the padding is "eaten" correctly. That would still not save from having to manually maintain those filler fields though.

Roland McGrath

unread,
Jul 17, 2020, 8:36:50 PM7/17/20
to Maksim Ivanov, Daniel Cheng, Caleb Raitto, Khushal Sagar, Primiano Tucci, K. Moon, Jeremy Roman, Peter Kasting, cxx
Except has_unique_object_representation isn't true for floating-point types and thus isn't true for structs containing FP fields and there's no easy way to weed that out.

Maksim Ivanov

unread,
Jul 21, 2020, 3:23:55 PM7/21/20
to Roland McGrath, Daniel Cheng, Caleb Raitto, Khushal Sagar, Primiano Tucci, K. Moon, Jeremy Roman, Peter Kasting, cxx
Roland, that's a fair point about floating-point types, thanks.

However, checking the bug attached to this thread, the original motivation seems to approximate the number of different values in some set, and I assume memcmp'ing floating-point numbers wouldn't suit that purpose anyway? At least there's "+0" and "-0" with different bit representation; there might also be surprises related to precision and accuracy. So it might even be WAI that structs with floating-point fields would fail the "has_unique_object_representations" check and require manual treatment?..

Caleb Raitto

unread,
Jul 21, 2020, 4:16:33 PM7/21/20
to Maksim Ivanov, Roland McGrath, Daniel Cheng, Khushal Sagar, Primiano Tucci, K. Moon, Jeremy Roman, Peter Kasting, cxx
Yep, if Javascript code can see different behavior on different platforms (method returns +0.0f vs -0.0f etc.), then the page can use that to learn about the user's device.

-Caleb
Reply all
Reply to author
Forward
0 new messages