- For some use cases, it would be very valuable to be able to get a direct pointer to an underlying array of primitives, e.g. to pass off to a GPU without copying. Currently this is not possible due to the aforementioned possibility of arbitrary spacing. See: https://github.com/kentonv/capnproto/pull/87
- Existing code will actually, as an optimization, encode structs as primitive lists, when those structs are small enough to fit. E.g. a struct containing two Int16s will be encoded as a list of 32-bit elements. With the proposed change, we'd stop writing structs that way and instead always encode them as composite lists. But, we probably need to continue accepting structs encoded as primitives because there is probably existing data out there that we need to go on supporting. Perhaps we can at least assume that no one relies on single-bit structs, though?
- Of course, dropping said optimization would be sad in itself, although in practice I don't think it would be a huge impact. It does mean that you'd no longer want to encode pixel data as a list of structs containing four 8-bit r/g/b/a channels, but I doubt anyone is doing that anyway specifically because if you're encoding pixel data you probably want direct access to the bytes in order to pass them to a GPU as mentioned above. In the new world, List<UInt8> would be the appropriate encoding for such data, and you'd just have to manually group the list into 4-element groups for each pixel.
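To illustrate the "manually group the list into 4-element groups" idea from the last point, here is a minimal sketch in Python. It assumes the pixel data is already available as one flat byte buffer (standing in for the body of a UInt8 list); the helper name `iter_pixels` is hypothetical and not part of any Cap'n Proto API.

```python
def iter_pixels(buf):
    """Yield (r, g, b, a) tuples from a flat byte buffer, 4 bytes per pixel."""
    view = memoryview(buf)  # a view onto the buffer, no copy of the pixel data
    if len(view) % 4 != 0:
        raise ValueError("pixel buffer length must be a multiple of 4")
    for i in range(0, len(view), 4):
        r, g, b, a = view[i:i + 4]
        yield (r, g, b, a)


# Two pixels: opaque red, then half-transparent green.
flat = bytes([255, 0, 0, 255, 0, 255, 0, 128])
pixels = list(iter_pixels(flat))
```

The grouping lives entirely in application code, so the same flat buffer could still be handed to a GPU API unchanged.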
Hi Kenton,
Of all the points you mentioned, this one is the killer argument to me. Getting zero copy from the marshaling buffer is a big win in some scenarios. To me, this carries a lot of weight.
> - For some use cases, it would be very valuable to be able to get a direct pointer to an underlying array of primitives, e.g. to pass off to a GPU without copying. Currently this is not possible due to the aforementioned possibility of arbitrary spacing. See: https://github.com/kentonv/capnproto/pull/87
I like it the way it is now, and would be a bit sad too, if the packed version
went away.
About being able to pass off a direct pointer to the GPU, that ought to be
possible as long as the list has not been upgraded. And whether this is the
case or not is detectable at runtime, right?
So, an application could check if the list is suitable for passing off with a
direct pointer as-is, or if a copy is required (and could act accordingly).
Does this fall under the category "application shoots schema author in the
foot"?
I think it would be a fair trade-off that those who need to use lists in such
ways, keep their schema compatible with such uses. No need to use `Data` and
type-casts.
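The runtime check suggested above could look roughly like the sketch below. It works on a raw 8-byte list pointer and uses the element size codes from the Cap'n Proto encoding spec (2 to 5 for 1, 2, 4 and 8-byte elements, 7 for an inline composite list). The function names are hypothetical; whether the library exposes an equivalent check is an assumption, not something stated in this thread.

```python
import struct

# Element size codes stored in a list pointer, mapped to bytes per element.
ELEMENT_BYTES = {2: 1, 3: 2, 4: 4, 5: 8}
INLINE_COMPOSITE = 7  # list of structs with a tag word; elements may be wider


def make_list_pointer(size_code, count, offset=0):
    """Build a list pointer word (demo helper, non-negative offsets only)."""
    return struct.pack("<II", (offset << 2) | 1, (count << 3) | size_code)


def decode_list_pointer(word):
    """Return (element size code, count) from an 8-byte list pointer."""
    lo, hi = struct.unpack("<II", word)
    if lo & 3 != 1:
        raise ValueError("not a list pointer")
    return hi & 7, hi >> 3


def can_use_direct_pointer(word, expected_bytes):
    """True if elements are tightly packed primitives of the expected width."""
    size_code, _ = decode_list_pointer(word)
    return ELEMENT_BYTES.get(size_code) == expected_bytes


plain = make_list_pointer(4, 10)                    # List(Int32), never upgraded
upgraded = make_list_pointer(INLINE_COMPOSITE, 10)  # written as a struct list
```

An application would pass the buffer through directly when `can_use_direct_pointer` returns true, and fall back to copying element by element otherwise.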
As for the desire to get primitive lists by pointer, the reasoning about the API is rather straightforward. A list either requires this access or it does not. This requirement is driven by the application semantics from the very beginning. It cannot arise without the application writer realizing it, nor can it simply be forgotten in version 2 when a primitive is replaced with a struct.
As long as there is a function to get a pointer to a list of a particular type, everyone will be happy.
On Thu, May 15, 2014 at 8:50 PM, Igor Lubashev <igor...@gmail.com> wrote:
> I am currently working with a system, where upgrades of primitive lists in messages via parallel arrays happened multiple times. Sad. The ability to promote primitives to structures is an asset.
>
Is this the kind of list where someone would have been willing to add
an upgradable annotation?
Cap'n Proto currently allows e.g. List(Int32) to be upgraded to List(MyStruct) where MyStruct's @0 field is of type Int32. I'm starting to wonder whether this ability should be abandoned.
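A minimal sketch of why this upgrade works, and why it forces "arbitrary spacing" on readers: an old reader that only knows about the Int32 can still find it at the start of each element, provided it steps through the list with the stride the writer used. The sketch ignores the tag word and pointer sections of a real composite list; the helper name and the extra struct field are made up for illustration.

```python
import struct


def read_int32_field0(body, count, stride):
    """Read the Int32 at the start of each element, stepping by `stride` bytes."""
    return [struct.unpack_from("<i", body, i * stride)[0] for i in range(count)]


# Written by old code as List(Int32): elements are packed, stride is 4 bytes.
packed = struct.pack("<3i", 10, 20, 30)

# Written by new code as List(MyStruct): each element is one 8-byte data word,
# with the original Int32 first and a hypothetical second Int32 field after it.
upgraded = b"".join(
    struct.pack("<ii", value, extra) for value, extra in [(10, 1), (20, 2), (30, 3)]
)

old_view = read_int32_field0(packed, 3, 4)
new_view = read_int32_field0(upgraded, 3, 8)
```

Both calls recover the same three values, but only the packed buffer could be handed off as a contiguous Int32 array.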
--
You received this message because you are subscribed to a topic in the Google Groups "Cap'n Proto" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/capnproto/lRlWBOglQv4/unsubscribe.
To unsubscribe from this group and all its topics, send an email to capnproto+...@googlegroups.com.
Visit this group at http://groups.google.com/group/capnproto.
Without getting into implementation details, if list(primitive) cannot be upgraded into list(struct{primitive, primitive}) later, then we would never design messages with list(primitive) and would always have list(struct{primitive}) instead.
Paul, I think it's safe for you to proceed assuming that you don't need to support list-of-struct-of-bool encoded as a bit list. But note that regardless of what we decide, it will probably remain necessary to support reading lists of structs encoded using 8, 16, or 32 bits. I think we can safely discard the bool case because probably no one relies on this currently, but it's very likely that lists of 16-bit and 32-bit structs exist in the wild, and possibly even 8-bit, and those will need to continue to be supported on the read end.
Perhaps what this argues for is that we should have two separate types for these two use cases. List(primitive) should actually be encoded as List(struct{primitive}) (with each element taking at least a full word), and we should have a separate "Data(primitive)" which is used for large lists of numeric data that shall never be upgraded to structs. (We already have "Data" representing specifically the UInt8 -- aka byte -- case of this.)
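To put rough numbers on the cost of giving every element at least a full word, here is a small size calculation. It assumes 8-byte words, a one-word tag in front of a struct-encoded list, and primitive list bodies padded to a word boundary; the function names are hypothetical.

```python
WORD = 8  # bytes per Cap'n Proto word


def primitive_list_bytes(count, elem_bytes):
    """Body size of a packed primitive list, rounded up to whole words."""
    words = -(-count * elem_bytes // WORD)  # ceiling division
    return words * WORD


def composite_list_bytes(count, words_per_elem=1):
    """Body size of a struct-encoded list: one tag word plus the elements."""
    return WORD + count * words_per_elem * WORD


packed_size = primitive_list_bytes(1000, 2)  # 1000 UInt16 values, packed
struct_size = composite_list_bytes(1000)     # same values, one word each
```

For 1000 UInt16 values this gives 2000 bytes packed versus 8008 bytes as single-word structs, which is the gap a separate type for bulk numeric data would avoid.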
> Perhaps what this argues for is that we should have two separate types for these two use cases. List(primitive) should actually be encoded as List(struct{primitive}) (with each element taking at least a full word), and we should have a separate "Data(primitive)" which is used for large lists of numeric data that shall never be upgraded to structs. (We already have "Data" representing specifically the UInt8 -- aka byte -- case of this.)
A question for changing 'List(primitive)' like this (non-upgradable and with direct access to the underlying memory) is what to do with big-endian clients?
A big-endian client's reads/writes to a 'List(UInt16)' are byte-swapped by the ListReader/ListWriter, so direct access is unsafe. I think the majority of clients are probably expecting this behaviour, and that a 'List(UInt16)' expresses 'a list of unsigned 16-bit integers'---would they have to migrate to 'List(struct{UInt16})' to preserve this (even if upgradability is not required)?
Adding 'Data(primitive)' provides an opportunity for a different semantic: a 'Data(UInt16)' can express 'a blob of 16-bit wide elements'. Byte order and encoding are the responsibility of the client (or something else). This could also be done through an annotation applied to 'List(primitive)', but that doesn't seem any less complex than extending 'Data'.
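The byte-order concern can be shown with a short sketch. Cap'n Proto stores integers little-endian on the wire, so reinterpreting the raw bytes in host order is only correct on a little-endian host. The function name is hypothetical, and the `array` module stands in for direct pointer access.

```python
import array
import struct
import sys

# Three UInt16 values (1, 2, 256) as they appear on the wire: little-endian.
wire = bytes([0x01, 0x00, 0x02, 0x00, 0x00, 0x01])


def read_uint16_list(wire_bytes):
    """Reinterpret the bytes as UInt16s, swapping only on big-endian hosts."""
    values = array.array("H")
    assert values.itemsize == 2        # 'H' is 2 bytes on common platforms
    values.frombytes(wire_bytes)       # raw view in host byte order
    if sys.byteorder == "big":
        values.byteswap()              # the swap a list reader does per element
    return values.tolist()


correct = read_uint16_list(wire)

# What a big-endian host would see if it used the raw bytes without swapping.
unswapped_on_big_endian = list(struct.unpack(">3H", wire))
```

Under "blob of 16-bit elements" semantics the swap would be the client's job; under "list of unsigned 16-bit integers" semantics the reader has to do it, which rules out handing out the raw pointer on big-endian hosts.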
Having slept on this, I'm not sure I'm (yet?) convinced. There are
already upgrades that are only one-way compatible. For example,
upgrading a non-union field to a union will allow new code to read old
data, but not vice versa (although the failure may be silent).
This *might* mean that it would be okay to have List(primitive) allow
raw pointers; instead, reading a List(primitive) that was written by a
newer version of the schema using INLINE_COMPOSITE encoding would just
fail.
What was the example for the schema schema? Would old code still
interpret the upgraded list correctly, or would failing to read it be
just as good an outcome?
FWIW, I vote to retain support for upgrading List(primitive) to List(struct) - this is a common case that happened to me in the past. Parallel arrays are awful.
pixels @0 :Blob(Pixel, 4);
# Pixels, each packed as a 4-byte value.
1) The Data(T) idea should henceforth be called Blob(T) instead. While we could easily extend the Cap'n Proto language to allow `Data` to be optionally parameterized, we can't very well do the same thing for the `Data` type that exists in C++ (without breaking people). So I'd rather use a new name so that the naming in C++ can be consistent with the naming in the schema definitions.
singlePixel @1 :Pixel@4;
# Denotes a pixel struct, which can be included _inline_ in the parent struct, since we know its size.
# Could be very useful for e.g. an inline Vector3 type.