I often deal with large arrays of binary data for scientific applications. For example:
myMultiChannelData:
numChannels: 1024
numSamples: 4096
sampleRate: 48000
samples: float[numChannels][numSamples]
I want to avoid copying the samples because they're large and that would be expensive. I'd like to get a pointer to the samples and process them directly where they are. It's not hard to imagine a deserializer where you could do that.
That also means I'd like the samples to be aligned. Not just float-aligned, but maybe even vector-aligned (AVX-512 is a thing now). Perhaps I want to use FFTW, MKL, BLAS/LAPACK, ATLAS, or any other high-performance library with this data. Here is an old article on the importance of alignment [http://www.ibm.com/developerworks/library/pa-dalign/]. It's not as bad as it used to be, but it's still very important for high performance.
I haven't seen anything in the msgpack format that could be used as padding for alignment. One possibility is a no-op format byte similar to the one in UBJSON [http://ubjson.org/type-reference/value-types/#noop]. The 0xc1 format code, which the msgpack spec marks as "never used", seems like a prime candidate.
Another possibility is a dummy alignment-padding field right before the samples field. That's really ugly, but could be made to work.
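Either way, the padding arithmetic is the same: the pad has to absorb the current write offset plus the length of the binary header, so that the first payload byte lands on an aligned offset. Here is a minimal sketch in C, assuming 0xc1 is treated as a no-op (which is the proposal here, not current msgpack behavior) and the samples go into a bin 32 (tag 0xc6 followed by a 4-byte big-endian length). The names `nop_padding` and `pack_aligned_bin32` are hypothetical and not part of mpack or any other msgpack library.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NOP_BYTE 0xc1u         /* ASSUMPTION: proposed no-op; "never used" in today's spec */
#define BIN32_TAG 0xc6u        /* msgpack bin 32 tag */
#define BIN32_HEADER_LEN 5u    /* tag byte + 4-byte big-endian length */

/* Number of pad bytes needed so that (offset + pad + header_len) is a
 * multiple of alignment. alignment must be nonzero. */
static size_t nop_padding(size_t offset, size_t header_len, size_t alignment)
{
    size_t rem = (offset + header_len) % alignment;
    return rem == 0 ? 0 : alignment - rem;
}

/* Writes the pad bytes and a bin 32 header starting at buf[offset] and
 * leaves room for len payload bytes. Returns the offset of the payload,
 * or (size_t)-1 if the buffer is too small. The payload itself is not
 * written; the caller fills it in place. */
static size_t pack_aligned_bin32(uint8_t *buf, size_t cap, size_t offset,
                                 uint32_t len, size_t alignment)
{
    size_t pad = nop_padding(offset, BIN32_HEADER_LEN, alignment);
    if (offset > cap || pad + BIN32_HEADER_LEN + (size_t)len > cap - offset)
        return (size_t)-1;

    memset(buf + offset, NOP_BYTE, pad);   /* the no-op padding */
    uint8_t *h = buf + offset + pad;
    h[0] = BIN32_TAG;
    h[1] = (uint8_t)(len >> 24);           /* length, big-endian */
    h[2] = (uint8_t)(len >> 16);
    h[3] = (uint8_t)(len >> 8);
    h[4] = (uint8_t)len;
    return offset + pad + BIN32_HEADER_LEN;
}
```

For example, with 7 bytes already written and a 64-byte alignment, this emits 52 pad bytes plus the 5-byte header, so the payload starts at offset 64. The offset is relative to the start of the buffer, so the payload pointer is only aligned in memory if the buffer itself is (page-aligned from mmap, for instance).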
The general idea could be, for example (sorry, I'm not real familiar with the mpack API):
// to serialize (pseudocode; inplace_packer is a made-up class)
void* myPageAlignedPtr = mmap(some_file_for_writing, len);
inplace_packer packer(myPageAlignedPtr, len);
// pack the other fields into the packer
// and then pack the samples with alignment
size_t arrayLen = numChannels*numSamples*sizeof(float);
size_t alignment = 16*sizeof(float); // 64 bytes, i.e. 512-bit alignment
void* samples = packer.pack_aligned_binary_array(arrayLen, alignment);
// this issues up to (alignment-1) no-ops such that, as addresses,
// (samples % alignment == myPageAlignedPtr % alignment)
// now we can use samples with any high-performance library, e.g.
fft(source, samples, numChannels*numSamples);
Then we could send this through any middleware that preserves alignment. On the deserialize side we could get a pointer to the samples in place, and it would be aligned in the same way.
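The deserialize side could look roughly like the sketch below, again in C and again assuming 0xc1 is skipped as a no-op and the samples are stored as a bin 32. `view_bin32` and `is_aligned` are hypothetical names, not functions from mpack. Nothing is copied: the returned pointer refers to bytes inside the received buffer.

```c
#include <stddef.h>
#include <stdint.h>

#define NOP_BYTE 0xc1u         /* ASSUMPTION: proposed no-op; "never used" in today's spec */
#define BIN32_TAG 0xc6u        /* msgpack bin 32 tag */
#define BIN32_HEADER_LEN 5u    /* tag byte + 4-byte big-endian length */

/* True if ptr is a multiple of alignment (alignment must be nonzero). */
static int is_aligned(const void *ptr, size_t alignment)
{
    return ((uintptr_t)ptr % alignment) == 0;
}

/* Skips any no-op bytes at buf[*offset], reads a bin 32 header, and
 * returns a pointer to the payload inside buf without copying it.
 * On success *len holds the payload length and *offset is advanced past
 * the payload. Returns NULL if the data is truncated or not a bin 32. */
static const void *view_bin32(const uint8_t *buf, size_t size,
                              size_t *offset, uint32_t *len)
{
    size_t pos = *offset;
    if (pos > size)
        return NULL;
    while (pos < size && buf[pos] == NOP_BYTE)
        pos++;                                   /* skip padding */
    if (size - pos < BIN32_HEADER_LEN || buf[pos] != BIN32_TAG)
        return NULL;

    uint32_t n = ((uint32_t)buf[pos + 1] << 24) |
                 ((uint32_t)buf[pos + 2] << 16) |
                 ((uint32_t)buf[pos + 3] << 8)  |
                  (uint32_t)buf[pos + 4];
    pos += BIN32_HEADER_LEN;
    if (n > size - pos)
        return NULL;                             /* truncated payload */

    *len = n;
    *offset = pos + n;
    return buf + pos;
}
```

A receiver would call `view_bin32`, then check `is_aligned(ptr, 64)` before handing the pointer to a vectorized library, falling back to a copy into an aligned buffer if the middleware delivered the message at an unaligned address.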
Thoughts or suggestions?
Thanks,
-Greg