Absolutely on both points. We cannot perform any sort of migration or upgrade without versioning and extension mechanisms.
With a good serialisation mechanism we can also have multi-language support.
The more I think about this, the more I'm convinced that inserting NVRAM between DRAM and SSDs does not work very well when you consider the page cache.

The page cache exists in DRAM and will likely continue to do so, especially for reads. Below the page cache we have block devices. One of the major distinctions with most types of NVRAM is that they are byte addressable, rather than just block addressable. As I mentioned previously, I think the case for NVRAM in this context as a write buffer to SSDs/HDDs makes the most sense, rather than as a full layer.
When considering the syscall overhead, and the required copies, I think this could be more hurtful than helpful. Take a normal write, for example: we copy from user space to the page cache, and the OS would then need to copy again from the page cache to another address range for the NVM. These copies need to be coherent and would wash the L3 cache as a result. This will also invalidate the private L1 and L2 caches due to the inclusive nature of the Intel x86 cache design. A raw syscall can be fast, but when it involves locking a page in the page cache plus the copy it costs many hundreds of nanoseconds, which is at least as expensive as the NVRAM access itself. Having byte rather than block level access reduces the potential surface area for contention from pages down to cache lines. O_DIRECT does not help in this context: languages like Java cannot open files with O_DIRECT, and Linus really hates it. Plus with O_DIRECT we have all the alignment and block issues exposed to the user, to the extent they might as well deal with the NVRAM directly.

A more efficient design would be to have an NVRAM address range that we deal with as mapped memory. "Transactions" are serialised into this memory by fronting it with the CPU cache, which combines the word-level writes, and the TX is committed with PCOMMIT plus fences as appropriate. This makes a strong case for having a flyweight design over the NVRAM that can write or read without copying. Avoiding the copies will be critical to preserving the efficiency of the L3 and the private caches, unless we constrain this memory traffic with CAT (Intel Cache Allocation Technology). By taking a memory-mapped approach and using flyweights we can avoid at least two copies and work with the existing cache sub-system rather than against it. Copies are super fast, but it is easy to forget the impact they have on the efficiency of the cache by evicting other critical code and data.

A flyweighting mechanism that validates alignment and other concerns, and that also provides versioning and extension, feels like the best fit. We would need language/compiler/tooling support to do this elegantly. All of this with no user "willy nilly" writing all over it :-)
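To make the flyweight idea concrete, here is a minimal sketch of what it might look like in Java over a mapped region. This assumes a DAX-mounted pmem file at an illustrative path; the class, field names, and offsets are hypothetical, and MappedByteBuffer.force() merely stands in for the CLWB/PCOMMIT-plus-fence sequence that plain Java cannot issue directly.

import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical flyweight over a mapped file standing in for an NVRAM range.
// Fields are written in place at fixed offsets, so no intermediate copy is
// made and only the touched cache lines are dirtied.
public final class TradeFlyweight
{
    private static final int VERSION_OFFSET   = 0;
    private static final int TIMESTAMP_OFFSET = 4;
    private static final int PRICE_OFFSET     = 12;
    private static final int QUANTITY_OFFSET  = 20;
    public  static final int RECORD_LENGTH    = 24;

    private MappedByteBuffer buffer;
    private int offset;

    public TradeFlyweight wrap(final MappedByteBuffer buffer, final int offset)
    {
        this.buffer = buffer;
        this.offset = offset;
        return this;
    }

    public TradeFlyweight version(final int version)
    {
        buffer.putInt(offset + VERSION_OFFSET, version);
        return this;
    }

    public TradeFlyweight timestamp(final long timestamp)
    {
        buffer.putLong(offset + TIMESTAMP_OFFSET, timestamp);
        return this;
    }

    public TradeFlyweight price(final long price)
    {
        buffer.putLong(offset + PRICE_OFFSET, price);
        return this;
    }

    public TradeFlyweight quantity(final int quantity)
    {
        buffer.putInt(offset + QUANTITY_OFFSET, quantity);
        return this;
    }

    public void commit()
    {
        // Stand-in for flushing the dirty lines and committing to the
        // persistence domain (CLWB/PCOMMIT plus fences on real hardware).
        buffer.force();
    }

    public static void main(final String[] args) throws IOException
    {
        try (FileChannel channel = FileChannel.open(
            Paths.get("/mnt/pmem/trades.dat"), // illustrative DAX-mounted path
            StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE))
        {
            final MappedByteBuffer region =
                channel.map(FileChannel.MapMode.READ_WRITE, 0, 1 << 20);

            new TradeFlyweight()
                .wrap(region, 0)
                .version(1)
                .timestamp(System.nanoTime())
                .price(102_500)
                .quantity(300)
                .commit();
        }
    }
}

The fluent setters write straight through to the mapped region, so the "serialisation" is just the CPU cache combining the stores before the commit makes them durable.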
Also, these still don't have the endurance of DRAM, so allowing userland processes to write all over them willy-nilly is going to be a major liability.
Hi Martin,

At the lowest levels, where we're simply solving the problem of how something in kernel space gets exposed to user space, I agree with what you're saying. But even in systems programming languages like C, people have a hard time understanding what the heck to do with a memory-mapped file unless they have previous experience with them.
So after solving the problem of how to expose persistent memory to user space, rather than just telling programmers "Okay, you have a big range of load/store persistence -- have fun!", we started building libraries on top of the mapped areas. These libraries provide memory allocation, transactions, optimized ways to flush changes to persistence, etc. So that's a bit better, and yet the APIs we invented still seem too low-level for Java. I think we need to abstract persistent memory into the Java garbage-collected object model to make it less error-prone. But I am far from a Java expert, so that's why I'm looking for more brainstorming in this area.
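To give the brainstorming something concrete to poke at, here is a rough sketch of the shape such a Java abstraction might take. To be clear, these interfaces and method names are entirely made up for illustration and are not from any existing library; they just show allocation plus failure-atomic transactions layered over a mapped region, roughly the services the lower-level libraries already provide.

// Purely illustrative: none of these types exist in any real library. They
// sketch how persistent memory might surface in Java as allocation plus
// failure-atomic transactions, rather than raw load/store over a mapped range.
public final class PersistentMemorySketch
{
    interface PersistentHeap
    {
        // Allocate a region of persistent memory and return a typed view of it.
        PersistentBuffer allocate(int sizeInBytes);

        // Apply the writes as one failure-atomic unit: on return the updates
        // have been flushed to the persistence domain; on a power failure
        // part-way through, none of them are visible after recovery.
        void transaction(Runnable writes);
    }

    interface PersistentBuffer
    {
        void putLong(int offset, long value);
        long getLong(int offset);
    }

    // Example: move 100 units between two fields without ever exposing a
    // half-updated state to a crash.
    static void transfer(final PersistentHeap heap, final PersistentBuffer account)
    {
        heap.transaction(() ->
        {
            account.putLong(0, account.getLong(0) - 100); // debit
            account.putLong(8, account.getLong(8) + 100); // credit
        });
    }
}

Whether the buffer view above should instead be a real garbage-collected object graph is exactly the open question.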