For C#, besides struct or value type support, there is also a supported way to do "union" (http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.structlayoutattribute.aspx). Struct without union is less useful. Personally I am not a fan of multi-dimensional arrays. Of course, union with an array address will bring us into the "unsafe" world of the off-GC heap.
There are other problems with proper array-style indexer support in Java: [] cannot be overloaded (C# supports this), and of course the "int" array size in the "big memory" era (C# has the same problem).
Talking about memory, it would be cool at the VM level to support a virtually allocated array where the memory is committed on access or usage.
Well, I did not change my mind that much since the last discussion. I think your API is useful as long as no VM-backed "struct" equivalent is present in Java. Since the introduction of native "struct" support is urgently necessary, I think we will get it some day.
- According to my benchmarks, it's not only the layout of memory but also the need for lots of dereferencing that hurts cache locality.
- I have implemented a bytecode-instrumentation-backed struct emulation (it still lacks documentation, sanity checks, etc., so it's unreleased):
https://code.google.com/p/fast-serialization/wiki/StructsIntroduction
If you look at the benchmarks (https://fast-serialization.googlecode.com/files/structiter.html) you'll see that using (pseudo)pointers brings the real performance kick, as it is not necessary to access different memory locations in order to iterate a data structure or parts of it. In fact, some benchmarks using structs are up to 3 times faster than their on-heap equivalents, mostly because one can reduce the need for dereferencing using (safe) pointers. The slowness of some "off-heap" (structs) access benchmarks is due to the insane code generated behind the scenes to relocate the actual memory an object is using.
It's not meant as a general solution (that would require VM backing); it's just a giant hack to overcome the lack of structs in Java.
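To illustrate the (pseudo)pointer idea generically (this is my own sketch, not the fast-serialization API): records live back-to-back in one flat buffer, and iteration just advances an offset, so there is no per-element dereference to a separate heap location.

import java.nio.ByteBuffer;

final class FlatPointSum {
    static final int RECORD_SIZE = 8; // two ints per "struct": x at offset 0, y at offset 4

    // Iteration advances an offset (the "pointer") through one flat buffer.
    static long sumX(ByteBuffer flat, int count) {
        long sum = 0;
        int p = 0;
        for (int i = 0; i < count; i++, p += RECORD_SIZE) {
            sum += flat.getInt(p);
        }
        return sum;
    }
}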
As far as I understand, your API lacks the following advantages compared to a real struct language extension:
- no interprocess communication/message encoding by just using copyMemory (this would also require some kind of void*-like cast, e.g. "MyStructPointer p = (MyStructPointer) bytes[]")
- no reduction in the number of object references. Using the struct approach, embedded objects are second-class regarding GC. It should be possible to make this manageable, like C# did. Even if OpenJDK had Zing's C4 collector, a program would still profit from a reduction in allocation/GC work
- no deterministic memory layout to work with, as each VM may choose a different representation.
- no rewriting of embedded objects
> For C#, besides struct or value type support, there is also a supported way to do "union" (http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.structlayoutattribute.aspx). Struct without union is less useful. Personally I am not a fan of multi-dimensional arrays. Of course, union with an array address will bring us into the "unsafe" world of the off-GC heap.
The fact that this is an array of actual objects (and not of structs) changes things. And not just in small subtle ways.
For example, since members are proper objects, the equivalent of the C/C++/C# union part is quite simple. Member object contents access can (and probably should usually) be done with accessor methods, and your accessor methods can decide to interpret and manipulate the member object's state in any way you want, including overlapping interpretations that are very useful for things like packet headers for varying protocols, or simply for conserving space.
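As a rough illustration of the accessor-based "union" idea (my own sketch, not code from the ObjectLayout project), one stored word can be exposed through two overlapping views:

class HeaderWord {
    private long bits; // one 64-bit slot shared by both "views"

    // View 1: header-style fields carved out of the word
    int version()     { return (int) (bits >>> 60) & 0xF; }
    int totalLength() { return (int) (bits >>> 32) & 0xFFFF; }

    // View 2: the raw word, e.g. for a different protocol or for checksumming
    long rawWord()       { return bits; }
    void rawWord(long w) { bits = w; }
}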
Similarly, since sub-dimensions (array.getSubArray()) are also proper StructuredArray objects (even though they can be intrinsically embedded in other StructuredArrays with no indirection), you can do things like include rows of an array in other collections while still maintaining optimal (flat) array layout even for multiple dimensions. Again, something you can't quite do with arrays of structs.
> There are other problems with proper array-style indexer support in Java: [] cannot be overloaded (C# supports this), and of course the "int" array size in the "big memory" era (C# has the same problem).
As you can see, StructuredArray provides semantic support for long indexes. We're not trying to mess with language features, and certainly not with operator overloading, focusing on defined functionality instead. StructuredArray<T> is simply a generic container class with accessor methods, e.g. T t = array.get(x);
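A rough usage sketch (the Point element class is made up, and the factory signature is from memory, so check the ObjectLayout javadoc for the current API):

// Hypothetical element class, for illustration only
public class Point {
    private long x, y;
    public void set(long x, long y) { this.x = x; this.y = y; }
}

// A flat array addressed with a long index, larger than any int-indexed Java array could be
StructuredArray<Point> points = StructuredArray.newInstance(Point.class, 3_000_000_000L);
points.get(2_999_999_999L).set(3, 4);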
> Talking about memory, it would be cool at the VM level to support a virtually allocated array where the memory is committed on access or usage.
We haven't really tried to define or capture the various sparse and on-demand-allocation representations some people in the HPC world often use. John Rose's Arrays 2.0 talk covers variations of those in some detail. However, for such sparse array variations to include proper object members (that can point to other objects, for example) would introduce "interesting" practical problems we would still need to think a lot about. Maybe a fourth useful form when we figure it out.
On Thursday, July 18, 2013 1:31:16 AM UTC-7, Sand Stone wrote:
--
You received this message because you are subscribed to a topic in the Google Groups "mechanical-sympathy" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/mechanical-sympathy/9PNuQKuWVa4/unsubscribe.
To unsubscribe from this group and all of its topics, send an email to mechanical-symp...@googlegroups.com.
Funny you should say that. I recently read a paper on a queue implementation requiring exactly this trick, where you have a union of adjacent data and pointer and you want to CAS both in one op (assuming the address is compressed and fits in an int):
class Node {
    private static final long DATA_OFFSET;

    static {
        // a crude way of finding out if I can fit a ref into an int
        int refScale = UnsafeAccess.UNSAFE.arrayIndexScale(Object[].class);
        if (refScale != 4) {
            throw new RuntimeException("this is not going to work...");
        }
        try {
            DATA_OFFSET = UnsafeAccess.UNSAFE.objectFieldOffset(Node.class.getDeclaredField("data"));
        } catch (NoSuchFieldException e) {
            throw new RuntimeException(e);
        }
    }

    // one 64-bit field holding both the (compressed) object reference and the int counter
    long data;

    Object object()           { return UnsafeAccess.UNSAFE.getObject(this, DATA_OFFSET); }
    int counter()             { return UnsafeAccess.UNSAFE.getInt(this, DATA_OFFSET + 4); }
    void object(Object o)     { UnsafeAccess.UNSAFE.putObject(this, DATA_OFFSET, o); }
    void counter(int counter) { UnsafeAccess.UNSAFE.putInt(this, DATA_OFFSET + 4, counter); }
}
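One thing the snippet doesn't show is the combined update the post alludes to; a hypothetical method (my addition, not from the original) inside Node could CAS the whole 64-bit word, updating the compressed reference and the counter in one op:

    // Hypothetical addition inside Node: CAS the whole 64-bit word in one operation.
    boolean casDataAndCounter(long expectedWord, long newWord) {
        return UnsafeAccess.UNSAFE.compareAndSwapLong(this, DATA_OFFSET, expectedWord, newWord);
    }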
________________________________
From: Kirk Pepperdine <ki...@kodewerk.com>
> And now we're talking about adding Structs to continue futzing up the line between the programmer's API, or what the classes are responsible for, and what the JVM is responsible for.
> ... we used a special class known as ClusterBucket. The class has special support built into the runtime to ensure things were packed in a way that made sense for the hardware we were running on.
On 2013-07-18, at 7:34 PM, Martin Thompson <mjp...@gmail.com> wrote:
> For me what is missing from Java is arrays of types that are not just primitives... Arrays of objects are potentially much more useful for data structures.
I'm with you on the quality of GC we get from the standard JVMs. You should try other runtimes like Mono, JavaScript, PHP, Python, etc. to see how GC can be even worse! :-)
I'm not sure your "struct" solves all the issues. Take IPC for example. Rather than introduce structs, add putOrderedX(), compareAndSetX(), and getAndAddX() atomics to ByteBuffer and now you have everything you need for an IPC implementation using memory-mapped files, and it is pure Java with no language changes. For IPC we need to extend the memory model; it is not about structures. Just map a flyweight over the buffer to access your structures. This would be a trivial addition to the core libraries. I've been meaning to get a discussion going on this on the concurrency-interest list.
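To make the flyweight-over-a-mapped-buffer idea concrete, here is a minimal sketch (mine, using only existing APIs; the proposed putOrderedX/compareAndSetX methods on ByteBuffer do not exist today, and the file path, field names, and offsets are made up):

import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;

class TradeFlyweight {
    private static final int PRICE_OFFSET = 0;    // long
    private static final int QUANTITY_OFFSET = 8; // int
    static final int SIZE = 16;

    private ByteBuffer buffer;
    private int baseOffset;

    TradeFlyweight wrap(ByteBuffer buffer, int baseOffset) {
        this.buffer = buffer;
        this.baseOffset = baseOffset;
        return this;
    }

    long price()             { return buffer.getLong(baseOffset + PRICE_OFFSET); }
    void price(long value)   { buffer.putLong(baseOffset + PRICE_OFFSET, value); }
    int quantity()           { return buffer.getInt(baseOffset + QUANTITY_OFFSET); }
    void quantity(int value) { buffer.putInt(baseOffset + QUANTITY_OFFSET, value); }

    public static void main(String[] args) throws Exception {
        try (RandomAccessFile file = new RandomAccessFile("/tmp/ipc-demo", "rw");
             FileChannel channel = file.getChannel()) {
            // Map a shared file and lay the flyweight over it; another process can map the same file.
            ByteBuffer shared = channel.map(FileChannel.MapMode.READ_WRITE, 0, 1 << 20)
                                       .order(ByteOrder.nativeOrder());
            TradeFlyweight trade = new TradeFlyweight().wrap(shared, 0);
            trade.price(101_250L);
            trade.quantity(500);
        }
    }
}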
I get to profile a lot of applications and have to agree that encoding is one of the largest performance hits. However, it is often not an issue, even for some pretty extreme applications, if they use a binary format. When using binary formats it is usually a modelling or data-normalisation issue rather than encoding/decoding. The big issue is encoding and decoding to Strings. The Java libraries for this are quite simply crap. They turn everything into UCS-2 strings, yet most protocols are ASCII, and they generate huge amounts of garbage in the process. For encoding/decoding the common types in typical protocols I can usually beat what Java has to offer by 10-40X. Why don't we have an efficient AsciiBuffer class with the ability to read and write primitives???

I've generally not found an issue saturating a 10GigE network connection from Java. Data needs to be framed with sympathy for the underlying stack, and a lot of the default kernel tunables for Linux require attention. The lack of understanding of networking among mainstream programmers is staggering.
The goals Gil and I have for this are quite modest. If we have better control of memory layout then we can build much better-performing collections and other data structures. Not every programmer needs to understand how the intrinsics work, but they can benefit from significant performance improvements when their Maps, Trees, and other goodies get compacted and accelerated.
See inline comments. -Rüdiger
> The goals Gil and I have for this are quite modest. If we have better control of memory layout then we can build much better-performing collections and other data structures. Not every programmer needs to understand how the intrinsics work, but they can benefit from significant performance improvements when their Maps, Trees, and other goodies get compacted and accelerated.
No problem with your approach; I am just convinced there exists a solution covering a larger part of the problem domain, instead of a fraction. Maybe you will find ways to extend your existing design.
Not underestimating your experience at all. Anyone trying to saturate IB is right on the frontier.
I was just illustrating the common problems I see. At the extremes I encode/decode with Unsafe and deal with some of the largest feeds so I have empathy. This requires not just a native binary method of encoding but also alignment, striding and branching considerations which I'm sure you well know :-)
Martin,
Am Freitag, 19. Juli 2013 17:27:21 UTC+2 schrieb Martin Thompson:
> Not underestimating your experience at all. Anyone trying to saturate IB is right on the frontier.
Actually we don't have to saturate it in order to meet requirements, but I tried and it was not possible. Most of the benchmarks focus on how to get byte[] throughput; however, filling these bytes with structured data (de-/encoding) in a manageable way is often more of an issue. Especially decoding without object allocation is hardly doable if you want to keep a manageable business-code layer.
> I was just illustrating the common problems I see. At the extremes I encode/decode with Unsafe and deal with some of the largest feeds so I have empathy. This requires not just a native binary method of encoding but also alignment, striding and branching considerations which I'm sure you well know :-)
There are some very interesting blogs out there covering this ;-) In practice it's pretty hard to actually make use of locality in Java: usually the data to be encoded comes in right from the business layer and is scattered across the heap, so you get cache misses anyway. Additionally, locality is very fragile in Java; tiny changes break it easily. However, a more strategic approach to increasing locality showed much better results (use open-addressed hash maps, copy tiny objects directly into the container, etc.). This is why I think you should focus on keeping locality when objects are mutated in ObjectLayout. High locality after instantiation will help, but, well, objects change over time.

You are invited to provide a faster serialization than mine (without cutting corners). I have experimented somewhat with alignment and hot-field placement, but the results were minimal. The major issue is the hash-map lookup needed to track references to identical objects. Flat primitives + strings can be encoded very fast using Unsafe and the x86 number layout (I always confuse big- and little-endian). Bench: https://fast-serialization.googlecode.com/files/result-v1.13.html. However, mechanical sympathy is not used that much there. Additionally, object size also matters: if I can put more objects per UDP packet it raises effective throughput as well, so alignment can be counterproductive overall.
Absolutely, for decoding you need to use a flyweight or callback pattern. Object allocation in this path is too expensive. My measurements have shown this is not due to the actual decoding but more to do with cache invalidation from washing the newly allocated objects through the cache.
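As a concrete illustration of the callback-style, allocation-free decode idea (my own sketch with a made-up message layout, not from any of the posted projects):

import java.nio.ByteBuffer;

interface TradeHandler {
    void onTrade(long price, int quantity); // primitives only: nothing allocated per message
}

final class TradeDecoder {
    static final int MESSAGE_SIZE = 12; // 8-byte price + 4-byte quantity

    // Walks every message in the buffer and hands the fields straight to the handler.
    static void decode(ByteBuffer buffer, int messageCount, TradeHandler handler) {
        int offset = buffer.position();
        for (int i = 0; i < messageCount; i++, offset += MESSAGE_SIZE) {
            handler.onTrade(buffer.getLong(offset), buffer.getInt(offset + 8));
        }
    }
}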
You've hit on my main motivation for the ObjectLayout work. Based on measurements I've found that cache misses when using maps and other structures are the biggest performance bottleneck. On output I do not generate objects but instead write events to the outgoing buffer via a flyweight pattern to keep the cache hot. Encoding an event in binary format may take a few tens of instructions, but a single cache miss can cost many hundreds of lost-opportunity instruction cycles.
I like how you refer to the entire "encoding/decoding" process as not just the bit twiddling to fill a buffer, but the whole path of locating an object, getting its representation into a buffer hot for sending to an IO device, and vice versa. :-)
The way I see it, there will forever be a conversion between On-heap and Off-heap. It is, by definition, unavoidable when Off-heap storage is used.

Objects (things inherited from Object) can never reside in Off-heap memory. So things that are Off-heap are by definition not Objects. A different way to think about it is that the heap is not a range of memory addresses. It is the collection of all places in which something inherited from Object could possibly exist. So if you ever found a way to put an Object somewhere outside of the current heap, all you'd be doing is expanding the heap to also include that place, placing it under the heap's management, and losing the ability to do Off-heap things in it...

The things you put Off-heap are opaque bit-buckets as far as the On-heap stuff is concerned, and as far as the JVM is concerned. The two data spaces will never hold shared data, and there will always be a conversion step if the same data is represented in both On-heap and Off-heap spaces. The conversion may be "fast" (as in just copying the data with no interpretation), but it will still be there as a logical step for various simple reasons (like endian-ness).
Those opaque bit-buckets that are Off-heap memory are capable of providing the same controls and lifecycle management capabilities that a C/C++ environment does for all of memory, but no more. They currently have a very thin layer of (Java-accessible) library code to manipulate them with, if you don't count JNI. As a result, Off-heap memory is usually used in a fairly static or very-slow-moving form (as in statically allocated storage, mapped files, and occasionally allocated large buffers). You could certainly build good "commonly used" libraries to manipulate that data space as its own heap (malloc/free, new/delete, retain/release, ARC), and build layers and layers of Java-accessible code that would give you C/C++/Objective-C heap semantics for that bit-bucket space, but for some (good, in my opinion) reason this practice has not emerged as a widely used one. I.e. where those libraries exist, they usually are not leveraged outside a specific project or small set of people.
Most of the "leveraged by others" libraries I've seen used for Off-heap manipulation (and there aren't many of those) expose one or more of the following three behaviors:1. Convenient and explicit data layout control2. Fast data manipulation capabilities3. Some high level collection-style APIs and large storage capacity#1 is usually motivated by Java code communicating with non-Java-data stuff, including wire protocols, hardware and OS buffers, and non-Java processes, all of which require a fixed, well agreed on data structures for the bits involved. Many people roll their own here (with variations of flyweights or other wrappers for opaque bit buckets), and some people have placed actual libraries out there that may be leverage-able by others to save them work. Some infrastructure work that may evolve into a future standard or de-facto standard ways to do this on JVMs is also being done (e.g. IBM PacketObjects).
#2 is usually motivated by speed, and is distinguished by its use even when no non-Java code is interacting with the data. There are good reasons people do this (they can measure the benefit), but I always look at this practice (when only Java code ever accesses the data) as an indication of missing on-heap performance capabilities, as the code would probably have used on-heap stuff if the same (or very close) performance were possible there. This is where I draw my main motivation for ObjectLayout. As I see it, where ObjectLayout produces good, high-performance data manipulation capabilities on-heap, code that would have needed to move to off-heap data purely for speed reasons will be able to remain on-heap instead, and will be able to benefit from real Object capabilities and the more commonly understood Java programming idioms, not to mention the ability to leverage all that third-party code that doesn't seem to understand non-Object-derived data.
#3 is usually motivated by wanting to move data volume off the heap (usually to collection stuff like caches, K/V storage, hash tables, etc.) in order to reduce pressure on GC and avoid the other negative impacts associated with GC in most JVMs (pausing, inefficiency). I call these libraries "EMS-based" collection frameworks. As you all probably know by now, my opinion is that this category will simply go away over time, as GC is now a provably solved problem both from a pause time and efficiency perspective (Zing being a practical, available-right-now existence proof point). While annoyingly pausing and non-efficient collector implementations are still commonplace right now in other server JVMs, I fully expect badly behaving GCs to be weeded out over the next decade, making workaround memory management solutions like these as rare for Java as EMS became after Windows95 came out.
So from an Off-heap perspective my ObjectLayout work is focused on category #2 above, since I see #3 as a non-problem (I've already solved that one), and I see #1 as a separate problem worth solving (by someone else).

From an On-heap perspective (which is where most of the Java world lives), I see ObjectLayout as a way to improve performance, period. That's because [currently] most people faced with the choice between living with the performance overhead (of extra indirection and non-streaming memory access) and moving to non-Object-based data simply choose to stay with lower-performing Object-based data. I'm hoping ObjectLayout gives them some ways to enjoy both.
> You've hit on my main motivation for the ObjectLayout work. Based on measurements I've found that cache misses when using maps and other structures are the biggest performance bottleneck. On output I do not generate objects but instead write events to the outgoing buffer via a flyweight pattern to keep the cache hot. Encoding an event in binary format may take a few tens of instructions, but a single cache miss can cost many hundreds of lost-opportunity instruction cycles.

Encoding cost still matters: copyMemory to put an int array is 5 to 10 times faster than a Java-written loop converting little- to big-endian.
Will the intrinsic backing of ObjectLayout be available as an OpenJDK patch or for Zing only?
...
There is some middle ground: structs are actually not off-heap! Consider the following very common design pattern of "value classes":

class Person [extends Struct] {
    String name;
    TimeRange contractValidity;
    Date birth;
    ...
}

It is obvious that locality for this class is likely to be poor (especially if it has been mutated several times).
... Business code operating on this would likely cause several cache misses. Additionally, a lot of per-object memory is used and GC overhead is created (regardless of how efficient the GC implementation is). ObjectLayout will help here, but only if the "Person" is immutable.
On Sunday, July 21, 2013 3:24:40 AM UTC-7, Rüdiger Möller wrote:

> ... There is some middle ground: structs are actually not off-heap! Consider the following very common design pattern of "value classes":
>
> class Person [extends Struct] {
>     String name;
>     TimeRange contractValidity;
>     Date birth;
>     ...
> }
>
> It is obvious that locality for this class is likely to be poor (especially if it has been mutated several times).

This is a good example for discussion.

First, it raises a good point about one of the usage limitations of ObjectLayout. When ObjectLayout things provide a flat layout in which some object contains other objects, as in StructuredArray, the references from the container to the contained objects must by definition remain immutable (and, in fact, must be knowable at container allocation time). This in no way means that the contained objects need to be immutable (I expect the opposite to be common), but it does mean that using StructuredArrays with immutable element types will have relatively limited use. The same will likely be true in the other use cases (such as the struct-in-struct use case we haven't put together yet).

As a result, the above example, in which all member objects under Person are represented as mutable references to immutable types, will not fit well into ObjectLayout things, just as it won't fit well into any other struct-like thing. However, with ObjectLayout you would probably represent this sort of thing using immutable references to mutable objects. And since container objects will be full-blown objects, you can mix and match embedded and non-embedded objects. For example:

class Person {
    String fullNameThatIsRarelyUsed;
    final MutableString name = StructuredObject.newEmbeddedInstance(MutableString.class, 128, "Placeholder");
    final MutableTimeRange contractValidity = StructuredObject.newEmbeddedInstance(MutableTimeRange.class);
    final MutableDate birth = StructuredObject.newEmbeddedInstance(MutableDate.class);
    ...
}

If all you need is a good, completely flat value object and you are willing to converse about its contents with only scalar values, flyweights and other access-method-based things already provide that ability on top of objects, even without any ObjectLayout stuff. See the Union example I posted here earlier.

It's when you want to be able to converse about the members in more-complex-than-scalar form, e.g. pass that birth date thing around to other code without having it understand the containing Person structure, that the struct vs. Object question really comes into play. I expect most concerns would be the same for either: the thing's contents will need to be mutable either way if you actually want to change them, and they can remain immutable [only] if they are truly final values from the Person structure's point of view.
I guess my take right now is that passing those things around as objects presents more value than the space savings that having a struct type would give. I see little semantic benefit to structs that goes beyond the "save the wasted header space" argument.

> ... Business code operating on this would likely cause several cache misses. Additionally, a lot of per-object memory is used and GC overhead is created (regardless of how efficient the GC implementation is). ObjectLayout will help here, but only if the "Person" is immutable.

The above example would take care of the cache misses. It will not remove the per-object memory overhead, and depending on the intrinsic implementation it may or may not remove the header gaps between the member fields (affecting locality). We've already thought up some quirky intrinsic ways to keep the headers away from the bodies in some cases if we really wanted to, some of which are practically performance-neutral (paid for with extra memory that is never accessed and doesn't make it into the cache), but I doubt we'll go there in the near future. Having header gaps between flatly laid out instances is probably a secondary (mostly cache locality) concern.

> ... Additionally, a lot of per-object memory is used and GC overhead is created (regardless of how efficient the GC implementation is). ...

I need to point out that the GC overhead worry is not really an issue in my view. Or shouldn't be. GC efficiency is being traded off against pause times right now, and (almost) nothing else. For ALL current collectors available and actually used in JVMs, the GC efficiency math is easy to control without resorting to flattening out objects (note I'm not talking about pauses, but about efficiency): for every doubling of empty memory in the heap, you roughly double the efficiency of GC for the exact same code, data, and data layout. This is such a powerful (and cheap) lever that it makes GC overhead concerns around the cost of traversing references arbitrarily small. Of course, if your collector exhibits stop-the-world behavior that grows with heap size, this may present a problem, but it's a problem Zing doesn't have, and it's something I'd expect all collectors to learn to do right over the next decade+.
So yes, GC markers will spend energy and effort traversing the references in live objects. But the good ones will do this in background threads that do not affect the latency of other work. The amount of time spent on this can be made so minuscule as a percentage of overall work in the system that optimizing it further (e.g. by programming better structures) can always be made a seventh-order concern. For most of our Zing customers, for example, total cycles spent on GC amount to less than 3% of the total cycles spent by the entire JVM process (that's the level we usually recommend they size their heap for, and in many logs I see it's less than 0.5%). And since none of that GC work time is spent in a stop-the-world pause, application threads just don't care or feel it. When performance matters, people will happily burn a few extra GB of empty memory to keep those efficiency numbers where they want them. Unless all your CPUs are saturated it's a non-issue. And even then, you still control the % of system CPU time spent on GC by deciding how much empty memory to give it.
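(A rough back-of-the-envelope illustration of that lever, with numbers of my own purely for intuition: with a 2 GB live set, each GC cycle's marking/copying work is roughly proportional to those 2 GB regardless of heap size, while the number of cycles needed for a given amount of allocation is inversely proportional to the empty memory available. Going from a 4 GB heap, i.e. 2 GB empty, to a 6 GB heap, i.e. 4 GB empty, therefore roughly halves the number of cycles, and so roughly halves total GC CPU, for the same application and allocation rate.)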
If you are using ByteBuffer.order(ByteOrder.nativeOrder()) then this is not such an issue.
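A tiny sketch of that point (my own illustration):

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

class NativeOrderDemo {
    public static void main(String[] args) {
        // With the buffer in the platform's native order, primitive puts/gets need no byte
        // swapping on x86, avoiding the little-/big-endian conversion cost discussed above.
        ByteBuffer buf = ByteBuffer.allocateDirect(64).order(ByteOrder.nativeOrder());
        buf.putLong(0, 123_456_789L);
        System.out.println(buf.getLong(0));
    }
}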
Martin and I have been working together on something we've been calling a StructuredArray, which is part of an "ObjectLayout" project that is now up on github. I think the concept is becoming mature enough to have other people start throwing mud at it. So here goes...

First, a bit of background: StructuredArray originated with a friendly argument Martin and I were having some time in 2012. Martin was rightfully complaining about Java's lack of ability to represent the common "array of structs" data layout available in C/C++/C# and other languages, and about the inefficiency of array objects and all other available collections that arises directly from both indirection costs and the lack of a striding pattern to stream on. Martin was advocating the common position that some form of structs (and arrays of them) needs to be added to the language as a first-class citizen. I was taking the contrarian view (as I often do for fun), claiming, "instinctively," that there has to be a way to address this problem through carefully stated semantics captured in regular "vanilla" Java class form, which could then be intrinsified by future JDK versions, but would not require ANY language changes. After all, java.util.concurrent and things like AtomicLong evolved in exactly that way in the pre-Java-5 days. My take was that rather than a new struct thing, we can get regular Java objects to play the needed role, and that the actual solution needs to come in the form of restricting semantics (through carefully selected Java class contracts), not of expanding them with new language features or JVM specs.

Somehow, we both decided to actually explore this notion. Martin started off by capturing StructuredArray in his code-examples area on github, and I went off and brainstormed with other Azul engineers on how a JVM would/could intrinsify something that would provide the needed value. Together, we evolved StructuredArray over several months, discovering fundamental issues that needed to be worked around as we went, as well as fundamental properties that make StructuredArray much more powerful and useful than an array of structs could ever be (IMHO).

For example, we deduced early on that a StructuredArray should not have a public constructor, with instances created by static factory methods instead. This distinction is driven by a fundamental issue: in Java, all constructed objects have their allocation site (the new bytecode) occur before their initialization and construction code ever sees its parameters. This would make it "hard" to intrinsify the construction of a variable-sized object like a StructuredArray, as the needed allocation size is not known at "new" time. A factory method, on the other hand, can easily be intrinsified such that allocation can consider the parameters that control the array size.

We further discovered that with first-class Java objects as members, liveness logic can be made significantly different from arrays-of-structs, and that this Java-natural liveness behavior makes things "click" together elegantly (I credit Michael Wolf with coming up with that one). In arrays-of-structs-like things, the container must remain logically "live" as long as any logical reference to a member of it exists. This is necessary since "structs" are usually not full-blown objects, and most struct-supporting environments lack the ability to track individual struct liveness.
However, when container members are full-blown objects, they can also have (and actually "prefer" to have) individual liveness, which means that there is no need for live member objects to implicitly keep their containers alive. This realization not only makes an intrinsic implementation much simpler, it also makes StructuredArray much more generically useful. E.g. a member of a StructuredArray can also be placed in a HashMap or some other collection, and there is no new mess to deal with if it is.

At some point I started feeling that this "Challenge Accepted!" experiment was showing enough promise that we may actually build something based on it. We moved the code to the newly formed ObjectLayout project on github, and kept hacking at it. We added multi-dimensional support. We added generic construction support for member objects (so that e.g. even objects with final fields can be natural members of StructuredArrays), and we went back and forth on various implementation details.

StructuredArray was made part of a project we called ObjectLayout because it is not the only example of an intrinsically optimized object layout opportunity. You can think of StructuredArray as the first of three (so far identified) forms of object layout abstractions that can be useful for intrinsically improving memory layout in Java, all without requiring any language changes. The other two forms, not yet captured or completely figured out, are "structs with structs inside" (Objects with final object members that are initialized with a factory call) and "struct with array at the end" (Objects that inherit from arrays). For now, I'm focusing on StructuredArray both because I think it's the most useful form, and because [I think] I know exactly how to make a JDK intrinsify it, both in Zing and in future OpenJDKs.

My intent is to mature the vanilla definitions of StructuredArray as open source code, and to demonstrate their value with both fully supported Zing JDKs and some experimental vanilla-OpenJDK derivatives that would include intrinsic StructuredArray implementations, laying it out as a flat "array of object structures" in the heap while maintaining the exact same class contract as the vanilla implementations. When we can show and demonstrate the value of doing things "this way", and after working out the kinks, I would hope to drive it into some future Java SE version (10?) by properly upstreaming it into the OpenJDK project.

Before you jump in and look through the JavaDoc and code, it's probably important to note what ObjectLayout is about, and what it is NOT about:

ObjectLayout is focused on in-heap, POJO objects. Its goal is to allow normal Java objects and classes (including existing classes) to benefit from optimized heap layout arrangements, without restricting their ability to participate in other object graphs (both by referring to other objects and by being referred to from other objects), and without restricting their ability to act as first-class citizens and be passed around to various libraries that would expect that ability (meaning that participating objects can still be synchronized on, for example, and can still have identity hash codes).

The ObjectLayout "vanilla" Java class implementations (such as the current StructuredArray) are NOT intended to provide the same performance and layout benefits as their eventual intrinsic implementations will.
Instead, they are intended to provide a fully functional implementation of the contract represented by the class, such that code will run portably across all (Java SE 6 and above) JDKs, while being able to "scream" on newer JDKs that recognize and intrinsify them (think of AtomicLong running on Java SE 5 vs. "vanilla" Java 1.4.2).

ObjectLayout does NOT make any attempt to deal with the various needs for off-heap structured access. It is orthogonal to other work done for that purpose.

ObjectLayout does NOT try to pack memory or save memory. The way I see it, memory is super-cheap and abundant, to the point where our real problem is finding good creative ways to waste more of it in order to gain real benefit (not that ObjectLayout wastes much memory). So instead, ObjectLayout is focused on reducing indirection, improving locality, and exposing regular striding patterns to hardware (or software) prefetchers.

So here it is... Have at it. Comments (both sane and otherwise) are welcome:
-- Gil.
Hello, I attended the talk about this subject this week and I really liked it. Though I'm not able to contribute to it, I wonder if it's possible to avoid the ConstructorMagic thread-local? Sorry if this is slightly off-topic. I think I found something relevant in my favorite book, API Design: we could apply the "friend code" trick if we make StructuredArray final and move newInstance() to some friend "Accessor" class. Is it worth looking into, or is there a bummer here?
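For what it's worth, a very rough sketch of the general shape of that "friend factory" idea (class names are mine, not ObjectLayout's, and this glosses over the subclassing concerns that ConstructorMagic actually addresses):

// All names here are hypothetical.
public final class StructuredThing {
    private final long length;

    StructuredThing(long length) { // package-private: only same-package code can construct one
        this.length = length;
    }

    public long getLength() { return length; }
}

// (in a separate file, same package)
public final class StructuredThings { // the "friend" accessor/factory
    // The only public way to get an instance, so the size parameter reaches the
    // constructor directly instead of via a ThreadLocal hand-off.
    public static StructuredThing newInstance(long length) {
        return new StructuredThing(length);
    }
}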
public class Octagons extends StructuredArray<Octagon> {
    public Octagons() {
    }

    public Octagons(Octagons source) {
        super(source);
    }
}