Can anything be assumed about memory layout of composite types?

85 views
Skip to first unread message

Erik Engheim

unread,
Dec 14, 2012, 6:18:48 PM12/14/12
to julia...@googlegroups.com
If you put only bittypes and composite types made up of bittypes inside a composite type, will an instance be one contiguous  block of memory? Or is every member of a composite type a pointer to some other memory location? The julia manual hints at this under the chapter about types:

 An instance of Point{Float64} can be represented compactly and efficiently as an immediate pair of 64-bit values

But I would have liked a bit more details about this. E.g. what about fields which are arrays. Would they always be pointers? I was thinking of making a 3x3 matrix type, and thought it would be nice to keep it compact. My first idea was to just make a type which contains an array with 2 dimensions. Alternatively making 9 fields named m11, m12, m13 ... m33. Of course this might all be stupid given the built in vector and matrix functionality. I just wanted some way of enforcing that a 3x3 matrix should be used. My idea is to create a 2D sprite library. So 3x3 matricies will be used a lot. So I thought it might make sense to represent that explicitly in the type system.

I do not have a good feel for what idiomatic julia should look like so this might be a bad idea. If there are better ways of handling this, please tell me.

Viral Shah

unread,
Dec 15, 2012, 1:00:48 AM12/15/12
to julia...@googlegroups.com
If all you want is a type that holds 3x3 arrays, then making a new type that contains a 3x3 array should suffice. Do you want to have multiple arrays within your new type?

-viral

Toivo Henningsson

unread,
Dec 15, 2012, 1:47:55 AM12/15/12
to julia...@googlegroups.com


On Saturday, December 15, 2012 7:00:48 AM UTC+1, Viral Shah wrote:
If all you want is a type that holds 3x3 arrays, then making a new type that contains a 3x3 array should suffice. Do you want to have multiple arrays within your new type?

I think that we might eventually want to have an NArray type where the shape is a type parameter, e.g. NArray{Float,(3,3)} would be a type that would always hold a 3x3 Float array. Sometimes you do want small fixed size arrays, and I can't help thinking that there must be some penalty associated with using an Array for these kinds of things. Until we have an NArray type, I think people will be going back and forth between using an Array, and the uglier solution of hardcoding the specific elements, e.g. m11;m12;m13; m21;m22;m23; m31;m32;m33

Erik Engheim

unread,
Dec 15, 2012, 3:27:57 AM12/15/12
to julia...@googlegroups.com
Just one array. But I was think arrays were probably pointers, so that each instance of my type would contain just a pointer instead of the actual array data directly. The problem I see with that is that I intend to have a long array of these 3x3 matricies. But this data will not be in contigeous memory then. That would be bad for cache.

I am starting to come down to the idea that I am probably thinking about this the wrong way. My idea was that a 3x3 matrix indicated position and orientation of sprites on the screen. I should perhaps just use one large matrix to represent ALL the sprites. It would probably be better for performance as I can update position of ALL sprites in one big matrix operation.

Viral Shah

unread,
Dec 15, 2012, 4:20:09 AM12/15/12
to julia...@googlegroups.com
Arrays of bitstypes should be contiguous - like say an array of integers or doubles. An array of arrays is probably not going to be contiguous. You could have it all in one large array that is stuffed into a new type. Then, define methods like ref and assign for convenient manipulation, while retaining the ability to do matrix operations on the whole matrix too.

Do you expect the need for SIMD instructions for performance?

-viral

--
 
 

Toivo Henningsson

unread,
Dec 15, 2012, 3:55:26 PM12/15/12
to julia...@googlegroups.com


On Saturday, December 15, 2012 9:27:57 AM UTC+1, Erik Engheim wrote:
Just one array. But I was think arrays were probably pointers, so that each instance of my type would contain just a pointer instead of the actual array data directly. The problem I see with that is that I intend to have a long array of these 3x3 matricies. But this data will not be in contigeous memory then. That would be bad for cache.

I am starting to come down to the idea that I am probably thinking about this the wrong way. My idea was that a 3x3 matrix indicated position and orientation of sprites on the screen. I should perhaps just use one large matrix to represent ALL the sprites. It would probably be better for performance as I can update position of ALL sprites in one big matrix operation.

That's probably a good idea from the memory layout point of view. You could have an Array of size (3,3,nsprites),
which would lay out your data as consecutive 3x3 matrices contiguously in memory.

The one-big-operation thing is not such a big concern in julia; for loops are pretty fast.
I guess some layouts might allow to do operations on all 3x3 matrices in one big BLAS operation, which might be even faster, but that's more than I know of.

Erik Engheim

unread,
Dec 15, 2012, 7:53:58 PM12/15/12
to julia...@googlegroups.com

That's probably a good idea from the memory layout point of view. You could have an Array of size (3,3,nsprites),
which would lay out your data as consecutive 3x3 matrices contiguously in memory.

Yeah, I was thinking that might be an approach.
 

The one-big-operation thing is not such a big concern in julia; for loops are pretty fast.
I guess some layouts might allow to do operations on all 3x3 matrices in one big BLAS operation, which might be even faster, but that's more than I know of.


I am a bit confused about when I should use built in matrix or vector operations. I assume they use BLAS. I did some tests. First I tested using built in vector operations:

# Each vector was 100000 elements
function test1(a::Vector{Float64}, b::Vector{Float64}, c::Vector{Float64})
         for i = 1:500
           a = a .* b .* c
         end
       end

And then by calculating the the result directly in a for loop:

function test2(a::Vector{Float64}, b::Vector{Float64}, c::Vector{Float64})
         for i = 1:500
           for j = 1:length(a)
             a[j] = a[j] * b[j] * c[j]
           end
         end
       end 

 The last version with the for loop was fastest. Which in a way makes sense, from a cache miss point of view. With the built in vector and matrix operations you have to visit the same memory locations multiple times, or you have to allocate big temporary chunks of memory to store the partial result. All this would be avoided using a for loop.

So I don't get how doing big batch calculations can be any fast. If you do lots of matrix multiplications do you not get the problem with needing to store lots of temporary results? With a for loop all the temporary results from calculations could be kept in CPU registers. It is only the final result you have to store in main memory.

If I was doing this in C++ and was trying to get max performance, I would probably interleave the data used together in calculations and then use vector processing on that and store the results in place. By interleaving you would get the a, b,  and c in my example pulled in with the same cacheline. But tried to simulate interleaving by using SubArrays but that did not work very well at all. It just made it slower.
Reply all
Reply to author
Forward
0 new messages