Trouble With C#

Will Iam

Sep 26, 2013, 4:32:14 PM

to yeppp-...@googlegroups.com

I am looking for a little guidance here in the use of Yeppp.

I've backed my problem out to a simple example as I will describe below.

This is a C# project in Visual Studio 2010 on Windows 7. The test project and the Vector library target .NET 3.5. The underlying Yeppp core is the 32-bit x86 version.

I am using a library that defines a Vector class with the usual mathematical operators.

I then tried:

class Program
{
    static void Main(string[] args)
    {
        int reps = 1000000;
        double[] aD = new double[3] {1.0, 2.0, 3.0};
        double[] bD = new double[3] {2.0, 0.5, 3.0};
        double[] cD = new double[3] {0.0, 0.0, 0.0};
        double[] tD = new double[3] {0.0, 0.0, 0.0};

        Vector aV = new Vector(aD);
        Vector bV = new Vector(bD);
        Vector cV = new Vector(cD);

        for (int i = 0; i < reps; i++)
        {
            cV = aV + bV + cV;
        }

        for (int i = 0; i < reps; i++)
        {
            Yeppp.Core.Add_V64fV64f_V64f(aD, 0, bD, 0, tD, 0, 3);
            Yeppp.Core.Add_V64fV64f_V64f(tD, 0, cD, 0, cD, 0, 3);
        }
    }
}

When I do the above I get a DataMisalignedException at the first Yeppp.Core.Add_V64fV64f_V64f call. I don't understand why this happens or how to fix it.

To highlight my confusion, if I flip the two for loops and do the Yeppp one first and the "Vector" based one second, then they both work fine and the code executes as I would expect.

So there is obviously some subtlety going on here, but it is beyond my current knowledge. Any suggestions or guidance would be appreciated.

The ultimate goal is to integrate the Yeppp calls into the underlying library that the Vector (and other constructs) use, to take advantage of the speed increase (about 2-2.5x per my little test here).

Thank you,

Will

Marat Dukhan

Sep 26, 2013, 9:17:31 PM

to Will Iam, yeppp-...@googlegroups.com

Hi Will,

Yeppp! requires that all types are naturally aligned (i.e. short/ushort numbers are aligned on 2 bytes, int/uint/float on 4 bytes, and long/ulong/double on 8 bytes). This requirement holds for all C runtimes (malloc'ed memory is always aligned on 2*sizeof(void*), i.e. at least 8 bytes), and I assumed that it holds for the .NET Framework, which was validated by my tests. Natural alignment is important both for performance and portability:

1. Misaligned loads on x86 (say, a load of a double which is not aligned on 8 bytes) are slower than aligned loads.
2. If the array is naturally aligned, it is possible to process the first few elements until the address of an element becomes aligned on 16 or 32 bytes. This matters for SIMD processing, because SIMD registers are 16 or 32 bytes wide and aligned loads/stores for SIMD registers are much faster than misaligned ones. Besides that, some types of store operations are supported only for aligned pointers.
3. Some architectures do not allow misaligned loads and generate a hardware exception when such an operation is attempted. Examples are MIPS, Xeon Phi (for SIMD instructions), and ARM before ARMv6.
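To make the 8-byte requirement concrete, here is a minimal sketch of how one might check whether a double[] is naturally aligned from C#. The AlignmentProbe class and its method name are hypothetical and not part of Yeppp!; the sketch only uses GCHandle from the .NET Framework to pin the array and read the address of its first element.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper for illustration only; not part of Yeppp!.
static class AlignmentProbe
{
    // Returns true if the first element of the array currently sits on an
    // 8-byte boundary. The array is pinned only for the duration of the check.
    public static bool IsNaturallyAligned(double[] data)
    {
        GCHandle handle = GCHandle.Alloc(data, GCHandleType.Pinned);
        try
        {
            long address = handle.AddrOfPinnedObject().ToInt64();
            return address % sizeof(double) == 0;
        }
        finally
        {
            handle.Free(); // unpin so the GC may move the array again
        }
    }
}
```

The result describes the array only at the moment of the check: once the handle is freed, the garbage collector may relocate the array, and in a 32-bit process the new address is not guaranteed to keep the same alignment.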

However, I just found a post on Stack Overflow which suggests that small double/long/ulong arrays (with less than 1000 elements) are not guaranteed to be aligned on 8 bytes on 32-bit systems.

So, here is what happens in your use case: in Yeppp! 1.0 all compute functions require naturally aligned pointers and return a YepStatusMisalignedPointer error if this requirement is violated. The Yeppp! CLR bindings, in turn, convert the YepStatusMisalignedPointer status to a DataMisalignedException. Since you use the CLR bindings with small arrays, the CLR sometimes allocates the arrays in such a way that they are not aligned on 8 bytes, which causes the Yeppp! kernels to fail. The CLR bindings then surface this failure to your C# program as a DataMisalignedException.
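One way a caller could tolerate this failure is sketched below, assuming the exception is thrown before any output is written. SafeAdd and the AddKernel delegate are hypothetical names; the delegate is merely shaped like the Yeppp.Core.Add_V64fV64f_V64f call in the original post, so the sketch compiles without the Yeppp! assembly.

```csharp
using System;

// Hypothetical wrapper for illustration only; not part of Yeppp!.
static class SafeAdd
{
    // Parameter order mirrors the call in the original post:
    // (x, xOffset, y, yOffset, sum, sumOffset, length).
    public delegate void AddKernel(double[] x, int xOffset, double[] y, int yOffset,
                                   double[] sum, int sumOffset, int length);

    // Tries the fast kernel first; if it rejects the arrays as misaligned,
    // computes the same element-wise sum with a plain managed loop.
    public static void Add(AddKernel fastKernel, double[] x, double[] y, double[] sum, int length)
    {
        try
        {
            fastKernel(x, 0, y, 0, sum, 0, length);
        }
        catch (DataMisalignedException)
        {
            for (int i = 0; i < length; i++)
            {
                sum[i] = x[i] + y[i];
            }
        }
    }
}
```

If the binding's signature matches the delegate, a call might look like SafeAdd.Add(Yeppp.Core.Add_V64fV64f_V64f, aD, bD, tD, 3). Exceptions are expensive in .NET, so this only makes sense when misalignment is rare; in a tight loop over 3-element arrays the managed loop alone is likely the better choice.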

I will change the requirements for Yeppp! functions on 32-bit systems to accept arrays which are not aligned on 8 bytes. As a temporary workaround for your project I would suggest using the 64-bit version of Yeppp! (which is also much faster, as Yeppp! 1.0.0 does not include optimized implementations for 32-bit x86). BTW, I don't expect Yeppp! to be efficient on small arrays; it is optimized for arrays of 100 elements or more.
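To see where the crossover lies for a given workload, a rough timing harness along the following lines could be used. AddTiming, ManagedAdd and TimeMs are hypothetical names; the sketch relies only on System.Diagnostics.Stopwatch and a plain managed loop as the baseline, and makes no claim about the numbers it will report.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing harness for illustration only; not part of Yeppp!.
static class AddTiming
{
    // Plain managed element-wise addition, used as the baseline.
    public static void ManagedAdd(double[] x, double[] y, double[] sum)
    {
        for (int i = 0; i < sum.Length; i++)
        {
            sum[i] = x[i] + y[i];
        }
    }

    // Runs `reps` calls of `add` on zero-filled arrays of the given length
    // and returns the elapsed wall-clock time in milliseconds.
    public static double TimeMs(Action<double[], double[], double[]> add, int length, int reps)
    {
        double[] x = new double[length];
        double[] y = new double[length];
        double[] sum = new double[length];

        add(x, y, sum); // warm-up call so JIT compilation is not measured

        Stopwatch watch = Stopwatch.StartNew();
        for (int r = 0; r < reps; r++)
        {
            add(x, y, sum);
        }
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds;
    }
}
```

Calling AddTiming.TimeMs(AddTiming.ManagedAdd, 3, 1000000) gives the baseline for 3-element arrays. The same harness can time the library by passing a lambda such as (x, y, s) => Yeppp.Core.Add_V64fV64f_V64f(x, 0, y, 0, s, 0, x.Length), again assuming the signature used in the original post, and repeating the measurement for lengths such as 3, 100 and 10000.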

Regards,

Marat

Will Iam

Sep 27, 2013, 11:50:49 AM

to yeppp-...@googlegroups.com, Will Iam

Marat,

First, let me thank you for producing the library and releasing it into the wild. I forgot to start my first post with that.

Second, sorry that I left the "!" off the name ;)

Third, thank you for the thorough and detailed response. I knew that it was pretty questionable to use Yeppp! for the size of arrays that I'd generally be working with, and that's why I was doing a little bit of testing to see what the impact would be with the typical data sizes that I expect my libraries to work with. As you predicted, it isn't efficient enough for such small arrays to make up for the additional interfacing and overhead.

Also, in my case, switching to the 64-bit version isn't possible long term because I need to interface with a host of other 32-bit DLLs.

Thank you again, and if I transition to doing larger-scale numerics I'll definitely revisit it.

-Will
