You can write it, but you would have to create a module with IL, and
then compile that into your assembly (or just write the whole assembly).
Also, what optimizations do you think you can make? Ultimately, you
suffer from the fact that Windows is not a real-time OS, and nothing you can
do will change that. On top of that, the JIT is what's going to optimize
your code again after you try to, so you might actually end up hurting
yourself more than helping yourself.
If you post the code you are trying to optimize, we can try and tell you
where you might make some improvements, but dipping down to the IL level is
most likely not going to help you much.
--
- Nicholas Paldino [.NET/C# MVP]
- m...@spam.guard.caspershouse.com
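One way to try out a few hand-picked IL instructions without building a separate IL module is to emit them at run time. This is a different technique from the ilasm route described above, shown only as a minimal sketch: it uses System.Reflection.Emit.DynamicMethod, and the names HandWrittenIl, BinaryOp and BuildAdd are made up for illustration. The JIT still compiles whatever IL you emit, so the caveats about JIT optimization apply unchanged.

```csharp
using System;
using System.Reflection.Emit;

public static class HandWrittenIl
{
    // Delegate type matching the emitted method: int Add(int, int).
    public delegate int BinaryOp(int a, int b);

    // Builds a method whose body is four hand-chosen IL instructions.
    public static BinaryOp BuildAdd()
    {
        DynamicMethod method = new DynamicMethod(
            "Add", typeof(int), new Type[] { typeof(int), typeof(int) });

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0); // push the first argument
        il.Emit(OpCodes.Ldarg_1); // push the second argument
        il.Emit(OpCodes.Add);     // pop both, push their sum
        il.Emit(OpCodes.Ret);     // return the value on top of the stack

        return (BinaryOp)method.CreateDelegate(typeof(BinaryOp));
    }

    public static void Main()
    {
        BinaryOp add = BuildAdd();
        Console.WriteLine(add(2, 3));
    }
}
```

The emitted body can be swapped for any instruction sequence you want to compare against what the C# compiler produces for the same logic.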
"Peter Olcott" <olc...@att.net> wrote in message
news:7t%rf.37887$QW2.9998@dukeread08...
"Nicholas Paldino [.NET/C# MVP]" <m...@spam.guard.caspershouse.com> wrote in
message news:OCvd7unC...@TK2MSFTNGP09.phx.gbl...
Not at all. When the JIT gets hold of your IL, it is free to perform
any optimizations it deems necessary, and those might not be in
line with what you are expecting.
My recommendation would be to use Managed C++ to create a wrapper to
your unmanaged code which uses It Just Works (IJW, seriously). You should
get a managed interface, and the best possible performance (for this
specific situation, not all situations) between managed and unmanaged code.
--
- Nicholas Paldino [.NET/C# MVP]
- m...@spam.guard.caspershouse.com
"Peter Olcott" <olc...@att.net> wrote in message
news:_51sf.37898$QW2.37853@dukeread08...
"Nicholas Paldino [.NET/C# MVP]" <m...@spam.guard.caspershouse.com> wrote in
message news:%23RjK3$oCGHA...@TK2MSFTNGP14.phx.gbl...
You don't understand a fundamental concept to .NET and CIL. Yes, there
are compilers that will perform optimization of IL to a certain degree.
However, when the managed code is run, the CLR will take the CIL and
then compile it into native code. At this point in time, it is free to
optimize, or not optimize, or mangle your code in any way it wants when
making the transition from CIL to native code.
When you are dealing with assembly language in general, you have
complete control of what is going on, memory allocation, deallocation,
execution, etc, etc. With the CLR, this is taken out of your hands to a
degree.
For example, have you considered, what happens when a Garbage Collection
(GC) occurs while your function is running? If it is in complete managed
code, then there is nothing that you can do about it, and your function will
resume running when the GC is complete. Depending on what is happening on
the machine at the time, combined with what your program is doing, etc, etc,
it is very feasible that your code will take more than 1/10th of a second.
Just because it looks like assembly language, don't assume that CIL is
assembly language. There are some very different things going on under the
hood.
--
- Nicholas Paldino [.NET/C# MVP]
- m...@spam.guard.caspershouse.com
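Whether a collection actually interrupted a given call can be checked rather than argued about. The following is a minimal sketch, assuming a stand-in workload (AllocateALot is hypothetical and only exists to provoke collections); it reads GC.CollectionCount before and after the call and times the call with Stopwatch. The names GcProbe, Work and Measure are illustrative.

```csharp
using System;
using System.Diagnostics;

public static class GcProbe
{
    public delegate void Work();

    // Runs work once and returns the elapsed wall-clock time in ms.
    // 'collections' receives how many garbage collections were counted
    // for generation 0 while the work was running.
    public static long Measure(Work work, out int collections)
    {
        int before = GC.CollectionCount(0);
        Stopwatch watch = Stopwatch.StartNew();
        work();
        watch.Stop();
        collections = GC.CollectionCount(0) - before;
        return watch.ElapsedMilliseconds;
    }

    // Hypothetical stand-in workload: allocates many short-lived arrays.
    public static void AllocateALot()
    {
        for (int i = 0; i < 100000; i++)
        {
            byte[] scratch = new byte[1024];
            scratch[0] = 1;
        }
    }

    public static void Main()
    {
        int collections;
        long ms = Measure(AllocateALot, out collections);
        Console.WriteLine("elapsed: {0} ms, collections: {1}", ms, collections);
    }
}
```

Passing the real function instead of the stand-in shows whether it stays inside a 1/10 second budget and whether any collection happened while it ran.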
"Peter Olcott" <olc...@att.net> wrote in message
news:xj2sf.37906$QW2.34052@dukeread08...
I don't care about these things; they are not affecting my performance. What is
affecting my performance are things such as the compiler failing to inline my
function's code, and unnecessary overhead in the translation of a switch
statement. My function will always be executed several million times every
second. It must run concurrently with other applications.
>
> For example, have you considered, what happens when a Garbage Collection
> (GC) occurs while your function is running? If it is in complete managed
> code, then there is nothing that you can do about it, and your function will
> resume running when the GC is complete. Depending on what is happening on the
> machine at the time, combined with what your program is doing, etc, etc, it is
> very feasible that your code will take more than 1/10th of a second.
My code knows exactly how much memory it needs at load time. It needs all of
this memory the whole time that it executes. It would make no sense to have any
garbage collection of my code's memory in this case. I want my code to be
implemented as a .NET component.
You are missing the point completely. If you implement your code as
managed code, even in IL, you can't stop a GC no matter what. Your thread
is going to be pre-empted (in most situations, except if you have the GC
running on a separate thread) and it WILL stop and it WILL affect your
performance when it happens.
Just because your code knows how much memory it needs doesn't mean that
you can pre-empt a GC. If it happens, it's going to happen, and there is
nothing you can do about it. Your 100-line function isn't going to be able
to stop it, and the CLR isn't going to care what your function is doing.
You can't just pretend it's not going to happen. It does, and it will,
and you can't stop it. This isn't a choice you have if you are running
managed code, whether you do it in IL or not.
This is what it means to have ^managed^ code. The CLR is going to
provide a good number of services, but you are going to have to pay for
them, and should be aware of how they impact your code.
This is why I recommended that you use interop with your unmanaged code.
You will have your performance requirements fulfilled, and not have to worry
about doing something that will ultimately be self-defeating.
--
- Nicholas Paldino [.NET/C# MVP]
- m...@spam.guard.caspershouse.com
"Peter Olcott" <olc...@att.net> wrote in message
news:GV3sf.37927$QW2.2410@dukeread08...
I don't care if a GC occurs during the execution of my code. It can't occur
because of my code because my code knows how much memory it needs in advance and
always needs all of this memory the whole time that it is executing. If some
other process interrupts my code, it won't hurt it. My code just can't take more
than 1/10 second to execute, since it needs to execute every second this will
limit its use to 10% of the CPU time. Ultimately I want to limit my thread's
required execution to no more than 10% of the CPU time. This means that this one
function can't take more than 1/10 second to execute.
>
> Just because your code knows how much memory it needs doesn't mean that you
> can pre-empt a GC. If it happens, it's going to happen, and there is nothing
> you can do about it. Your 100-line function isn't going to be able to stop
> it, and the CLR isn't going to care what your function is doing.
>
> You can't just pretend its not going to happen. It does, and it will, and
> you can't stop it. This isn't a choice you have if you are running managed
> code, whether you do it in IL or not.
>
> This is what it means to have ^managed^ code. The CLR is going to provide
> a good number of services, but you are going to have to pay for them, and
> should be aware of how they impact your code.
>
> This is why I recommended that you use interop with your unmanaged code.
> You will have your performance requirements fufilled, and not have to worry
> about doing something that will ultimately be self-defeating.
I don't think that unmanaged code would make a good .NET component. The current
design requires the function to be implemented as a .NET component.
I highly recommend that you read up on how Garbage Collection works
exactly.
--
- Nicholas Paldino [.NET/C# MVP]
- m...@spam.guard.caspershouse.com
"Peter Olcott" <olc...@att.net> wrote in message
news:Hy4sf.37932$QW2.22826@dukeread08...
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndotnet/html/dotnetperftechs.asp
section -->
"Myth: Garbage Collection Is Always Slower Than Doing It by Hand"
"Nicholas Paldino [.NET/C# MVP]" <m...@spam.guard.caspershouse.com> wrote in
message news:%23YB$FaqCGH...@TK2MSFTNGP12.phx.gbl...
I already know this. One thing that it can not do is to reclaim memory that is
still in use. I remember reading the algorithm. It is some sort of aging system.
In any case even if my memory needs to be constantly checked to see if it is
still in use, I only need a single monolithic large block. One thing that I do
know about GC, is that it is ONLY invoked when memory runs out, and is needed,
otherwise it is never invoked.
"Peter Olcott" <olc...@att.net> wrote in message
news:Hy4sf.37932$QW2.22826@dukeread08...
"Pohihihi" <noe...@hotmail.com> wrote in message
news:e6$JcnqCG...@TK2MSFTNGP10.phx.gbl...
Well anyways, seems like you know the ground you are playing on so for the
answer for your original question, yes you can write the whole program in
IL. I will be interested in knowing how you optimized it.
"Peter Olcott" <olc...@att.net> wrote in message
news:4S4sf.37937$QW2.7702@dukeread08...
Of course you can; by now you may even have done it. Just write a few *.il
lines and pass them to ilasm (which comes with the SDK) and you'll get the
managed binary.
Nicholas has already tried explaining; I have just a few things to add.
1- If you look at the shared source implementation of .NET, i.e. SSCLI (aka
Rotor), you won't find a single *.il file that Microsoft devs had to write in
order to achieve better performance. The most they did to write fast .NET
code was to write unsafe C# code (which uses pointers). Beyond
that, in places that needed maximum efficiency, like the JIT, they used
pure C++ code and, in very few places, x86 assembly code. I think if
those developers could achieve anything significant by writing IL directly,
they would have done that.
2- The C# compiler is as clever in generating MSIL as anyone can possibly get. I
would recommend that you start by analysing the output of the C# compiler,
looking at the generated MSIL using ildasm (or your favourite IL disassembler), and
really see if you could have written better MSIL.
3- I heard somebody mention that there are indeed some MSIL instructions
that the C# compiler doesn't use. If faster code could be produced by using
those instructions, then maybe you have a chance. But I would not count on
this one.
regards,
Ab.
http://joehacker.blogspot.com
"Peter Olcott" <olc...@att.net> wrote in message
news:7t%rf.37887$QW2.9998@dukeread08...
<snip>
> 2- C# compiler is as clever in generating msil as anyone can possibly get. I
> would recommend that start analysing the output of the C# compiler by
> looking at the generated msil using ildasm (or your fav il disassembler) and
> really see if you could have written a better msil.
This is entirely untrue. I believe the MC++ compiler generates better
IL than the C# compiler, for a start. For a second thing, I remember a
post a long time ago by someone who wanted to be able to embed IL in
his C#. We asked him whether he was sure it would produce a performance
improvement, and he produced his hand-tweaked IL and the IL generated
by the C# compiler - and the hand-tweaked IL was about twice as fast.
Don't misunderstand me - I'm not recommending hand-tweaking IL - I'm
just saying that the C# compiler does *not* generate "perfect" IL.
--
Jon Skeet - <sk...@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
I hope that the new C#(2.0) compiler is better at this.
Thanks.
Ab.
"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
news:MPG.1e1b1a993...@msnews.microsoft.com...
It may well be better, although I don't know that it is. I doubt that
it's absolutely perfect though - that you could never, ever improve on
the code it generates.
It's a shame I can't find the newsgroup thread I'm thinking about - I
could give it a try with C# 2.0...
By the way, what search engine are you using? It may be shocking for some people, but
a few days back I was searching for a thread and, for the same keywords, Google
could not find the post but MSN Search (search.msn.com) did :)
Ab.
"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
news:MPG.1e1b21d3b...@msnews.microsoft.com...
> "Nicholas Paldino [.NET/C# MVP]" <m...@spam.guard.caspershouse.com> wrote in
> message news:u3hVLnqC...@TK2MSFTNGP10.phx.gbl...
>
>>Peter,
>>
>> I highly recommend that you read up on how Garbage Collection works
>>exactly.
>
>
> I already know this.
No, you obviously don't. The problem is not that the GC will take any
memory from your method, but that it is running in a separate thread. So
let's say your method executes in 1/10 s. When garbage collection occurs,
it will take way longer, simply because your method will halt in the
middle of something and resume when the GC is done. So from inside your
method, the execution time was still 1/10 s, but from the outside the
execution time is way longer (a little bit like the theory of relativity
:-) ).
Relying on the GC to do or not to do something is a capital sin in .NET.
> One thing that it can not do is to reclaim memory that is
> still in use.
Correct, it will not try to take your memory; however, it will stop your
thread if it wants to.
> I remember reading the algorithm. It is some sort of aging system.
> In any case even if my memory needs to be constantly checked to see if it is
> still in use, I only need a single monolithic large block.
Yes, and while it checks and sees that it is not allowed to touch your
monolithic large block, your method will pause and take longer than 1/10 s.
> One thing that I do
> know about GC, is that it is ONLY invoked when memory runs out, and is needed,
> otherwise it is never invoked.
One thing that you must know when developing managed code, is that you
*never* know when the GC is invoked.
Watch your performance counters for garbage collection. You'll be
surprised how busy they are :-)
[snip]
HTH,
Andy
--
To email me directly, please remove the *NO*SPAM* parts below:
*NO*SPAM*xmen40@*NO*SPAM*gmx.net
groups.google.com, which is what I always use for newsgroup posts.
(Note: not just "web google".) It'll be there somewhere, but I just
can't find it at the moment...
Andreas,
The only time the GC runs (un-forced) is when the creation of an object on
the GC heap would overrun the gen0 heap threshold. When this happens the GC
runs on the same thread as the object creator. That means that the GC won't
run as long as you don't create object instances.
Note that this assumes there is no external memory pressure when there are
extra GC heap segments allocated, this would force the CLR to start a full
collection.
Willy.
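The claim above can be observed directly: allocate the working buffer once, then run a loop that creates no objects and compare GC.CollectionCount before and after. This is only a sketch under assumptions: CountMatches is a hypothetical stand-in for a "looping and comparing" function, and other threads in the same process could still allocate and trigger a collection, so the printed count is an observation, not a guarantee.

```csharp
using System;

public static class PreallocatedScan
{
    // Counts elements equal to target. Uses only locals and the existing
    // buffer, so the method itself allocates nothing on the GC heap.
    public static int CountMatches(int[] buffer, int target)
    {
        int matches = 0;
        for (int i = 0; i < buffer.Length; i++)
        {
            if (buffer[i] == target) matches++;
        }
        return matches;
    }

    public static void Main()
    {
        int[] buffer = new int[1000000]; // allocated once, up front
        for (int i = 0; i < buffer.Length; i++) buffer[i] = i % 10;

        int before = GC.CollectionCount(0);
        int total = 0;
        for (int pass = 0; pass < 100; pass++) total += CountMatches(buffer, 3);
        int during = GC.CollectionCount(0) - before;

        Console.WriteLine("matches: {0}, collections during scan: {1}", total, during);
    }
}
```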
Actually he is, that doesn't mean he generates "perfect" IL, whatever that
may be.
I ran a bunch of (things I keep around since v1.0) benchmarks comparing
managed C++/CLI and C# and found that the RTM build generates almost
identical IL where the previous C# version (even the v2 beta1) did generate
(some) less optimal IL. But even then the performance delta never exceeded
some 5%.
Note that comparing MC++ with C# isn't a fair comparison as the MC++
compiler may emit non verifiable IL, while C# cannot. I tried hand-tweaking
IL and came to the conclusion that it's a waste of time (at least in v2),
most of the time the performance gains are nil.
Willy.
> 2- C# compiler is as clever in generating msil as anyone can possibly get. I
False assumption when one examines the benchmarks of managed C++ against managed
C#, Managed C++ does significantly better in at least some cases.
Good to know, thanks. This is the kind of validation that I was looking for.
That is fine as long as my code does not take more than 1/10 second total. When
I called my code real-time, this was a close approximation of the truth, rather
than precisely true.
Be careful, if your function allocates that "block of memory" from the GC
heap, it may get pre-empted by the CLR to perform a GC. You could force a GC
run by calling GC.Collect() before you call the method, but this won't
necessarily prevent a GC when you allocate a very large object in that
method.
Willy.
Not sure where you get this from? Did you actually run such benchmarks?
I did run many benchmarks, since v1.0 comparing both C# and Managed C++ (and
C++/CLI) and I never saw a delta (yes, for some C# is the winner) larger
than 5%, using v2 they are even smaller.
Willy.
Why not post the C++ code? Are you sure that all of the facilities that
are available to you in that C++ code are available in .NET? Do you have to
call other APIs? Are you sure there are all managed equivalents?
I think it's time to either post the code, or let the thread die.
--
- Nicholas Paldino [.NET/C# MVP]
- m...@spam.guard.caspershouse.com
"Peter Olcott" <olc...@att.net> wrote in message
news:9ybsf.37951$QW2.24780@dukeread08...
But as a general statement, the C++ compiler generally has the best
optimizations (and for unmanaged code, with the new profile-guided
optimization, it's even cooler).
--
- Nicholas Paldino [.NET/C# MVP]
- m...@spam.guard.caspershouse.com
"Jon Skeet [C# MVP]" <sk...@pobox.com> wrote in message
news:MPG.1e1b36c8...@msnews.microsoft.com...
No, the article was definitely someone posting in this group saying, "I
want to be able to embed IL in my C# code, here's why." He then
produced some better IL (which I suspect *was* verifiable) which the C#
compiler "could" have produced from the source C# (i.e. the behaviour
was identical).
I'm sure this will improve over time, but to be honest it's usually the
JIT that has more to do with optimisation IMO.
> But as a general statement, the C++ compiler generally has the best
> optimizations (and for unmanaged code, with the new profile-guided
> optimization, it's even cooler).
Right.
I don't care about a GC before I begin running, or any other GC that I did not
invoke.
What do you mean by verifiable code?
I wouldn't think that this would be the case for two reasons:
(1) CIL (for the most part) forms a one-to-one mapping with assembly language
(2) End users are waiting on the JIT to complete, no time to waste doing
optimizations that could have been done before the software shipped.
Not any possible 100 lines of code, only the specific 100 lines of code that I
am referring to in my function. No OS calls. No memory management, just looping
and comparing.
--
- Nicholas Paldino [.NET/C# MVP]
- m...@spam.guard.caspershouse.com
"Peter Olcott" <olc...@att.net> wrote in message
news:Xaksf.38006$QW2.8948@dukeread08...
"Nicholas Paldino [.NET/C# MVP]" <m...@spam.guard.caspershouse.com> wrote in
message news:upn6lJ1C...@tk2msftngp13.phx.gbl...
I said it doubled the speed for one particular case, which was only
about four instructions. I wouldn't expect there to be much difference
(if any) normally.
How much more performance do you need? Have you tried doing the
conversion and seeing how it performs *without* tweaking?
I would advise that you first study the framework details. You want
to write IL yourself that would be better than the C# compiler's
generated output, and you don't yet know about verifiable code, details of the GC,
etc. I think you would still have to go through the complete CLR instruction
set in order to pick the best instructions to optimise the code. By now
you must have got the idea that in the .NET world (as opposed to C++) it's going
to be extremely difficult to find people who even occasionally hand-code *.il
files to achieve better performance.
Ab.
"Peter Olcott" <olc...@att.net> wrote in message
news:QNosf.38020$QW2.8997@dukeread08...
>
The fastest algorithm with the best compilation just barely meets my target.
This is with MS Visual C++ 6.0. The project requirements call for a .NET
component. If I could double the speed of this I would be very pleased. In any
case more recent compilers do not meet my target even with the best algorithm,
so I must do at least as well as the best compiler. This should only be a matter
of translating the generated assembly language from the best unmanaged code into
CIL.
Yet when a project with 16,000 hours invested can be made or broken by the speed
of a single 100 line function, this kind of optimization would be completely
appropriate. I am only at the feasibility study stage now. It definitely looks
feasible.
No - because CIL looks pretty different from assembly language, and
even if you generated similar-looking CIL somehow, there's no guarantee
that would then be JITted to the same assembly code.
The performance improvement you'll get from this step, if any, is
likely to be tiny (don't hang on to the idea of doubling the speed -
that was a very particular case) and an awful lot of work. I'd do the
initial conversion to C# and benchmark that *first*.
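A "convert first, then benchmark" pass does not need much machinery. The sketch below is one possible harness, not a definitive method: CountAbove is a hypothetical stand-in for the converted function, and BestOf makes one untimed warm-up call so that JIT compilation is not counted, then reports the best of several timed runs.

```csharp
using System;
using System.Diagnostics;

public static class QuickBench
{
    public delegate int Candidate(int[] data);

    // Hypothetical stand-in for the function under test: loops and compares.
    public static int CountAbove(int[] data)
    {
        int count = 0;
        for (int i = 0; i < data.Length; i++)
        {
            if (data[i] > 50) count++;
        }
        return count;
    }

    // Returns the smallest elapsed time in ms over 'runs' timed calls,
    // after one untimed warm-up call that absorbs the JIT compilation cost.
    public static double BestOf(Candidate candidate, int[] data, int runs)
    {
        candidate(data); // warm-up, not timed
        double best = double.MaxValue;
        for (int r = 0; r < runs; r++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            candidate(data);
            watch.Stop();
            best = Math.Min(best, watch.Elapsed.TotalMilliseconds);
        }
        return best;
    }

    public static void Main()
    {
        int[] data = new int[5000000];
        for (int i = 0; i < data.Length; i++) data[i] = i % 100;
        Console.WriteLine("best of 5: {0:F2} ms", BestOf(CountAbove, data, 5));
    }
}
```

Running the same harness over the plain C# conversion and over any tweaked variant gives a like-for-like number before any time is spent on IL.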
Not true, IL is kind of high level language compared to X86 assembly, one
single IL instruction translates to x assembly level instructions where x is
certainly not 1.
> (2) End users are waiting on the JIT to complete, no time to waste doing
> optimizations that could have been done before the softwae shipped.
>
Wrong again, IL is not optimized that much; THE optimizer is the JIT. It's
the JIT that knows at run-time what kind of optimizations can be performed
depending on the characteristics of the HW like CPU type (32-bit/64-bit),
number of registers, L1 and L2 cache sizes, MMX/SSE enabled etc.
The CLR is a run-time optimizing execution engine, whether you believe it or
not.
Willy.
If the fastest algorithm using C++ barely meets your target, you won't do any
better using IL, whether you "translate" x86 assembly to IL or not (which
isn't possible, as there is no direct mapping). The IL will be JIT
compiled at run-time, but don't expect it to translate to the same x86
machine code.
Willy.
You don't care! I see. I wonder why you started this thread if you don't
care about most of the good advice we gave.
Willy.
Well, they were wrong, for sure. Please post the URLs where you found this
kind of nonsense.
Willy.
Again, I ask of you, post the 100 line function, or let the thread die.
The fact that you have some 16,000 hours of development means absolutely
NOTHING to the CLR.
And yes, I am trying to say, in the nicest possible way, put up, or shut
up.
--
- Nicholas Paldino [.NET/C# MVP]
- m...@spam.guard.caspershouse.com
"Peter Olcott" <olc...@att.net> wrote in message
news:2uqsf.38024$QW2.33610@dukeread08...
Well of course I would do that first. The only reason that I am considering this
step is because with the unmanaged C++ compilers the change from 6.0 to 7.0
resulted in a doubling of the time. In other words the newer compiler produces
code that is only half as fast as the older compiler. The code that I am talking
about doesn't have any OS calls or memory management. It is just comparisons,
branches and movement of integers. The tweaking that I am talking about would
merely be to shorten the lengths of the execution paths. Fewer instructions in
the execution paths would have to result in at least somewhat faster code. If I
could cut down the weighted average length of the execution paths in half, this
would result in doubling of the speed. Most every simple instruction only takes
a single clock. Some instructions can now be paired to execute concurrently. I
will probably look into this sort of optimization as well. This would only
involve changing the order of some instructions. I figure that in the worst case
scenario I will be able to achieve the same speed as the best compiler. For
example if the best compiler is VC++ 6.0, unmanaged code, and if 2005 C#
compiler produces code that takes 500% longer to execute, I can force the .NET
code to execute just as fast as the VC++ 6.0 unmanaged code. I would only bother
to do this for this critical 100 line function. I actually expect to be able to
do better than this. I would expect to improve performance at least 50%.
>
>> (2) End users are waiting on the JIT to complete, no time to waste doing
>> optimizations that could have been done before the softwae shipped.
>>
>
> Wrong again, IL is not optimized that much, THE optimizer is the JIT. It's
The JIT probably does all the processor specific optimizations. These don't
affect performance nearly as much as the ones that are not processor specific.
Because much of this advice does not apply to my case. I do care about the CPU
time it takes this critical 100 line function to execute. If another process
interrupts this so that the real time is much longer than the CPU time, I don't
care. I used the term real-time somewhat misleadingly.
My goal is to at least match this fastest time. The project has a design
requirement to be implemented as a .NET component. I will probably also hand
tweak the assembly language from this fastest compiler VC++ 6.0, and then
attempt to match this performance in CIL. From all of this effort I expect to
improve the performance of the fastest compiler by at least 50%. Since this
critical function will be executed several million times every second, it will
be worth the cost of this extra effort at optimization.
The difference between VC++ 6.0 and VC++ 7.0 is 50%. The older compiler produces
much better code.
http://www.tommti-systems.de/go.html?http://www.tommti-systems.de/main-Dateien/reviews/languages/benchmarks.html
The above link shows that C# is about 500% slower on something as simple as a
nested loop.
--
- Nicholas Paldino [.NET/C# MVP]
- m...@spam.guard.caspershouse.com
"Peter Olcott" <olc...@att.net> wrote in message
news:Oqzsf.38050$QW2.17219@dukeread08...
No they are not. IL is based on a pure stack-based virtual machine execution
environment: it has no such thing as registers, it has no notion of a
real memory location, and it has no access to the runtime stack.
Just to give you an idea of what I'm trying to explain, consider the following C#
method and its compiler-generated IL method.
[C#]
static void Foo()
{
    int v = 0;
    int[] ar = new int[5] {0,1,2,3,4};
    for (int i = 0; i != 5; i++)
    {
        v += ar[i];
    }
}
//
[compiler generated IL]
.method private hidebysig static void Foo() cil managed
{
// Code size 39 (0x27)
.maxstack 3
.locals init (int32 V_0,
int32[] V_1,
int32 V_2)
IL_0000: ldc.i4.0
IL_0001: stloc.0
IL_0002: ldc.i4.5
IL_0003: newarr [mscorlib]System.Int32
IL_0008: dup
IL_0009: ldtoken field valuetype
'<PrivateImplementationDetails>{E21D91A1-F27C-4190-94E3-4FB17E12D29A}'/'__StaticArrayInitTypeSize=20'
'<PrivateImplementationDetails>{E21D91A1-F27C-4190-94E3-4FB17E12D29A}'::'$$method0x6000002-1'
IL_000e: call void
[mscorlib]System.Runtime.CompilerServices.RuntimeHelpers::InitializeArray(class
[mscorlib]System.Array,
valuetype [mscorlib]System.RuntimeFieldHandle)
IL_0013: stloc.1
IL_0014: ldc.i4.0
IL_0015: stloc.2
IL_0016: br.s IL_0022
IL_0018: ldloc.0
IL_0019: ldloc.1
IL_001a: ldloc.2
IL_001b: ldelem.i4
IL_001c: add
IL_001d: stloc.0
IL_001e: ldloc.2
IL_001f: ldc.i4.1
IL_0020: add
IL_0021: stloc.2
IL_0022: ldloc.2
IL_0023: ldc.i4.5
IL_0024: bne.un.s IL_0018
IL_0026: ret
} // end of method Tester::Foo
and here is what the JIT compiler actually generated from this (!! CPU
specific !!)
00cb0098 57 push edi
00cb0099 56 push esi
00cb009a ba05000000 mov edx,0x5
00cb009f b92a981579 mov ecx,0x7915982a
00cb00a4 e86b21c5ff call 00902214
00cb00a9 8d7808 lea edi,[eax+0x8]
00cb00ac be68204000 mov esi,0x402068
00cb00b1 f30f7e06 movq xmm0,qword ptr [esi]
00cb00b5 660fd607 movq qword ptr [edi],xmm0
00cb00b9 f30f7e4608 movq xmm0,qword ptr [esi+0x8]
00cb00be 660fd64708 movq qword ptr [edi+0x8],xmm0
00cb00c3 83c610 add esi,0x10
00cb00c6 83c710 add edi,0x10
00cb00c9 a5 movsd
00cb00ca 33d2 xor edx,edx
00cb00cc 8b4804 mov ecx,[eax+0x4]
00cb00cf 3bd1 cmp edx,ecx
00cb00d1 730b jnb 00cb00de
00cb00d3 83c201 add edx,0x1
00cb00d6 83fa05 cmp edx,0x5
00cb00d9 75f4 jnz 00cb00cf
00cb00db 5e pop esi
00cb00dc 5f pop edi
00cb00dd c3 ret
00cb00de e8fe453e79 call mscorwks!JIT_RngChkFail (7a0946e1)
00cb00e3 cc int 3
Now try for yourself to build an IL module from the assembly code, and
please make sure it compiles, is verifiable and runs as fast as the C#
generated IL above. Or try to tweak the IL so it translates into better
(faster) X86 code.
>>
>>> (2) End users are waiting on the JIT to complete, no time to waste doing
>>> optimizations that could have been done before the softwae shipped.
>>>
>>
>> Wrong again, IL is not optimized that much, THE optimizer is the JIT.
>> It's
> The JIT probably does all the processor specific optimizations. These
> don't affect performance nearly as much as the ones that are not processor
> specific.
>
Apart from the processor-specific optimizations (which are significant), it
performs most of the optimizations performed by a C/C++ compiler back-end
optimizer (both the C++ back-end optimizer and the JIT optimizer were
written by the same team). The only difference is that it happens at run-time,
so it is somewhat constrained by time, but this is largely compensated for by
the processor/memory-specific optimizations.
Check this link and see how managed code compares to unmanaged code at the
performance level.
http://www.grimes.demon.co.uk/dotnet/man_unman.htm
Willy.
Don't expect you can make it execute (50%) faster than unmanaged C++ code;
don't expect to hand-tweak ASM, translate that to IL, and have the JIT
compiler produce the same machine code - IT WON'T. Also, I'm not clear on
what you mean by executing several million times per second; in another
reply you said the function takes 10 msec to finish using VC6, and now you
are expecting it to execute millions of times per second.
Willy.
OK, now you are again comparing C++ (6.0) to C++ (7.1), while previously you
were comparing C++ to C#.
But this is also one of the claims you can't (or are not willing to?)
prove. Anyway, if it's true I would suggest you file a bug report
(http://lab.msdn.microsoft.com).
I can prove that the newest C++ compiler produces equal or better (faster)
code than VC6.
Willy.
That's right, guess it's time to let this thread die,
Willy.
Show me the source code.
>
>>>
>>>> (2) End users are waiting on the JIT to complete, no time to waste doing
>>>> optimizations that could have been done before the softwae shipped.
>>>>
>>>
>>> Wrong again, IL is not optimized that much, THE optimizer is the JIT. It's
>> The JIT probably does all the processor specific optimizations. These don't
>> affect performance nearly as much as the ones that are not processor
>> specific.
>>
>
> Apart from the processor specific optimizations (which are significant) it
> performs most of the optimizations performed by a C/C++ compiler back-end
> optimizer (both the C++ back-end optimizer and the JIT optimizer has been
> written by the same team), only difference is that it happens at run-time, so
> it is somewhat constrained by time, but this is largely compensated by the
> processor/memory specific optimizatons.
> Check this link and see how managed code compares to unmanaged code at the
> performance level.
> http://www.grimes.demon.co.uk/dotnet/man_unman.htm
>
>
> Willy.
>
>
http://www.tommti-systems.de/go.html?http://www.tommti-systems.de/main-Dateien/reviews/languages/benchmarks.html
The above link is much more telling. There is a 450% difference in performance
between C++ and C# for something as simple as nested loops. Also the difference
between optimized code and code compiled with optimization disabled can be at
least an order of magnitude. If there is a 450% difference in the performance on
something as simple as a nested loop, this shows that there is significant room
for improvement.
--
- Nicholas Paldino [.NET/C# MVP]
- m...@spam.guard.caspershouse.com
"Willy Denoyette [MVP]" <willy.d...@telenet.be> wrote in message
news:OFjwIX9C...@TK2MSFTNGP10.phx.gbl...
I hope to get the managed code to execute just as fast as my best unmanaged
code. I hope to improve the performance of my best unmanaged code by 50% by hand
tweaking the assembly language. The caller that executes this function executes
once in 10 ms. It executes this function several million times. The caller will
be executed once every second.
>
> Willy.
>
>
>
> code than VC6.
>
> Willy.
>
>
http://www.tommti-systems.de/go.html?http://www.tommti-systems.de/main-Dateien/reviews/languages/benchmarks.html
What about addressing this issue?
Willy.
That would take more time than I can afford to spend right now. I already did
that with native code 6.0 and 7.0. Even this is too slow. Because benchmarks
indicate that C# can be 450% slower on nested loops (my code is mostly nested
loops), I wanted to look into solving this problem in advance. This is just the
feasibility study stage. I might be forced to digress to unmanaged code.
>
> Willy.
>
>
>
>
What else do you want? I gave you the C# source code (the Foo method), its
corresponding IL and the x86 code produced by the JIT.
>
> http://www.tommti-systems.de/go.html?http://www.tommti-systems.de/main-Dateien/reviews/languages/benchmarks.html
> The above link is much more telling. There is a 450% difference in
> performance between C++ and C# for something as simple as nested loops.
> Also the difference between optimized code and code compiled with
> optimization disabled can be at least an order of magnitude. If there is a
> 450% difference in the performance on something as simple as a nested
> loop, this shows that there is significant room for improvement.
>
This very specific (but broken [1] and clueless) benchmark (the loop) is a
sample where the native C compiler's optimizer does a better job than the JIT
compiler/optimizer, but this has nothing to do with the IL code.
[1] This is the correct code, which is still ~50% slower than the (corrected)
C++ code ( __int64 x=0; ).
int a = 0, b = 0, c = 0, d = 0, e = 0, f = 0;
long x = 0;
startTime = DateTime.Now;
for (a = 0; a != n; a++)
    for (b = 0; b != n; b++)
        for (c = 0; c != n; c++)
            for (d = 0; d != n; d++)
                for (e = 0; e != n; e++)
                    for (f = 0; f != n; f++)
                        x += a + b + c + d + e + f;
C# with /o+
Nested Loop elapsed time: 10015 ms - 479232000000
C++ with /O2 /EHsc
Nested Loop elapsed time: 6171 ms 479232000000
Willy.
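For anyone checking the numbers above, here is a minimal C++ sketch of the same loop with a 64-bit accumulator, plus a closed form to verify the total. The function names are made up for illustration. The thread never states n; n = 40 is inferred from the printed total, since 3 * 40^6 * 39 = 479232000000. That total is far beyond the 32-bit signed maximum of 2147483647, which is consistent with the correction to a 64-bit `x` on the C++ side.

```cpp
#include <cstdint>

// Six nested loops over [0, n), accumulating into a 64-bit integer
// (the counterpart of C# `long` and MSVC `__int64`).
std::int64_t nested_loop_sum(int n) {
    std::int64_t x = 0;
    for (int a = 0; a != n; a++)
        for (int b = 0; b != n; b++)
            for (int c = 0; c != n; c++)
                for (int d = 0; d != n; d++)
                    for (int e = 0; e != n; e++)
                        for (int f = 0; f != n; f++)
                            x += a + b + c + d + e + f;
    return x;
}

// Closed form of the same sum: each of the six indices contributes
// n^5 * n(n-1)/2, so the total is 3 * n^6 * (n - 1).
std::int64_t nested_loop_closed_form(int n) {
    std::int64_t n6 = 1;
    for (int i = 0; i < 6; i++) n6 *= n;
    return 3 * n6 * (n - 1);
}
```

Comparing `nested_loop_sum(n)` against `nested_loop_closed_form(n)` for a small n is a cheap sanity check before timing the full run.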
| I already did
| that with native code 6.0 and 7.0. Even this is too slow. Because benchmarks
| indicate that C# can be 450% slower on nested loops (my code is mostly nested
| loops), I wanted to look into solving this problem in advance.
I would "solve" this problem by converting the code, doing some profiling &
timing on the converted code, and going from there! I would convert to C#
first; if C# proved to be too slow, I would then consider unsafe C#,
followed by a Managed C++ class library; as a last resort I would consider a
hand tweaked IL class library. My concern with hand tweaked IL would be what
happens when the JIT is updated with better optimization algorithms that
conflict with my hand tweaking, the 32-bit JIT behaves differently than the
64-bit JIT, or the JIT behaves differently based on processor...
Remember to time the C# code outside of the VS IDE and use a Release build,
as the C# compiler doesn't optimize Debug builds & the JIT compiler won't
optimize any code run under the IDE.
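The same measure-first approach applies to the native baseline being compared against. Below is a minimal timing sketch in C++ that runs a piece of work several times and keeps the best wall-clock time. The helper name `best_of_ms` and the repeat count are assumptions for illustration, not something from the thread.

```cpp
#include <chrono>
#include <limits>

// Runs `work` `repeats` times and returns the fastest run in milliseconds.
// Taking the minimum reduces noise from other processes on a non-real-time OS.
// Returns infinity if repeats is zero.
template <typename F>
double best_of_ms(F&& work, int repeats) {
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    double best = std::numeric_limits<double>::infinity();
    for (int i = 0; i < repeats; i++) {
        const auto start = clock::now();
        work();
        const std::chrono::duration<double, std::milli> elapsed =
            clock::now() - start;
        if (elapsed.count() < best) best = elapsed.count();
    }
    return best;
}
```

As with the C# side, this only means something on an optimized build run outside the debugger, and the work being timed needs an observable result so the optimizer cannot remove it.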
| This is just the
| feasibility study stage.
It would seem to me that if these 100 lines are the "critical" lines to your
process, then taking the time to convert them would be paramount for
determining the feasibility of the project.
IMHO I would not use some generic C# benchmark to decide if the project is a
go or no go. I would prototype (translate) the critical code & time that.
| I might be forced to digress to unmanaged code.
I would only digress to unmanaged code, once C#, unsafe C#, Managed C++ &
hand tweaked IL all proved (*proved*) to have performance issues.
--
Hope this helps
Jay [MVP - Outlook]
.NET Application Architect, Enthusiast, & Evangelist
T.S. Bradley - http://www.tsbradley.net
"Peter Olcott" <olc...@att.net> wrote in message
news:NSDsf.38092$QW2.25345@dukeread08...
|
| "Willy Denoyette [MVP]" <willy.d...@telenet.be> wrote in message
| news:%23mzM%23V$CGHA...@TK2MSFTNGP09.phx.gbl...
| >
<<snip>>
This is hilarious; you are really making a fool of yourself (or should I
call you a troll?). Translating 100 lines of C code to C# takes 5 minutes of
your time, hand optimizing the IL another 10 minutes; this is far less than
the time you wasted in this thread.
I said in other replies and in the other thread you started that the 450%
benchmark is clueless and broken; read my reply in the relevant thread for
more details.
Willy.
Now it is very clear that you are not looking for any solution, as you have
an argument for every suggestion, and moreover your argument is not solution
bound but ego bound. It seems you are using this thread to feed your ego.
If ego is not the issue, and if you are so smart, why even bother coming to
this newsgroup and replying to every single post? If you were so worried about
your work and millions of $$$, you would have dropped this a long time
ago and tried to find other ways on your own, so we now all think you are
not worried about it.
It seems you are on an ego ride here, showing off your 16000 hours of
work and going after every solution. If you think none is working in your
case, then move on, dude. Believe me, no one will even care what happened to
you after that.
"Peter Olcott" <olc...@att.net> wrote in message
news:wpAsf.38054$QW2.18933@dukeread08...
"Peter Olcott" <olc...@att.net> wrote in message
news:7t%rf.37887$QW2.9998@dukeread08...
I can't decipher the CIL from what you have provided. When I decipher the
assembly language I embed the source in the assembly language.
>
>
>>
>> http://www.tommti-systems.de/go.html?http://www.tommti-systems.de/main-Dateien/reviews/languages/benchmarks.html
>> The above link is much more telling. There is a 450% difference in
>> performance between C++ and C# for something as simple as nested loops. Also
>> the difference between optimized code and code compiled with optimization
>> disabled can be at least an order of magnitude. If there is a 450% difference
>> in the performance on something as simple as a nested loop, this shows that
>> there is significant room for improvement.
>>
> This very specific (but broken [1] and clueless) benchmark (the loop) is a
> sample where the native C compiler's optimizer does a better job than the JIT
> compiler/optimizer, but this has nothing to do with the IL code.
What is wrong with the benchmark? Did you use these compiler options?
C++ compiler options: /Op /Oy /O2 /Oi /Og /Ot /EHsc /arch:SSE
C# compiler options: /optimize+ /debug- /checked-
>
> [1] This is the correct code which is still ~50% slower than the (corrected)
> C++ code ( __int64 x=0; ).
>
> int a = 0, b = 0, c = 0, d= 0, e = 0, f = 0;
> long x=0;
> startTime = DateTime.Now;
> for (a=0; a!=n; a++)
> for (b=0; b!=n; b++)
> for (c=0; c!=n; c++)
> for (d=0; d!=n; d++)
> for (e=0; e!=n; e++)
> for (f=0; f!=n; f++)
> x+=a+b+c+d+e+f;
>
> C# with /o+
> Nested Loop elapsed time: 10015 ms - 479232000000
>
> C++ with /O2 /EHsc
> Nested Loop elapsed time: 6171 ms 479232000000
Did you try managed C++ ?
>
> Willy.
>
>
The project has already been determined to be feasible. This stage is assessing
whether or not converting it to C# .NET is feasible.
>
> IMHO I would not use some generic C# benchmark to decide if the project is a
> go or no go. I would prototype (translate) the critical code & time that.
There is a learning curve involved in this that might hurt my chances of
ultimate success. This is the point in time where I decide to proceed down the
C# .NET path or remain on the native code C++ path. I can't afford to spend the
time learning C# and .NET, and then converting even this part of the project to
later find out that C# and .NET were not the way to go. At this stage I am
determining which of these two paths to take.
First I have to learn C# and .NET.
>
> I said in other replies and in the other thread you started that the 450%
> benchmark is clueless and broken, read my reply in the relevant thread for
> more details.
Yet you never bothered to say what is clueless and broken about it, thus it is
merely an unsupported assertion.
>
> Willy.
>
>
>Can you write code directly in the Common Intermediate Language? I need to
>optimize a critical real-time function.
>
Without a doubt, this is the longest thread I have ever seen. It
looks like a lightning bolt in the reader when the replies are
exposed ;o)
{snip}
>I have
>16,000 hours of development time in my current project.
Oh, Peter....
Since you didn't say "our" project I assume you spent 2000 8 hour
man-days (5.4794520547945205479452054794521 years) without taking a
day off on your project.
Would you mind telling us how many lines of code are in your 16,000
hour project? I'm sure one, such as yourself, who pays great
attention to detail will know the exact count.
I'd like to calculate how long it took you to write that 100 line
function.
I think you're full of BS, troll...
mable
Time will tell.
>
> mable
>
>
Credibility, like belief and disbelief, is a fallacious conception; each
causes conclusions to be drawn based on necessarily insufficient information.
>
> mable
>
>
Of course I would do it this way, yet I would not even proceed along the
learning curve path of .NET / C# if I did not at least have this last resort as
an option. There is great disagreement here whether or not this is an option.
What it seems to be is an option that few here are aware of.
http://vstte.inf.ethz.ch/pdfs/vstte-hoare-misra.pdf
This was written by Microsoft Research about verified code; apparently you must
be talking about something else. This will not be fully implemented for at least
twenty years.
> etc. I think you would yet have to go through the complete CLR instruction
> set in order to pick the best instructions to best optimise the code. By now
> you must have got an idea that in the .NET world (as opposed to C++) it's going
> to be extremely difficult to find people who occasionally hand code *.il
> files to achieve better performance.
>
> Ab.
>
> "Peter Olcott" <olc...@att.net> wrote in message
> news:QNosf.38020$QW2.8997@dukeread08...
>>
>
>
Google and Google Groups don't know about it either. I wonder what that means?
What I stated in the message, and you seem to have missed:
It would seem to me that if these 100 lines are the "critical" lines to your
process, then taking the time to convert them to C# would be paramount for
determining the feasibility of using C# for the project.
In other words I would build a C# prototype to determine if C# was going to
work or not! Likewise with managed C++ or IL.
| C# .NET path or remain on the native code C++ path. I can't afford to spend the
| time learning C# and .NET, and then converting even this part of the project to
| later find out that C# and .NET were not the way to go. At this stage I am
| determining which of these two paths to take.
You can use my web site below to contact me if you would like to hire my
services on converting the 100 lines...
--
Hope this helps
Jay [MVP - Outlook]
.NET Application Architect, Enthusiast, & Evangelist
T.S. Bradley - http://www.tsbradley.net
"Peter Olcott" <olc...@att.net> wrote in message
news:QUFsf.38109$QW2.10843@dukeread08...
It looks like C# will not be feasible, but Managed C++ might be feasible.
http://msdn.microsoft.com/msdnmag/issues/04/05/VisualC2005/
Microsoft has focused their attention on providing optimization to managed C++
because this was the quickest way to provide the best optimization.
Now that I have a copy of Visual Studio 2005, with its purported much better
optimizations, I could make this sort of test. My purpose here was to see which
learning curve path I would need to proceed with. .NET or COM. I can't afford
the time to proceed down both these paths. If I proceed down both these paths
and find that the current path is infeasible, I am out of time and can't
complete the project. For this reason I needed to know if it was feasible to
write code directly in CIL, as a backup plan just in case the compiler
optimizations are not good enough. If this is true, then I can take the time to
learn .NET so that I can convert this 100-line function to .NET, otherwise I
must find some other way to determine which of these two paths to take in
advance. I guess that another scenario could be a .NET component wrapper with
unmanaged code at its core. There would only need to be one transition to the
unmanaged code, and one transition back from the unmanaged code. Since .NET has
a much simpler component model than COM, this approach might also prove to work.
The latest version of Visual Studio has a managed version of the STL. This would
further reduce my learning curve.
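The wrapper idea described here can be sketched from the native side: the unmanaged core exposes one coarse entry point that runs the whole hot loop, so a managed caller crosses the boundary once per batch rather than once per item. Everything named below (`process_one`, `process_batch`, the squaring stand-in) is hypothetical; the real 100-line function is not shown in the thread, and the managed wrapper itself (P/Invoke or Managed C++) is left out.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the time-critical routine; stays fully native.
static inline std::int64_t process_one(std::int32_t value) {
    return static_cast<std::int64_t>(value) * value;
}

// Single exported entry point with C linkage so a managed wrapper can bind
// to it by name. One call covers the whole batch: one transition into
// unmanaged code and one transition back.
extern "C" std::int64_t process_batch(const std::int32_t* items,
                                      std::size_t count) {
    std::int64_t total = 0;
    for (std::size_t i = 0; i < count; i++) {
        total += process_one(items[i]);
    }
    return total;
}
```

The design choice is simply to keep the millions of inner calls on the native side of the boundary, so whatever the transition costs, it is paid once per batch.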
You could create the bulk of your application in C#. You could write the
time critical part in Managed C++, you could write the UI in VB.NET. They
all get compiled into .NET Assemblies (an EXE or DLL that contains IL code).
The JIT compiler doesn't really care what languages created the assemblies.
--
Hope this helps
Jay [MVP - Outlook]
.NET Application Architect, Enthusiast, & Evangelist
T.S. Bradley - http://www.tsbradley.net
"Peter Olcott" <olc...@att.net> wrote in message
news:5ivuf.40241$QW2.12685@dukeread08...
<<snip>>
I just hope that the 2005 compiler fulfills its optimization claims.