
GC Latency - system freezes


Asher F.

Aug 10, 2009, 5:06:01 AM
Our system is sensitive to timing jitter: delays of 50-100 milliseconds
should be avoided (an occasional delay is acceptable, but not one every 5
seconds).
We observe the following behavior:
The allocation rate is about 500 KB/second (it will be optimized later, but
it won’t drop to 0).
A gen 0 GC (GC0) happens every 5-10 seconds. Usually, when GC0 happens, our
system suffers timing delays of 100-200 ms.
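
One way to check whether a late tick actually coincides with a gen 0 collection is to sample GC.CollectionCount(0) next to the timing measurement. The following is only a minimal sketch under assumptions: the class name Gen0Correlation, the 100-tick bound, and the 50 KB allocation per tick are illustrative and not part of our application.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Illustrative sketch: reports late ticks together with the number of
// gen 0 collections that happened since the previous tick.
class Gen0Correlation
{
    static void Main()
    {
        int lastGen0 = GC.CollectionCount(0);
        Stopwatch sw = Stopwatch.StartNew();

        // Bounded loop so the sketch terminates (assumption: 100 ticks).
        for (int tick = 0; tick < 100; tick++)
        {
            Thread.Sleep(20);

            // Some allocation pressure (assumption: 50 KB per tick).
            byte[] block = new byte[50 * 1024];
            GC.KeepAlive(block);

            double elapsed = sw.Elapsed.TotalMilliseconds;
            sw.Reset();
            sw.Start();

            int gen0 = GC.CollectionCount(0);
            if (elapsed > 50)
            {
                Console.WriteLine(
                    "late tick: {0:F1} ms, gen 0 collections since last tick: {1}",
                    elapsed, gen0 - lastGen0);
            }
            lastGen0 = gen0;
        }
    }
}
```

If late ticks are reported with a count of 0, the delay in that interval has some other cause than a gen 0 collection.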

We reproduced the problem in the simplest possible scenario (see below).

Our question:
Why does the code in our example, which frequently calls APIs, calls methods
and returns from methods (and is 100% managed, without CERs etc.), prevent
the GC from running?
Our system has long-running threads doing computations. They can run
for 100-300 ms, sometimes more (these are the same worker threads doing
lengthy jobs; no new threads are created).
Other parts of the system are rather time-sensitive.
Does this mean that lengthy calculations can disrupt their timing?

What can we do (other than inserting “cooperative calls”) to prevent long
waits due to GC?


P.S.
We found a description of something similar here:
http://blogs.microsoft.co.il/blogs/sasha/archive/2009/07/31/garbage-collection-thread-suspension-delay-250ms-multiples.aspx
(a busy-wait loop prevents the GC from starting; the GC waits 250 ms or more
and then suspends the threads).
We modified the test code from that post to be more like our application:
One thread periodically allocates memory (blocks of 50 KB) and sleeps (to
trigger GC; we don’t want to induce GC manually, so the allocations create
some pressure and the sleeps leave the CPU for other tasks).
A second thread (the churner) makes recursive calls and calls CLR API methods
(it simulates a busy wait, but with continuous hopping between methods, to
give the GC a chance to inject thread-stopping instructions).
A Timer handler delegate is set to fire every 20 ms and prints a message
when it runs more than 50 ms after the previous invocation.

If the churner does not contain “GC cooperation code” (GC.KeepAlive was
suggested by the original post), we see CPU utilization on only one CPU and
very frequent timing delays in multiples of 250 ms (again, as described by
the original post).
If the churner “cooperates”, everything works OK: both cores are utilized
(one by the GC/timers/memory allocator, the other by the churner).

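using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Threading;

class Program
{
    //static volatile int i;
    private static byte[] tmp;
    private static Stopwatch sw;
    private static Random rnd = new Random();

    // Held in a static field so the timer stays referenced and is not
    // garbage collected while the program runs.
    private static Timer sched;

    static void Main(string[] args)
    {
        // Allocating thread: creates GC pressure.
        Thread t1 = new Thread(memAllocs);
        t1.Priority = ThreadPriority.Lowest;
        t1.Start();

        // Churner thread: busy loop of method calls.
        Thread churner1 = new Thread(doBusyLoop);
        churner1.Priority = ThreadPriority.Lowest;
        churner1.Start();

        // Timer fires every 20 ms; onTimer reports late invocations.
        sw = Stopwatch.StartNew();
        sched = new Timer(new TimerCallback(onTimer), null, 20, 20);

        while (true)
        {
            Thread.Sleep(1000);
        }
    }

    private static void doBusyLoop()
    {
        busyLoop();
    }

    private static void onTimer(object state)
    {
        // Time since the previous timer callback.
        double elapsed = sw.Elapsed.TotalMilliseconds;
        if (elapsed > 50)
            Console.WriteLine("!!!! " + elapsed);

        sw = Stopwatch.StartNew();
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void dummyMethod(int val)
    {
        // to prevent code elimination
        int n = rnd.Next(100);
        if (n > 200)
            Console.WriteLine("wow");

        if (val > 0)
            dummyMethod(--val);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void busyLoop()
    {
        object o = new object();
        for (int i = 0; ; ++i)
        {
            // Uncomment to make the churner "cooperate" with the GC.
            //GC.KeepAlive(o);
            dummyMethod(100);
        }
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void memAllocs()
    {
        int i = 0;
        while (true)
        {
            i++;
            tmp = new byte[1024 * 50];   // 50 KB block
            // i is never reset, so this sleep is taken only when i reaches
            // 100 (and again only after the counter wraps around).
            if (i == 100)
                Thread.Sleep(1000);
        }
    }
}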
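
For comparison, the "cooperating" variant of the churner described above differs only in the loop body. This is a minimal sketch, not the exact code we ran: the class name CooperatingChurner and the iteration bound are assumptions added so that the example terminates, and dummyMethod is copied from the program above.

```csharp
using System;
using System.Runtime.CompilerServices;

// Illustrative sketch only: CooperatingChurner and the iteration bound are
// assumptions; the loop in the original program is infinite.
class CooperatingChurner
{
    private static readonly Random rnd = new Random();

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void dummyMethod(int val)
    {
        // Same shape as the original: work the JIT cannot eliminate.
        int n = rnd.Next(100);
        if (n > 200)
            Console.WriteLine("wow");

        if (val > 0)
            dummyMethod(--val);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void busyLoop(int iterations)
    {
        object o = new object();
        for (int i = 0; i < iterations; ++i)
        {
            GC.KeepAlive(o);   // the "cooperation" call suggested by the linked post
            dummyMethod(100);
        }
    }

    static void Main()
    {
        busyLoop(100000);
        Console.WriteLine("done");
    }
}
```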

--
Asher.

Hongye Sun [MSFT]

Aug 14, 2009, 6:19:42 AM
Hi Asher,

Thanks for your post.

I tested the code and can reproduce the issue. However, I was unable to
reproduce it when I built with optimizations enabled. Can you check this by
opening the project properties and, on the Build tab, checking the "Optimize
code" checkbox?

I also found that Thread.Sleep() works the same as GC.KeepAlive().

If optimizing the code is not a solution for you, I recommend that you
send me the calculation code so that I can check it. You can send it to my
email hon...@online.microsoft.com, please remove 'online'.

Here is my analysis based on SSCLI, although I failed to find any
information that helps in this case:
GC safe points are determined by the JIT compiler. Since the latest JIT
compiler implementation is not public, I can only take SSCLI 2.0 as a
reference for this issue. After looking into the JIT compiler source code,
I found a method called JIT_PollGC, which is responsible for setting GC safe
points.

From the code, the JIT compiler injects a JIT_PollGC method call at every
backward jump, which means it should inject the call in every loop. But
after looking at the disassembly in Visual Studio, I was unable to find
such injected code. I am not sure what is wrong.

Sincerely,
Hongye Sun (hon...@online.microsoft.com, remove 'online.')
Microsoft Online Community Support

Delighting our customers is our #1 priority. We welcome your comments and
suggestions about how we can improve the support we provide to you. Please
feel free to let my manager know what you think of the level of service
provided. You can send feedback directly to my manager at:
msd...@microsoft.com.

This posting is provided "AS IS" with no warranties, and confers no rights

