Dalvik spends 7.4% of its time garbage collecting => Android UI spends 7.4% of its time unresponsive

458 views
Skip to first unread message

luke

unread,
Jan 8, 2009, 3:25:22 AM1/8/09
to android-platform
With the phone sitting *idle*, Dalvik performs a GC every 3.5sec on
average. GCs usually take ~260ms on the G1. That means that 7.4% of
the time Dalvik is garbage collecting (albeit for only a quarter of a
second each time), and none of the layers of the Android stack above
Dalvik can do anything. In other words, from a user interaction
perspective, *7.4% of the time* the UI is completely unresponsive, and
this percentage increases once your UI starts allocating and dropping
objects during user interaction.

This results in all sorts of unpleasantness that goes deeper than
overall jerkiness of the UI:

http://code.google.com/p/android/issues/detail?id=1742
http://code.google.com/p/android/issues/detail?id=1740

How much of the low-hanging fruit for optimizing Dalvik garbage
collection have already been picked? Are there plans to add multi-
stage object obsoletion and recycling to Dalvik (like in Sun's JVM
1.6+)?

At the very least, what needs changing to add proper event queueing
below the VM so that user interaction events are not lost? (Where
exactly are they being lost? Maybe the Java stack is just dropping
excess events, and only keeping the stats from the most recent event?)

luke

unread,
Jan 8, 2009, 3:52:21 AM1/8/09
to android-platform
Ah, I finally found (right after posting this, and after wrestling
with this for days) that the event history is in fact all preserved --
it's stored in the "event history" accessors of ACTION_MOVE events.
(ACTION_UP / ACTION_DOWN don't have a history.) This actually means
that for my particular application I can "play catchup" once the GC is
finished, which is what I needed.

The other issues reported above and in the linked bug reports are
still a problem, though I am relieved that I am not losing as many
events as I thought I was...


On Jan 8, 3:25 am, luke <luke.hu...@gmail.com> wrote:
> With the phone sitting *idle*, Dalvik performs a GC every 3.5sec on
> average.  GCs usually take ~260ms on the G1.  That means that 7.4% of
> the time Dalvik is garbage collecting (albeit for only a quarter of a
> second each time), and none of the layers of the Android stack above
> Dalvik can do anything.  In other words, from a user interaction
> perspective, *7.4% of the time* the UI is completely unresponsive, and
> this percentage increases once your UI starts allocating and dropping
> objects during user interaction.
>
> This results in all sorts of unpleasantness that goes deeper than
> overall jerkiness of the UI:
>
> http://code.google.com/p/android/issues/detail?id=1742http://code.google.com/p/android/issues/detail?id=1740

luke

unread,
Jan 8, 2009, 6:53:03 AM1/8/09
to android-platform
I spoke too soon again. The event history makes the event stream
smoother, but there is still a significant halt in event firing or
possibly loss of events during a GC, and there are definitely
ACTION_DOWN and ACTION_UP events that are being lost. Curiously if
you lose an ACTION_DOWN event during GC then you can keep moving your
finger around the screen as long as you want, and no further touch
events are generated until you lift and replace your finger.

On Jan 8, 3:52 am, luke <luke.hu...@gmail.com> wrote:
> Ah, I finally found (right after posting this, and after wrestling
> with this for days) that the event history is in fact all preserved --
> it's stored in the "event history" accessors of ACTION_MOVE events.
> (ACTION_UP / ACTION_DOWN don't have a history.)  This actually means
> that for my particular application I can "play catchup" once the GC is
> finished, which is what I needed.
>
> The other issues reported above and in the linked bug reports are
> still a problem, though I am relieved that I am not losing as many
> events as I thought I was...
>
> On Jan 8, 3:25 am, luke <luke.hu...@gmail.com> wrote:
>
> > With the phone sitting *idle*, Dalvik performs a GC every 3.5sec on
> > average.  GCs usually take ~260ms on the G1.  That means that 7.4% of
> > the time Dalvik is garbage collecting (albeit for only a quarter of a
> > second each time), and none of the layers of the Android stack above
> > Dalvik can do anything.  In other words, from a user interaction
> > perspective, *7.4% of the time* the UI is completely unresponsive, and
> > this percentage increases once your UI starts allocating and dropping
> > objects during user interaction.
>
> > This results in all sorts of unpleasantness that goes deeper than
> > overall jerkiness of the UI:
>
> >http://code.google.com/p/android/issues/detail?id=1742http://code.goo...

Dan Bornstein

unread,
Jan 8, 2009, 12:06:15 PM1/8/09
to android-...@googlegroups.com
On Thu, Jan 8, 2009 at 12:25 AM, luke <luke....@gmail.com> wrote:
> With the phone sitting *idle*, Dalvik performs a GC every 3.5sec on
> average.

I don't observe this on my phone. Are you sure you don't have an
application/service running constantly in the background, and that the
foreground application is in fact truly idle (no free-running
threads)?

Every gc comes from a particular process, and the process id is noted
in the log. For example:

D/dalvikvm( 1188): GC freed 27 objects / 1240 bytes in 92ms

This is a gc in process 1188. Running "ps" can tell you what that
process actually is:

$ adb shell ps | grep 1188
app_9 1188 31 92368 11480 ffffffff afe0c824 S android.process.media

It turns out to be the system media service in this case.

If you do this on your device and determine that the unusual gc
activity you see is coming from a system process of some sort, please
go ahead and file a bug for it. Include specific and explicit steps to
reproduce the problem, since without that there's practically no way
we can track down what's gone wrong.

Thanks, and happy new year!

-dan

Dianne Hackborn

unread,
Jan 8, 2009, 5:24:08 PM1/8/09
to android-...@googlegroups.com
Fwiw, GCs definitely do NOT cause events to be lost.  The window manager will wait for the application to finish processing an event before delivering the next one (and has a queue of pending events that are waiting to be delivered).  It does have a timeout for apps to process events, but that is currently 5 seconds, way longer than the GC time, and if that timeout does expire it will be very obvious due to the resulting Application Not Responding dialog.

Also GCs in other processes will not impact the running of your process (except in some ways the system process, but that GCs pretty rarely), so if GCs are impacting the performance of your app it is almost certainly due to something it is doing.  In general my first guess would be that your event handling or drawing code is creating a lot of temporary objects; that is a common issue we have had in our own apps that performance optimization work has targeted.
--
Dianne Hackborn
Android framework engineer
hac...@android.com

Note: please don't send private questions to me, as I don't have time to provide private support.  All such questions should be posted on public forums, where I and others can see and answer them.

luke

unread,
Jan 8, 2009, 8:34:33 PM1/8/09
to android-platform
Nope, I create almost no temporary objects in my code (intentionally
to avoid this problem). I preallocate (almost) everything, with a
couple of small exceptions that I know about -- however the GCs happen
even when my code is sitting in the idle loop and not receiving any
events.

I followed Dan's advice and it looks like the guilty process that
causes all the GCs is system_server. Based on the huge range of
messages that come with the same pid, I'm guessing that is a message
queue, and is probably running some Runnable that has been passed to
it by various apps? How can I get more info on what it is doing, so
that I know what the guilty app is? The phone gets slower and slower
over the course of a day or so. If I don't reboot daily then within a
couple of days it is almost unusable, taking 10-30 seconds to open an
app, 5-10 seconds to process a keypress, etc. There are tons of
reports on the web of people having the same problem, and the advice
is always: just reboot your phone every day. I note that there is
watchdog code in Android that is supposed to reboot the phone in the
early hours of the morning if it hasn't been rebooted in a while, but
this code is disabled in the release. I tried to write an app to
schedule a regular reboot, but even using introspection to call the
native reboot method that is used by the stack did not work (I assume
for a security reason).

As far as events never being lost -- maybe this is not the problem,
but the state machine in the window manager that handles events has
some incorrect logic in it? The example that seems to indicate that
there might be a problem is as follows. My app is pretty simple, it
lets the user draw multiple lines or shapes on the screen, and also
tracks ACTION_UP/ACTION_DOWN so it can keep the line segments
separate. If you keep drawing lines on the screen until you miss an
ACTION_DOWN event, then (with your finger still on the screen) set a
breakpoint at the start of onTouchEvent(), then keep moving your
finger on the screen, you don't actually get any more events fired
until you lift your finger off the screen and then put it back down
again, at which point the code stops at the breakpoint. This is not a
bug with my code, because my code's onTouchEvent method never even
gets called, and it seems that something somewhere got into a weird
state. If you miss the ACTION_UP event, then you don't get another
ACTION_UP until you finish drawing the next line -- the method is
never called with that event. It happens in about 5% of cases right
now (somewhat hard to trigger).
> hack...@android.com

Dianne Hackborn

unread,
Jan 8, 2009, 9:38:14 PM1/8/09
to android-...@googlegroups.com
On Thu, Jan 8, 2009 at 5:34 PM, luke <luke....@gmail.com> wrote:
I followed Dan's advice and it looks like the guilty process that
causes all the GCs is system_server.  Based on the huge range of
messages that come with the same pid, I'm guessing that is a message
queue, and is probably running some Runnable that has been passed to
it by various apps?  How can I get more info on what it is doing, so
that I know what the guilty app is?  The phone gets slower and slower
over the course of a day or so.  If I don't reboot daily then within a
couple of days it is almost unusable, taking 10-30 seconds to open an
app, 5-10 seconds to process a keypress, etc.  There are tons of
reports on the web of people having the same problem, and the advice
is always: just reboot your phone every day.

Are you sure there isn't some third party app on your phone that is instigating this?  On a regular G1 with the normal apps I install, I have never seen such I think.  There may be an issue with location updates if an app repeatedly registers for them and is then killed, so you might want to try removing apps that use GPS.
 
 I note that there is
watchdog code in Android that is supposed to reboot the phone in the
early hours of the morning if it hasn't been rebooted in a while, but
this code is disabled in the release.

Correct, we have never found this to be needed for a G1 running the standard apps.  We had this facility in case we found bad problems with the system, but have not needed to use it.
 
I tried to write an app to
schedule a regular reboot, but even using introspection to call the
native reboot method that is used by the stack did not work (I assume
for a security reason).

Yep apps can't reboot the phone.
 
As far as events never being lost -- maybe this is not the problem,
but the state machine in the window manager that handles events has
some incorrect logic in it?  The example that seems to indicate that
there might be a problem is as follows.  My app is pretty simple, it
lets the user draw multiple lines or shapes on the screen, and also
tracks ACTION_UP/ACTION_DOWN so it can keep the line segments
separate.  If you keep drawing lines on the screen until you miss an
ACTION_DOWN event, then (with your finger still on the screen) set a
breakpoint at the start of onTouchEvent(), then keep moving your
finger on the screen, you don't actually get any more events fired
until you lift your finger off the screen and then put it back down
again, at which point the code stops at the breakpoint.  This is not a
bug with my code, because my code's onTouchEvent method never even
gets called, and it seems that something somewhere got into a weird
state.  If you miss the ACTION_UP event, then you don't get another
ACTION_UP until you finish drawing the next line -- the method is
never called with that event.  It happens in about 5% of cases right
now (somewhat hard to trigger).

Sorry, I don't really understand the flow you are talking about here.  I am not aware of problems in the framework like this, but there certainly could be.  It's not something I have ever seen happen elsewhere.

The only thing I can think of is that the target for touch events is determined at the down event, so whoever is hit by the down is the only one that will receive the following events until after the corresponding up.  The only exception to this is that a parent view can steal a gesture from its child view.

--
Dianne Hackborn
Android framework engineer
hac...@android.com

fadden

unread,
Jan 9, 2009, 12:33:10 AM1/9/09
to android-platform
On Jan 8, 5:34 pm, luke <luke.hu...@gmail.com> wrote:
> I followed Dan's advice and it looks like the guilty process that
> causes all the GCs is system_server.

This usually only happens when something is driving system_server to
distraction.

See http://groups.google.com/group/android-discuss/browse_thread/thread/fa79b405ecb5fb51/dd459d306003915a#dd459d306003915a
for some advice on how to track these problems down.

luke

unread,
Jan 9, 2009, 1:11:42 AM1/9/09
to android-platform
> Sorry, I don't really understand the flow you are talking about here.  I am
> not aware of problems in the framework like this, but there certainly could
> be.  It's not something I have ever seen happen elsewhere.
>
> The only thing I can think of is that the target for touch events is
> determined at the down event, so whoever is hit by the down is the only one
> that will receive the following events until after the corresponding up.
> The only exception to this is that a parent view can steal a gesture from
> its child view.

That would explain the behavior. I just have a simple LinearLayout
with a TextView at the top and a custom canvas View at the bottom. If
the ACTION_DOWN somehow starts sending the motion events to a
different view in the hierarchy, which one would it be? I would
imagine that LinearLayout doesn't normally catch motion events... Is
there a way to debug where events are going?

luke

unread,
Jan 9, 2009, 1:15:19 AM1/9/09
to android-platform
On Jan 9, 12:33 am, fadden <fad...@android.com> wrote:
> > I followed Dan's advice and it looks like the guilty process that
> > causes all the GCs is system_server.
>
> This usually only happens when something is driving system_server to
> distraction.
>
> Seehttp://groups.google.com/group/android-discuss/browse_thread/thread/f...
> for some advice on how to track these problems down.

This fixed it! The Locale app was the culprit causing all the GCs.

I was still losing events during GCs, but only half as much once I
started using the motion history in motion events. According to
Dianne, these events should never be lost. Is it possible that if
system_server is really being taxed, then the CPU is pegged so hard
that even the responsiveness of lower-level layers of the platform is
impacted?

Dianne Hackborn

unread,
Jan 9, 2009, 2:42:11 AM1/9/09
to android-...@googlegroups.com
On Thu, Jan 8, 2009 at 10:15 PM, luke <luke....@gmail.com> wrote:
I was still losing events during GCs, but only half as much once I
started using the motion history in motion events.  According to
Dianne, these events should never be lost.  Is it possible that if
system_server is really being taxed, then the CPU is pegged so hard
that even the responsiveness of lower-level layers of the platform is
impacted?

I'm not sure how big the event queue is in the driver, but I suppose if the very high-priority thread that pulls events from the driver was not able to run then something could be lost.  That is not something I have seen before though.


--
Dianne Hackborn
Android framework engineer
hac...@android.com

Dianne Hackborn

unread,
Jan 9, 2009, 2:44:31 AM1/9/09
to android-...@googlegroups.com
On Thu, Jan 8, 2009 at 10:11 PM, luke <luke....@gmail.com> wrote:
That would explain the behavior.  I just have a simple LinearLayout
with a TextView at the top and a custom canvas View at the bottom.  If
the ACTION_DOWN somehow starts sending the motion events to a
different view in the hierarchy, which one would it be?  I would
imagine that LinearLayout doesn't normally catch motion events...  Is
there a way to debug where events are going?

The view hierarchy dispatches down to the leaf-view under the down location, tries to deliver the event to that view by calling onPointerEvent() and if that function returns false it calls the same function on the parent and on up the hierarchy until one returns true or it reaches the top.  If it reaches the top, there is no target so the remaining events in that gesture will be ignored.

raghu

unread,
Jun 14, 2012, 7:54:29 AM6/14/12
to android-...@googlegroups.com
Hi,

Some times touch screen is unresponsive (If device not used for more than day and touch the screen). If I restart the device everything is fine.
Please help me to trace this issue.

When touch screen become unresponsive, sometimes when button on screen is clicked a button beside will be clicked. That is, clicked event is received by another button which is not clicked. If I restart the device everything this issue does not come.

And it cannot be reproduced easily.

Please help.

Regards,
Raghu N
Reply all
Reply to author
Forward
0 new messages