Let's optimize KeyInputQueue/InputDevice.

Robert Green

Apr 19, 2010, 2:06:23 AM
to android-platform
Hey everyone,

It's pretty obvious from all of the developer feedback and my own
personal testing that the touch input processing code in KeyInputQueue
and InputDevice is very inefficient and is currently making it nearly
impossible to put out a high quality game with virtual joystick
controls. The issue is that currently the input service consumes up
to 36% of the CPU on an MSM7200-based device and up to 30% of the CPU
on a Cortex A8. On the 7200s, this usually cripples the game, cutting
framerates in half and causing inconsistent updates.

I believe that it is absolutely paramount to the future of games on
this platform that this code be looked over and optimized where
possible. Assuming that faster processors will gloss over the problem
is wasteful and doesn't address the millions of devices currently
affected by the issue.

I'd be happy to jump in and start working on this, but I don't want to
spend a week or two working on this problem without getting some solid
backing this time from the Android team. Last time I made a code
contribution, I received encouragement all the way to the top and then
the code got nixed. That was a good week wasted.

This problem directly affects my company's newest games and while I'm
a little upset that it hasn't been resolved in the last year, I'd be
happy knowing that a fix has been committed and will be rolled out in
future versions.

Please let me know who I need to get in touch with to start work on
this.

Thank you.

--
You received this message because you are subscribed to the Google Groups "android-platform" group.
To post to this group, send email to android-...@googlegroups.com.
To unsubscribe from this group, send email to android-platfo...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/android-platform?hl=en.

Robert Green

Apr 19, 2010, 3:28:10 AM
to android-platform

Dianne Hackborn

Apr 19, 2010, 6:19:02 PM
to android-...@googlegroups.com
I would suggest not spending much time on this.

Just saying.
--
Dianne Hackborn
Android framework engineer
hac...@android.com

Note: please don't send private questions to me, as I don't have time to provide private support, and so won't reply to such e-mails.  All such questions should be posted on public forums, where I and others can see and answer them.

Robert Green

Apr 19, 2010, 7:05:37 PM
to android-platform
Does the Android team find the performance of that service acceptable
or is the idea that eventually phones will be so fast that you won't
notice it?

I'm just really astounded that when touching my N1's screen, a full
200 MHz or more is being used to process and pass a few pairs of x,y
coordinates up to the application layer. As a comparison, you can
decode several streams of MP3 real-time cheaper than that.

Armando Ceniceros

Apr 19, 2010, 7:11:10 PM
to android-...@googlegroups.com
I'd toss a guess that it's more an issue of planned obsolescence, as evidenced by Dream-type (and now, apparently, Nexus One type) hardware. Rather than improve the code to be more efficient per clock cycle, the ball is tossed to the manufacturers to create faster hardware to alleviate the software woes, and manufacturers looking to sell as many devices as possible will of course happily oblige.

Christopher Tate

Apr 19, 2010, 7:13:27 PM
to android-...@googlegroups.com
On Mon, Apr 19, 2010 at 4:05 PM, Robert Green <rbgr...@gmail.com> wrote:
> Does the Android team find the performance of that service acceptable
> or is the idea that eventually phones will be so fast that you won't
> notice it?

Neither.

> I'm just really astounded that when touching my N1's screen, a full
> 200 MHz or more is being used to process and pass a few pairs of x,y
> coordinates up to the application layer.  As a comparison, you can
> decode several streams of MP3 real-time cheaper than that.

To be fair, there's a great deal more processing going on than
"passing a few pairs of x,y coordinates up." That said, yes, we're
well aware that the current state of things is far from ideal. On a
Nexus One, the latency from the kernel driver popping a motion event
out to your app receiving it is typically on the order of 6 or 7 ms.
That matches what you're seeing with 30% of processing time going to
input-dispatch overhead. FWIW, I would never claim that this is a
fine state of affairs.

Out of curiosity, how much CPU does an iPhone 3GS use for input
dispatch while handling motion events? Obviously the model is quite
different but I've heard assertions that it can also be on the order
of 30%, and I find that hard to believe.

--
chris tate
android framework engineer

Robert Green

Apr 19, 2010, 7:39:35 PM
to android-platform
I understand that there is more going on than simple passing, but I
was just trying to make a point to show that it's a little absurd any
way you slice it. The bigger issue is not delivery time but the fact
that it's actually impossible to deliver a quality game using modern
virtual joystick or any constant-touch control system to the MSM7200
phones, which are 2/3 of the Android install base. My team is
finishing two games right now and when we started, we wanted to go
next-gen but decided that since 2.1 was expected to be released for
most of those phones, it should contain fixes to the input issues that
were known from all the way back to 1.1. Unfortunately it doesn't,
and so despite the optimizations we've made to ensure a solid 30FPS of
quake3-class graphics (lightmapped animated 3D environment shooter) on
a G1, in an exclusive title optimized as much as possible for Android,
2/3 of the devices can't support it because the moment you touch that
screen, the game drops to 15FPS, making the game unplayable.
Keyboard controls work fine but many phones don't have keyboards.

It's just very frustrating because despite everything we've done to
bring such a title to Android, the speed bump lies here and the
attitude seems to be to let it lie because things are what they are.
Sure, we can just set the min gl version to 2 which will effectively
make it so that only high-end devices can get the game but we wanted a
really good game for everyone. We're not the only ones in this boat.
I'm sure there are many games hung up by this problem alone.

I think if Google is as serious as they say about gaming on Android,
they'll take this issue a little more seriously. This is the kind of
thing that will hinder the platform for another 2 years (until
manufacturers stop making MSM7200-based phones, if that ever happens).

When people ask, "where are all the great games?" and I say, "We can't
deliver them unless you have one of two really high-end phones," it's
just not very satisfying. I really don't like the fact that the phone
hardware _can_ run it but the platform has a problem that inhibits
it. Currently the only games that are universally playable across all
Android phones are games with tilt controls. That's why so many
racing games are popular. It's actually impossible to deliver better
controls to those first gen systems without dealing with harsh
slowdown.

More advanced control systems will be possible on all devices if this
is worked on.

Christopher Tate

Apr 19, 2010, 7:57:34 PM
to android-...@googlegroups.com
On Mon, Apr 19, 2010 at 4:39 PM, Robert Green <rbgr...@gmail.com> wrote:
> ...
> and so despite the optimizations we've made to ensure a solid 30FPS of
> quake3-class graphics (lightmapped animated 3D environment shooter) on
> a G1, in an exclusive title optimized as much as possible for Android,
> 2/3 of the devices can't support it because the moment you touch that
> screen, the game drops to 15FPS, making the game unplayable.
> Keyboard controls work fine but many phones don't have keyboards.

First: in no way am I trying to undermine your points, which are valid
and I hear you.

That said, it sounds like you might be running into a structural issue
with input handling in your applications. Android will spit events at
you as fast as you ask for them. 30 fps means you're spending around
33 ms per frame total without touch input processing, right? Is that
your peak running flat out, or do you have any headroom? If you have
even six or seven milliseconds of headroom per frame, then it should
be quite doable to get one input event per frame and still hit 30 fps
solid.

Dropping down to 15 fps with input handling suggests that you're
winding up handling *another* 33-ish ms per frame worth of input
processing. OS overhead there is only on the order of 6-7 ms per
motion event, as I said, so that suggests to me that you're handling
multiple input events per frame. For motion events, the OS is
accumulating the point track in a lighter-weight manner internally and
dispatching the whole track delta since the last motion event; that
eliminates a lot of transit overhead and latency. If you aren't
already, try limiting your input rate to the frame rate and see what
that buys you.

(If you already do that, then I apologize for restating what you
already know ... but then I'm *very* surprised that you're losing 30+
ms per frame to input dispatch; that doesn't make sense.)
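
One way to limit the input rate to the frame rate on the application side is to keep only the newest touch position and let the game loop pick it up once per frame. The following is a minimal sketch in plain Java, not Android framework code: the class name `LatestTouch`, its nested `Sample` type, and the methods `post` and `takeForFrame` are all hypothetical. It only bounds the work the app itself does per frame; it cannot reduce whatever the system spends before an event reaches the app.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: keeps only the most recent touch sample so the game
// loop consumes at most one per frame, however fast the producer posts them.
final class LatestTouch {
    // Immutable x,y pair; the name and fields are assumptions for illustration.
    static final class Sample {
        final float x;
        final float y;

        Sample(float x, float y) {
            this.x = x;
            this.y = y;
        }
    }

    private final AtomicReference<Sample> latest = new AtomicReference<Sample>();

    // Called from the input callback: overwrite whatever has not been consumed yet.
    void post(float x, float y) {
        latest.set(new Sample(x, y));
    }

    // Called once per frame from the game loop: returns the newest sample,
    // or null if nothing arrived since the previous frame.
    Sample takeForFrame() {
        return latest.getAndSet(null);
    }
}
```

The sketch allocates one small object per posted sample; a game that is sensitive to GC pauses would probably want to reuse a pair of preallocated buffers instead.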

--
chris tate
android framework engineer

Robert Green

Apr 19, 2010, 8:25:31 PM
to android-platform
Chris,

My input processing takes less than a fraction of a millisecond. I
put the coordinates into a queue and return. The physics run the same
no matter what the last-touched values are. I don't have any fancy
processing that would cause any sort of a performance hit from touch.
Evidence of that point is shown in that when you use key bindings, the
game runs fine.

The big kicker? If I simply sleep(16) (recommended to control event
dispatch flood) and return false in Activity.onTouchEvent(MotionEvent
evt) and do absolutely no processing of my own, the game still has the
massive frame rate drop. system_service still consumes 1/3 of the CPU
on my G1 - and that's completely out of my hands. This sort of report
is consistent across all game developers I've talked to who have
tested their games. You can actually download just about any 3D game
off the market and watch the framerate get cut in half on a G1 from
touching the screen on it. The more demanding the graphics, the more
dramatic the drop is, so you can imagine that my game is nearly
unplayable in that state. There's absolutely nothing a developer can
do to stop that because it's a result of system_service consuming the
CPU.

Check the bug I filed - I did this test on every phone I had. This is
consistent with OpenGL on the MSM7200. Touch-processing before it
ever reaches the application causes that drop - not the handling in
the app itself (though I'm sure it could if you were writing horribly
inefficient code).

This is not some first-time game dev thing for me. I've spent days
trying every possible thing there could be to work around this
problem. I've pulled other game developers in to ask for help finding
solutions and we all came to the same conclusion - There is nothing a
developer can do but pray to you Android gods for a service that
doesn't kill the framerate on touch.

As far as specifics go, I do run an update counter and display it on-
screen so I can always make sure my updates are as lean as possible.
My average update with vertex animation & collision response is 4ms on
a G1, so the vast majority of the round trip time is spent waiting for
the GPU to draw (which is the same chip as the CPU, probably why the
slowdown is so dramatic on that chip).

As a last ditch effort, I actually did limit it to the frame rate by
using Thread.wait(1000) and notifying it when my logic was ready to
pull some more events in and that had the exact same result. Like I
said earlier, doing absolutely 0 event processing with no sleep, 16ms
sleep and 32ms sleep all produce the exact same effect - terrible
framerate slowdown and system_service consuming over 30% of the cpu.

My update is not going up by 33ms, it remains at 4ms. What's
happening is the GPU is bogging down when the system_service goes up,
probably as a result of some architectural thing Qualcomm did with the
integrated CPU/GPU. This is why I keep saying that the problem is
_really_ apparent on an MSM7200 using OpenGL in a game with a heavy
scene. Put that way, it does seem like worst-case scenario but the
fact of the matter is that it is still 2/3rds of the market and I'd
really love to be able to deliver our new games to them!

I'll tell you what - if you want to talk about more details, feel free
to call me. I can even give you a demo and maybe some source for you
to look at so you can see that this is not an application code side
issue and that we game devs are truly powerless to fix it.

Thanks for listening. You're one of very few over there that has so
far!

Robert Green
Battery Powered Games, LLC
(612)234-2288

Robert Green

Apr 19, 2010, 8:33:45 PM
to android-platform

On Apr 19, 6:57 pm, Christopher Tate <ct...@google.com> wrote:
> processing.  OS overhead there is only on the order of 6-7 ms per
> motion event, as I said, so that suggests to me that you're handling

Sorry, I missed that point. Are you saying that the OS consumes 6-7ms
of 100% CPU time per event?

If so, at 30 events per second (which would be in-line with a real-
time game), the OS uses 6 * 30 = 240ms of CPU per second or 1/4 of the
entire CPU. At 60 events per second, it would use 480ms or 1/2 of the
entire CPU time, leaving half the CPU left (or half the CPU/GPU in the
case of the MSM7200) for the actual application.

Is that what you're saying?
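
The arithmetic behind this question can be checked with a small sketch. It assumes the 6-7 ms per-event figure is pure CPU time, which is exactly the point being asked about; the class and method names (`InputCpuBudget`, `cpuMsPerSecond`, `cpuFraction`) are made up for illustration.

```java
// Hypothetical helper, not part of Android: estimates what share of one CPU
// second goes to input dispatch if every event costs a fixed amount of CPU time.
final class InputCpuBudget {
    // Milliseconds of CPU consumed per second of wall time.
    static double cpuMsPerSecond(double msPerEvent, double eventsPerSecond) {
        return msPerEvent * eventsPerSecond;
    }

    // Same figure as a fraction of one core (1.0 means the whole CPU).
    static double cpuFraction(double msPerEvent, double eventsPerSecond) {
        return cpuMsPerSecond(msPerEvent, eventsPerSecond) / 1000.0;
    }

    public static void main(String[] args) {
        double[] costs = {6.0, 7.0};   // ms per event, from the figures quoted in the thread
        double[] rates = {30.0, 60.0}; // events per second
        for (double cost : costs) {
            for (double rate : rates) {
                System.out.printf("%.0f ms/event at %.0f events/s -> %.0f ms/s (%.0f%% of CPU)%n",
                        cost, rate, cpuMsPerSecond(cost, rate), 100.0 * cpuFraction(cost, rate));
            }
        }
    }
}
```

Taken at face value, 6-7 ms per event works out to 180-210 ms of CPU per second at 30 events per second and 360-420 ms at 60, so roughly 18% to 42% of the CPU. The 240 ms and 480 ms figures correspond to about 8 ms per event.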

Christopher Tate

Apr 19, 2010, 8:37:52 PM
to android-...@googlegroups.com
On Mon, Apr 19, 2010 at 5:33 PM, Robert Green <rbgr...@gmail.com> wrote:
>
> On Apr 19, 6:57 pm, Christopher Tate <ct...@google.com> wrote:
>> processing.  OS overhead there is only on the order of 6-7 ms per
>> motion event, as I said, so that suggests to me that you're handling
>
> Sorry, I missed that point.  Are you saying that the OS consumes 6-7ms
> of 100% CPU time per event?
>
> If so, at 30 events per second (which would be in-line with a real-
> time game), the OS uses 6 * 30 = 240ms of CPU per second or 1/4 of the
> entire CPU.  At 60 events per second, it would use 480ms or 1/2 of the
> entire CPU time, leaving half the CPU left (or half the CPU/GPU in the
> case of the MSM7200) for the actual application.
>
> Is that what you're saying?

Well, I'm working backwards from measuring that the input delivery
pipeline is about 6-7 ms long on average, and seeing from the source
code that the pipeline seems largely cpu bound. So I *think* that's a
reasonable conclusion, and it matches what you're seeing -- you're
heavily dependent on available CPU, and if you lose 6 ms *of CPU* per
input event, that's a killer.

--
chris tate
android framework engineer

Robert Green

Apr 19, 2010, 8:51:04 PM
to android-platform
You and I both know that these CPUs can do quite a bit of work in 6ms
of time. For instance, in its most readable and inefficient form, my
first attempt at 3D animation cost 300ms per update for 900 verts.
After a few hours of optimization, I got it down to 30ms. After
another round of optimization and a move into native code, I brought
it down to 3ms. After yet another round of loading optimizations
which put it into the most efficient state it could possibly be in, it
takes just 1ms on a G1. 300 times faster than I started and 3 times
faster than a _really_ optimized state. Those gains aren't always
available but my point is - perhaps there is something that can be
done to bring that 6ms delivery time down to something more reasonable
for a real-time environment, like 1-2ms?

I saw really obvious things that could be optimized in that code -
straight out of the Android optimization guidelines, in fact. I saw
loops where there were several calls to the same virtual method (15 cycles to
look up each time, right?) and the result was never cached in a local first.
That's just one example, but after working on some of the 3D game
engine work and AI I've done on Android, I've seen big gains by doing
those small things.
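
The kind of change described here, reading a loop-invariant value once into a local instead of calling a virtual method on every iteration, can be sketched as follows. This is an illustration only, not the KeyInputQueue code: the `Device` interface, `maxPressure()`, and both method names are hypothetical, and the transformation is only valid when the call returns the same value for the whole loop and has no side effects.

```java
// Hypothetical example of hoisting a loop-invariant virtual call.
final class HoistExample {
    // Stand-in for any object exposing a value through a virtual getter.
    interface Device {
        int maxPressure();
    }

    // Before: the getter is invoked once per element.
    static int sumScaledNaive(int[] raw, Device dev) {
        int sum = 0;
        for (int i = 0; i < raw.length; i++) {
            sum += raw[i] * 100 / dev.maxPressure();
        }
        return sum;
    }

    // After: the getter result and the array length are read once into locals.
    static int sumScaledHoisted(int[] raw, Device dev) {
        final int max = dev.maxPressure();
        final int n = raw.length;
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += raw[i] * 100 / max;
        }
        return sum;
    }
}
```

Both versions compute the same result; how much the second one saves depends on the VM and would need to be measured on the device in question.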

I'm glad you agree that the math works out to match my observations.
24-48% of the CPU to just deliver touch events really is too much.
There must be a way to bring that down. I'd be surprised if it
weren't possible to get it down to 2-5% of the CPU for the same
functionality. I'd actually be happy to jump in and work on it if
given permission. I am quite qualified for such things and do have an
interest in seeing this particular piece of Android become an order of
magnitude faster.

What are your thoughts on possibilities to optimize that code for less
CPU consumption?

Robert Green

Apr 20, 2010, 12:07:25 PM
to android-platform
Christopher,

I double checked one of my games to make sure that I wasn't allowing
too many events to come in and things looked correct. When I log the
update/draw/getInput, it's all properly synchronized so that I only
update after the world is drawn and only allow a motion event on
update, which is never more than 50 per second because I'm GPU-bound.

I was thinking more about this and I'm thinking that the CPU usage is
not actually in the delivery of the event but at some point before
that. I've looked through the code and see many spots I'd love to
start measuring time at but I don't currently have time to run custom
builds so I can't dive as deep into this as I'd like at the moment. I
have a feeling that the majority of that use is in the
"InputDeviceReader" thread, before the event is even put into the
queue that WindowManagerService uses for dispatch. I would really
love to time the block from the line after readEvent() to the end of
the enclosing while(). Obviously you can't throttle that because it
must process every event serially but I'm wondering how many events it
has to deal with (perhaps the driver sends an absurd number of weight/
width changes?) or there is a policy mixed in somewhere that's really
heavyweight that was overlooked... Those sorts of things.

Was that thread in 1.6 or was readEvent() called by the UI thread?
Anyway, I believe that's where all of the CPU is being eaten up and so
maybe it's not a matter of how big of a batch of historical touch data
you can get across to minimize transport time but instead analyzing
that thread to find out why it's so hungry.

Thoughts?

San Mehat

Apr 20, 2010, 12:17:14 PM
to android-...@googlegroups.com
On Tue, Apr 20, 2010 at 9:07 AM, Robert Green <rbgr...@gmail.com> wrote:
> Christopher,
>
> I double checked one of my games to make sure that I wasn't allowing
> too many events to come in and things looked correct.  When I log the
> update/draw/getInput, it's all properly synchronized so that I only
> update after the world is drawn and only allow a motion event on
> update, which is never more than 50 per second because I'm GPU-bound.
>
> I was thinking more about this and I'm thinking that the CPU usage is
> not actually in the delivery of the event but at some point before
> that.  I've looked through the code and see many spots I'd love to
> start measuring time at but I don't currently have time to run custom
> builds so I can't dive as deep into this as I'd like at the moment.  I
> have a feeling that the majority of that use is in the
> "InputDeviceReader" thread, before the event is even put into the
> queue that WindowManagerService uses for dispatch.  I would really
> love to time the block from the line after readEvent() to the end of
> the enclosing while().  Obviously you can't throttle that because it
> must process every event serially but I'm wondering how many events it
> has to deal with (perhaps the driver sends an absurd number of weight/
> width changes?) or there is a policy mixed in somewhere that's really
> heavyweight that was overlooked...  Those sorts of things.
>

It might be worthwhile to gather some baseline raw low-level numbers on
driver processing time as well; I know constant touches cause a lot of I2C
traffic from the touch-panel. I'll try to gather some data on this
later (currently swamped), and toss the numbers up here.
--
San Mehat  |  Staff Software Engineer  |  Android  |  Google Inc.
415.366.6172 (s...@google.com)

Tomei Ningen

Apr 20, 2010, 2:10:19 PM
to android-platform
I must say from a platform implementor's point of view, we find this
event handling performance bug really puts our devices at a
disadvantage when compared to other devices, such as the iPhone.

Even on the Nexus One, when you start the photo gallery app and drag a
photo around, the response is sooooooo sluggish compared to the
iPhone.

I suggest the Android team spend much time on this.

Marcelo

Apr 20, 2010, 2:48:46 PM
to android-...@googlegroups.com
On Tue, Apr 20, 2010 at 10:07, Robert Green <rbgr...@gmail.com> wrote:

> I double checked one of my games to make sure that I wasn't allowing
> too many events to come in and things looked correct.  When I log the
> update/draw/getInput, it's all properly synchronized so that I only
> update after the world is drawn and only allow a motion event on
> update, which is never more than 50 per second because I'm GPU-bound.

A shot in the dark, since I don't have hardware to test this: it might
be that the touchscreen driver is generating too many events. A while
ago when working on porting Android to a specific board, performance
was terrible everywhere; it turned out that the touchscreen driver was
generating ~1000 events per second. We dialed the sampling rate down
to something more reasonable, on the order of 100 Hz, and everything
became much more responsive.
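
The fix described here was made in the driver by lowering the sampling rate. As a rough software analogy only, the same effect on event volume can be sketched as a filter that passes a sample through only if enough time has elapsed since the last one it passed. The class `SampleDecimator` and its `accept` method are hypothetical, and a real filter would also have to let touch-down and touch-up transitions through unconditionally.

```java
// Hypothetical sketch: caps the rate of samples passed downstream by dropping
// any sample that arrives too soon after the previously accepted one.
final class SampleDecimator {
    private final long minIntervalNanos;
    private long lastAcceptedNanos;
    private boolean hasAccepted;

    // maxHz is the highest rate of accepted samples, e.g. 100 for 100 Hz.
    SampleDecimator(int maxHz) {
        this.minIntervalNanos = 1000000000L / maxHz;
    }

    // Returns true if the sample with this timestamp should be kept.
    boolean accept(long timestampNanos) {
        if (hasAccepted && timestampNanos - lastAcceptedNanos < minIntervalNanos) {
            return false;
        }
        hasAccepted = true;
        lastAcceptedNanos = timestampNanos;
        return true;
    }
}
```

Fed one second of evenly spaced samples at 1000 per second with `maxHz` set to 100, this keeps every tenth sample.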

Marcelo

San Mehat

Apr 20, 2010, 2:14:38 PM
to android-...@googlegroups.com
On Tue, Apr 20, 2010 at 11:10 AM, Tomei Ningen <tomei....@yahoo.com> wrote:
> I must say from a platform implementor's point of view, we find this
> event handling performance bug really puts our devices at a
> disadvantage when compared to other devices, such as the iPhone.
>
> Even on the Nexus One, when you start the photo gallery app and drag a
> photo around, the response is sooooooo sluggish compared to the
> iPhone.
>
> I suggest the Android team spend much time on this.
>

Can we avoid having this thread devolve into a dogpile-on-the-devs
type of thing? I'd much prefer to keep the thread on-topic with
constructive engineering & design discussion on how to solve the
problem at hand (Thanks Robert!).


-san
--
San Mehat  |  Staff Software Engineer  |  Android  |  Google Inc.
415.366.6172 (s...@google.com)

Christopher Tate

Apr 22, 2010, 8:12:39 PM
to android-...@googlegroups.com
Just FYI....

On Tue, Apr 20, 2010 at 9:07 AM, Robert Green <rbgr...@gmail.com> wrote:
> I would really
> love to time the block from the line after readEvent() to the end of
> the enclosing while().

If you're talking about the primary KeyInputQueue processing, the
enclosing while() is an infinite loop; I think you mean you want to
time from the return of readEvent() [i.e. the arrival of the event
data from the native event hub into the Java-language input processing
code] to the point where addLocked() is called to put it on the queue
for outbound dispatch. Is that right?

On a G1 or myTouch 3G [dream or sapphire hw platform], I see that
particular slice of processing taking around 75-90 microseconds on
average. The minimum is tightly bounded at about 60 microseconds. The
max I've observed is quite high (over 1 millisecond) because it is
affected by things like inopportune system-process GC activity.
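
For anyone who wants to take this kind of measurement on a custom build, a minimal sketch of the bookkeeping might look like the following. It is not the instrumentation actually used for the numbers above; `SliceTimer` and its methods are hypothetical, and it simply accumulates minimum, average, and maximum elapsed time from pairs of `System.nanoTime()` readings.

```java
// Hypothetical instrumentation helper: accumulates min / average / max
// elapsed time for one slice of code, given System.nanoTime() timestamps.
final class SliceTimer {
    private long count;
    private long totalNanos;
    private long minNanos = Long.MAX_VALUE;
    private long maxNanos;

    // Record one measurement. Intended use (names assumed):
    //   long t0 = System.nanoTime();          // right after the event is read
    //   ... processing under test ...
    //   timer.record(t0, System.nanoTime());  // right before it is queued
    void record(long startNanos, long endNanos) {
        long elapsed = endNanos - startNanos;
        count++;
        totalNanos += elapsed;
        if (elapsed < minNanos) {
            minNanos = elapsed;
        }
        if (elapsed > maxNanos) {
            maxNanos = elapsed;
        }
    }

    long count() {
        return count;
    }

    double averageMicros() {
        return count == 0 ? 0.0 : totalNanos / 1000.0 / count;
    }

    double minMicros() {
        return count == 0 ? 0.0 : minNanos / 1000.0;
    }

    double maxMicros() {
        return maxNanos / 1000.0;
    }
}
```

Reporting the maximum separately from the average matters here because occasional outliers, such as a GC pause landing inside the measured slice, would otherwise distort the picture.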

Short answer: that particular piece of dispatch is not where the CPU
is going. I do suspect that there is a lot of CPU load inside the
touchpanel driver itself, but don't currently have measurements for
that.

Robert Green

May 14, 2010, 6:36:08 PM
to android-platform
Has any additional progress been made on this?
