Binder: notifying clients of asynchronous events; weak binders


Adrian Taylor

Feb 25, 2009, 10:29:48 AM
to android-platform
Hi,

I'm currently trying to handle some cross-process interactions using
Binder. I have a few questions, but first some background.

The system I'm trying to implement is a callback from an Android native
system "service", whenever a certain event occurs[1]. The "client" which
receives this callback may be written in Java or native code.

My questions are:


1) Callbacks versus blocking calls.

My intention is:

* The service exposes two additional APIs on its existing IInterface:
"register listener" and "unregister listener".
* These take an instance of a new IInterface, which has a single
callback method indicating the event has occurred.
* The service maintains a vector of these registered listeners.
* The service calls a callback on each listener each time the event
occurs.

Is this the normal way to notify clients of events asynchronously?

Or alternatively, the client could make a single blocking call to the
service: "waitUntilEventOccurs". The service would only return from its
onTransact when the event occurred. Is that more normal in a Binder
world?


2) Abandoning blocking transactions.

If I use the "waitUntilEventOccurs" solution: how best should I cancel
an outstanding wait? One way is with another API that somehow signals
the existing blocked Binder thread and causes it to return immediately.
Or is there some means built into Binder?


(All subsequent questions assume I am using a genuine 'callback' rather
than waiting within onTransact calls).


3) Can weak binders be used?

The original OpenBinder[2] had a concept of a 'weak binder'. This would
be useful because I don't want my "service" to retain a reference to the
callback object if that callback object has been deleted by the client.

Weak binders do not appear to be used anywhere in the Android codebase,
and aren't present in the Java APIs. Using them in the C++ APIs appears
more complex than I'd expect if they were in common use[3].

I assume the lack of weak-binder stuff means there's always a better way
to do it in Android? (I believe I can avoid them by ensuring clients
always call "deregisterListener" unless they crash, in which case
linkToDeath should tell me).


4) Can strong binders be converted to weak binders? (Related).

If I write a strong binder, e.g. from the Java client, but then want to
store it as a weak reference on the service side, is there any way to do
this? Can I writeStrongBinder but readWeakBinder from the same parcel?


5) Weak references don't appear to work cross-process. (Related).

If I have an sp<> to the callback object at the client end, but only
wp<>s at the service end, then Binder appears to delete my callback
object. Is this expected behaviour, or should wp<> and sp<> observe
cross-process reference counts? If so, I've probably just made a silly
mistake.


5) Is the IBinder supposed to be the same between multiple transactions?

My service has two APIs, "register listener" and "deregister listener",
which are passed the very same parameter (the callback object) but at
different times. I'd hope that the IBinder address of the parameter
would be identical at the service end as well as the client end[4]. It
doesn't appear to be. Are IBinder pointers only guaranteed to be
identical within the same cluster of recursive calls, or am I doing
something wrong?

This makes it trickier to remove the listener from a list of listeners
at the service end. Is there a normal solution to this? I'd expect the
best way is to return some sort of token which allows the service later
to key into a KeyedVector.


6) Death.

If the client dies, I assume I can clean up if I use linkToDeath. But if
I ever get weak binders working, I'm assuming I wouldn't need to --
would the weak reference simply evaporate and become null if the client
died?


7) Robustness.

These callbacks can't be allowed to affect the primary function of the
service. How can I best ensure that, especially if the client is wedged
and takes an infinitely long time to process the callback? For example,
I am tempted to use IBinder.FLAG_ONEWAY. But what happens in that case
if there are many callbacks? Does Binder keep creating new Binder
threads in the client each time? Or do I eventually get an error
returned from the transact call?


8) Documentation...

Is there any documentation for sp, wp, and the native binder interfaces?
Yes, I know, native development is not supported :-) But if there are
any hints anywhere in the source tree it would be great to know where.


9) Are there good examples of such callbacks?

Is there such a callback anywhere within the Android source on which I
can model things? Especially if a single event might need calls to
multiple fragile listeners, which would help me work out the right way
to maintain the vector of listeners.


Thanks very much indeed for any hints. I hope your answers may be useful
to others who are trying to get their heads round this stuff too!

Regards

Adrian


[1] The event is screen draws, the service is SurfaceFlinger, and the
clients are VNC servers which wish to send the screen image to VNC
viewers. They want to know about the screen draws so they only send
network data when the image has changed, and don't use too much battery
power polling and comparing the screen image all the time. We are fully
aware we have zero chance of persuading you to add these APIs right now,
especially since even the existing screenshotting APIs are closed (cf.
Gerrit 8866), but we want to be prepared in case we ever think of a
cunning way to convince you :-)
[2] http://www.angryredplanet.com/~hackbod/openbinder/ Thanks for
posting that stuff, Dianne!
[3] myParcel.writeWeakBinder(myInterfaceWp.promote()->asBinder()), plus
null checks!
[4] http://developer.android.com/reference/android/os/IBinder.html

Dianne Hackborn

Feb 25, 2009, 8:57:59 PM
to android-...@googlegroups.com
Hi!


On Wed, Feb 25, 2009 at 7:29 AM, Adrian Taylor <adr...@macrobug.com> wrote:
> 1) Callbacks versus blocking calls.
>
> My intention is:
>
> * The service exposes two additional APIs on its existing IInterface:
>   "register listener" and "unregister listener".
> * These take an instance of a new IInterface, which has a single
>   callback method indicating the event has occurred.
> * The service maintains a vector of these registered listeners.
> * The service calls a callback on each listener each time the event
>   occurs.
>
> Is this the normal way to notify clients of events asynchronously?

Yep.  You probably want to make the callbacks oneway so that your server can't get stuck if a client is misbehaving.  Also be sure to linkToDeath() to clean up the callbacks if a client dies before unregistering.
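
The register/notify/clean-up pattern described here can be sketched in plain Java without any Android classes. Everything in this sketch is a hypothetical stand-in: `UpdateListener` plays the role of the callback IInterface, `ListenerRegistry` is the service-side bookkeeping, and `clientDied` is what a linkToDeath() recipient would call. In a real service the callback would be a oneway binder call rather than a direct method call.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the callback IInterface a client implements.
interface UpdateListener {
    void onEvent(int eventId);
}

// Hypothetical service-side bookkeeping for registered listeners.
class ListenerRegistry {
    private final List<UpdateListener> listeners = new ArrayList<>();

    synchronized void register(UpdateListener l) {
        if (!listeners.contains(l)) listeners.add(l);
    }

    synchronized void unregister(UpdateListener l) {
        listeners.remove(l);
    }

    // What a linkToDeath() recipient would call when the client process dies.
    synchronized void clientDied(UpdateListener l) {
        listeners.remove(l);
    }

    synchronized int size() {
        return listeners.size();
    }

    // Calls every listener. Iterates over a snapshot so that listeners can
    // register or unregister from inside the callback, and swallows failures
    // so that one broken client cannot stop delivery to the others.
    void notifyEvent(int eventId) {
        List<UpdateListener> snapshot;
        synchronized (this) {
            snapshot = new ArrayList<>(listeners);
        }
        for (UpdateListener l : snapshot) {
            try {
                l.onEvent(eventId);
            } catch (RuntimeException e) {
                // Ignored in this sketch; a real service would probably log it.
            }
        }
    }
}
```

Taking a snapshot before iterating is a design choice of this sketch: it keeps the lock from being held while client code runs.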
 
> Or alternatively, the client could make a single blocking call to the
> service: "waitUntilEventOccurs". The service would only return from its
> onTransact when the event occurred. Is that more normal in a Binder
> world?

This is less common, often because we don't want to block the main thread of applications for an indefinite amount of time.
 
> 2) Abandoning blocking transactions.
>
> If I use the "waitUntilEventOccurs" solution: how best should I cancel
> an outstanding wait? One way is with another API that somehow signals
> the existing blocked Binder thread and causes it to return immediately.
> Or is there some means built into Binder?

You would have to add another API that makes the server return from the other, blocked call.
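
A minimal sketch of that idea, in plain Java with no Binder involved: one method blocks until an event arrives, and a second method wakes it early. The class and method names (`EventWaiter`, `cancelWait`, `waiterCount`, `EventWaiterDemo`) are hypothetical and only illustrate the shape such an API could take on the server side.

```java
// Hypothetical server-side helper; none of these names are Binder API.
class EventWaiter {
    private final Object lock = new Object();
    private long events = 0;    // bumped by signalEvent()
    private long cancels = 0;   // bumped by cancelWait()
    private int waiters = 0;    // number of threads currently blocked

    // Blocks until the next event (returns true) or until another thread
    // calls cancelWait() (returns false).
    boolean waitUntilEventOccurs() throws InterruptedException {
        synchronized (lock) {
            long seenEvents = events;
            long seenCancels = cancels;
            waiters++;
            try {
                while (events == seenEvents && cancels == seenCancels) {
                    lock.wait();
                }
                return events != seenEvents;
            } finally {
                waiters--;
            }
        }
    }

    void signalEvent() { synchronized (lock) { events++; lock.notifyAll(); } }

    void cancelWait() { synchronized (lock) { cancels++; lock.notifyAll(); } }

    int waiterCount() { synchronized (lock) { return waiters; } }
}

class EventWaiterDemo {
    // Starts a waiting thread, cancels it, and reports what the wait returned.
    static boolean runCancelledWait() {
        EventWaiter waiter = new EventWaiter();
        boolean[] result = {true};
        Thread t = new Thread(() -> {
            try {
                result[0] = waiter.waitUntilEventOccurs();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        try {
            // Only cancel once the other thread is really blocked.
            while (waiter.waiterCount() == 0) Thread.sleep(1);
            waiter.cancelWait();
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println("wait returned " + runCancelledWait());
    }
}
```

Note that `cancelWait()` in this sketch wakes every blocked caller; cancelling one particular client would need a per-client token.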
 
> 3) Can weak binders be used?
>
> The original OpenBinder[2] had a concept of a 'weak binder'. This would
> be useful because I don't want my "service" to retain a reference to the
> callback object if that callback object has been deleted by the client.

Unfortunately, we haven't implemented weak references in the IPC protocol so you can't use them at this point for things that go across processes.  They do work for the basic local process reference counting, so if you do use them you need to be very very careful. :}
 
> I assume the lack of weak-binder stuff means there's always a better way
> to do it in Android? (I believe I can avoid them by ensuring clients
> always call "deregisterListener" unless they crash, in which case
> linkToDeath should tell me).

Nope, just not something we absolutely need so didn't take the time to implement. :}
 
> 4) Can strong binders be converted to weak binders? (Related).

Across processes, only if weak binders are implemented! ;)  Within the same process, sure.

> If I write a strong binder, e.g. from the Java client, but then want to
> store it as a weak reference on the service side, is there any way to do
> this? Can I writeStrongBinder but readWeakBinder from the same parcel?

When this works, a weak binder is a different data type, so you will always need to read it as weak and then promote.

> 5) Weak references don't appear to work cross-process. (Related).

This is true. :}
 
> 5) Is the IBinder supposed to be the same between multiple transactions?
>
> My service has two APIs, "register listener" and "deregister listener",
> which are passed the very same parameter (the callback object) but at
> different times. I'd hope that the IBinder address of the parameter
> would be identical at the service end as well as the client end[4]. It
> doesn't appear to be. Are IBinder pointers only guaranteed to be
> identical within the same cluster of recursive calls, or am I doing
> something wrong?

I'm not sure exactly what you are asking here.  An IBinder is guaranteed to be a unique identity for an object within a process.  If the IBinder reference goes around to other processes, gets received multiple times, or whatever else, the IBinder address you get will always be the same.  (Note that this is NOT true for the IInterface wrapper around its target IBinder.)
 
> This makes it trickier to remove the listener from a list of listeners
> at the service end. Is there a normal solution to this? I'd expect the
> best way is to return some sort of token which allows the service later
> to key into a KeyedVector.

Yes, it is very very common to have a KeyedVector<IBinder, ...> to keep track and look up clients or such.  Or in Java, HashMap<IBinder, ...>.
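
As a rough illustration of keying on the binder rather than on the interface wrapper, here is a sketch with stand-in types. `BinderToken` and `ListenerProxy` are hypothetical substitutes for IBinder and an IInterface wrapper; the only assumption carried over from the thread is that the wrapper may be a fresh object each time while the underlying binder identity stays the same.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an IBinder: compared by object identity only,
// because it does not override equals() or hashCode().
final class BinderToken { }

// Hypothetical stand-in for an IInterface wrapper; a new wrapper may be
// created every time the same binder is received.
final class ListenerProxy {
    private final BinderToken binder;

    ListenerProxy(BinderToken binder) { this.binder = binder; }

    BinderToken asBinder() { return binder; }
}

// Service-side table keyed on the binder, not on the wrapper object.
class ListenerTable {
    private final Map<BinderToken, ListenerProxy> byBinder = new HashMap<>();

    void register(ListenerProxy listener) {
        byBinder.put(listener.asBinder(), listener);
    }

    // Returns true if a listener with the same binder was registered.
    boolean unregister(ListenerProxy listener) {
        return byBinder.remove(listener.asBinder()) != null;
    }

    int size() { return byBinder.size(); }
}
```

With this layout, a "deregister" call that arrives with a different wrapper around the same binder still finds and removes the right entry.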
 
> 6) Death.
>
> If the client dies, I assume I can clean up if I use linkToDeath. But if
> I ever get weak binders working, I'm assuming I wouldn't need to --
> would the weak reference simply evaporate and become null if the client
> died?

They never become null.  You will just fail when you try to promote them to use the real object.  You'll still want to clean these out of your data structure in some way.
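
The same housekeeping problem shows up with Java's `java.lang.ref.WeakReference`, which can serve as a loose analogy: the reference object stays in the collection, and only the attempt to get a strong reference fails. The sketch below is that analogy only, not Binder's wp<> behaviour; `WeakListenerList` is a hypothetical name, and `get()` plays the role of promote().

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical list of weakly held listeners that prunes dead entries.
class WeakListenerList {
    private final List<WeakReference<Runnable>> refs = new ArrayList<>();

    // Stores a weak reference and returns it so callers can inspect it.
    WeakReference<Runnable> add(Runnable listener) {
        WeakReference<Runnable> ref = new WeakReference<>(listener);
        refs.add(ref);
        return ref;
    }

    // Calls every listener that can still be "promoted" and removes the
    // entries whose target is gone. Returns how many listeners were called.
    int notifyAndPrune() {
        int called = 0;
        Iterator<WeakReference<Runnable>> it = refs.iterator();
        while (it.hasNext()) {
            Runnable strong = it.next().get();  // analogous to promote()
            if (strong == null) {
                it.remove();                    // dead entry: clean it out
            } else {
                strong.run();
                called++;
            }
        }
        return called;
    }

    int size() { return refs.size(); }
}
```

Until `notifyAndPrune()` runs, a dead entry still counts towards `size()`, which is the point made above: something has to clean these out.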
 
> 7) Robustness.
>
> These callbacks can't be allowed to affect the primary function of the
> service. How can I best ensure that, especially if the client is wedged
> and takes an infinitely long time to process the callback? For example,
> I am tempted to use IBinder.FLAG_ONEWAY. But what happens in that case
> if there are many callbacks? Does Binder keep creating new Binder
> threads in the client each time? Or do I eventually get an error
> returned from the transact call?

FLAG_ONEWAY is the way to deal with this.  It means that when you hit the IPC, a transaction will be sent to the driver without ever blocking, so you are protected from misbehaving apps.

This doesn't involve creating new binder threads, but just enqueueing work into the target process to be handled by a thread in its thread pool.  Eventually that process may run out of space for new transactions, in which case I believe transact() will return an error code.
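
The behaviour described here (enqueue without blocking, fail once the target has no room left) can be modelled with a bounded queue. This is a toy model under stated assumptions, not how the Binder driver is implemented: `OnewayMailbox`, its capacity, and the boolean return value standing in for an error code are all hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Toy model of a oneway call: the sender enqueues work and returns at once.
class OnewayMailbox {
    private final BlockingQueue<Integer> pending;

    OnewayMailbox(int capacity) {
        pending = new ArrayBlockingQueue<>(capacity);
    }

    // Never blocks. Returns false when the receiver's buffer is full,
    // which stands in for transact() returning an error code.
    boolean sendOneway(int eventId) {
        return pending.offer(eventId);
    }

    // Receiver side: takes the next pending event, or null if there is none.
    Integer handleNext() {
        return pending.poll();
    }
}
```

A wedged receiver therefore costs the sender nothing but failed sends; the sender decides whether to drop the event or drop the client.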
 
> 8) Documentation...
>
> Is there any documentation for sp, wp, and the native binder interfaces?
> Yes, I know, native development is not supported :-) But if there are
> any hints anywhere in the source tree it would be great to know where.

Sorry, not at this point.  The wp and sp classes should be relatively simple, they are designed to be full-fledged smart pointers that prevent you from doing things the wrong way.  IBinder is...  well, more complicated. :}
 
> 9) Are there good examples of such callbacks?
>
> Is there such a callback anywhere within the Android source on which I
> can model things? Especially if a single event might need calls to
> multiple fragile listeners, which would help me work out the right way
> to maintain the vector of listeners.

The media system, surface flinger, and I believe audio flinger have fairly extensive binder interfaces, and I am pretty sure the media system also does callbacks.
 
> Thanks very much indeed for any hints. I hope your answers may be useful
> to others who are trying to get their heads round this stuff too!

I hope these answers are useful!

--
Dianne Hackborn
Android framework engineer
hac...@android.com

Note: please don't send private questions to me, as I don't have time to provide private support.  All such questions should be posted on public forums, where I and others can see and answer them.

Dave Sparks

Feb 26, 2009, 2:13:11 AM
to android-platform
MediaPlayerService and CameraService are examples of native binder
interfaces that implement asynchronous callbacks.

You have to be very careful with callback interfaces. It's easy to
create circular sp<> references across the binder interface that lead
to crashes with mysterious stack traces (e.g. binder worker thread
calls RefBase::decStrong() which calls a Bp or Bn destructor that
crashes).

Adrian Taylor

Feb 26, 2009, 5:31:32 AM
to android-...@googlegroups.com
Hi Dianne,

Thanks very much for the full answers (and thanks to Dave too). You've
answered everything I needed to know!

You say:

> An IBinder is guaranteed to be a unique identity for an object within
> a process. If the IBinder reference goes around to other processes,
> gets received multiple times, or whatever else, the IBinder address
> you get will always be the same.

For some reason, this wasn't the case when I tried this. The IBinder
address was different each time I passed it across. But I just tried it
again and it worked fine, so I must have been doing something daft :-)

Thanks again,

Adrian

Dianne Hackborn

Feb 26, 2009, 3:22:33 PM
to android-...@googlegroups.com
On Thu, Feb 26, 2009 at 2:31 AM, Adrian Taylor <adr...@macrobug.com> wrote:
> For some reason, this wasn't the case when I tried this. The IBinder
> address was different each time I passed it across. But I just tried it
> again and it worked fine, so I must have been doing something daft :-)

If the process releases all references on the object, and then gets the object again, it will get a different IBinder.  This is correct because at that point you have no more references to its identity so can't know if its identity changes.

Mike Hearn

Feb 26, 2009, 11:43:11 PM
to android-platform
> We are fully
> aware we have zero chance of persuading you to add these APIs right now,
> especially since even the existing screenshotting APIs are closed (cf.
> Gerrit 8866), but we want to be prepared in case we ever think of a
> cunning way to convince you :-)

I remain hopeful some way to do this without compromising security can
be found :-) If your VNC server doesn't require access to the
notification bar part of the screen, maybe the app-specific shared key
thing I mentioned could be applied to both screenshotting and VNC.

> [2] http://www.angryredplanet.com/~hackbod/openbinder/ Thanks for
> posting that stuff, Dianne!

This is just an incredibly interesting site for androidologists,
thanks for the link. You can easily see the evolution of ideas across
Be, then PalmSource, then Android. The "Looper/Handler/Message" APIs
seem to originate in BeOS. The whole idea of a process-transparent
operating system seems to originate with a never-released OS at Palm.
And now Android takes the intra-process messaging APIs from Be, the
binder/process/services model from Palm and adds the whole
Intents/Activities idea on top.

That's more than a decade of thinking about component frameworks and
OS design in Android. No wonder it feels a bit overwhelming at
first! :-)

A bit offtopic, but whilst we're talking about the Binder, a while
back on android-developers I posted this question:

http://groups.google.com/group/android-developers/browse_frm/thread/279c9484b9c0fdb2/e17a267ca44821a8

but it never got a response. I guess nobody knew the answer.

It is clear now from reading the OpenBinder docs that the Binder is a
lot more than a way to optimize out a few context switches. It is
actually meant to be a whole component architecture in which you never
have to think about processes again, and the system can dynamically
choose how to separate address spaces based on whatever policies it
wants. Android does exactly that. But the Binder isn't quite
transparent, because implementing a remoteable object is still a lot
more work than a local object.

In my own sync app I spent quite a long time separating the meat into
a service so the user could task-switch away and have the sync
continue running in the background rather than die. A big chunk of the
work was doing the IPC goop .... handling connect/disconnect/callback
registration/handling of dead objects etc.

So you can imagine that when I saw an Android app which just skipped
all that and implemented a service without using the Binder, I was
kind of surprised. I don't care about exposing my service to other
apps, so assuming services always run in the same VM as my frontend
activity would simplify life a lot. That *seems* to be a safe
assumption in Android today, but it's not *quite* clear, because that
obviously goes against the whole Binder philosophy of being process
transparent.

Is it safe to assume services will always run in the same process as
the activity, if you don't export it via the manifest? If so I'd be
happy to submit a patch documenting this trick, because I think a
common use of the Service API is going to be just to handle
multi-tasking rather than exporting app innards to the sea of components.

Dave Sparks

Feb 27, 2009, 12:56:06 AM
to android-platform
We would like to generate native marshaling code for binder interfaces
using AIDL (just like Java does today), but no one seems to ever get
around to writing the tool.

There are subtle things you have to consider when re-partitioning.
Callbacks across a process boundary always run on a different thread.
Callbacks within a process run on the same thread. This can lead to
deadlocks when the client holds a lock on a server call and the
callback tries to go for the same lock. The problem doesn't happen
when the server is in a different process.
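
The in-process half of this can be shown with a small sketch. It assumes a non-recursive lock, modelled here with a binary `Semaphore` because Java's built-in monitors are reentrant, and a "server" that simply runs the callback on the caller's thread. `SameThreadCallbackDemo` and its methods are hypothetical names; the cross-process case is not modelled.

```java
import java.util.concurrent.Semaphore;

class SameThreadCallbackDemo {
    // A binary semaphore behaves like a non-recursive mutex: the thread
    // that holds it cannot take it a second time.
    private final Semaphore lock = new Semaphore(1);

    // The in-process "server": runs the callback synchronously on the
    // caller's thread.
    private void serverCall(Runnable callback) {
        callback.run();
    }

    // Reports whether the callback was able to take the lock.
    private boolean callbackCanLock() {
        boolean[] acquired = {false};
        serverCall(() -> {
            // A blocking acquire() here would hang forever if the caller
            // still held the lock; tryAcquire() just reports the problem.
            acquired[0] = lock.tryAcquire();
            if (acquired[0]) lock.release();
        });
        return acquired[0];
    }

    // Client calls the server while holding the lock: the callback cannot
    // get the lock, so a blocking implementation would deadlock.
    boolean callWhileHoldingLock() {
        lock.acquireUninterruptibly();
        try {
            return callbackCanLock();
        } finally {
            lock.release();
        }
    }

    // Client releases the lock before calling the server: the callback
    // can take the lock without trouble.
    boolean callAfterReleasingLock() {
        lock.acquireUninterruptibly();
        lock.release();
        return callbackCanLock();
    }
}
```

Releasing the lock before making the call, as in `callAfterReleasingLock()`, is one common way to stay safe regardless of which process the server ends up in.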

Dianne Hackborn

Feb 27, 2009, 5:54:12 PM
to android-...@googlegroups.com
On Thu, Feb 26, 2009 at 8:43 PM, Mike Hearn <mh.in....@gmail.com> wrote:
> I remain hopeful some way to do this without compromising security can
> be found :-) If your VNC server doesn't require access to the
> notification bar part of the screen, maybe the app-specific shared key
> thing I mentioned could be applied to both screenshotting and VNC.

I'm sorry I dropped the discussion on this.  I don't think there is a basic issue about whether there is VNC, but just how it is exposed from both a security and architecture perspective.  I'll try to comment on the other thread soon.
 
> > [2] http://www.angryredplanet.com/~hackbod/openbinder/ Thanks for
> > posting that stuff, Dianne!
>
> This is just an incredibly interesting site for androidologists,
> thanks for the link. You can easily see the evolution of ideas across
> Be, then PalmSource, then Android. The "Looper/Handler/Message" APIs
> seem to originate in BeOS. The whole idea of a process-transparent
> operating system seems to originate with a never-released OS at Palm.
> And now Android takes the intra-process messaging APIs from Be, the
> binder/process/services model from Palm and adds the whole
> Intents/Activities idea on top.

That is certainly one of the sources for a lot of the Android architecture. :)  We also have a strong design element from many ex-Danger engineers (the source of Dalvik and such) as well as other sources such as WebTV.

That said, I would caution about extrapolating too much from the OpenBinder design documents -- the Android architecture borrows a number of things from it, but really isn't an evolution from it.
 

Sorry I missed that one.  The basic answer is: our goal is that, unless you do some specific things, when you write an .apk your programming model will be one thread, one process, period.  We really wanted to keep the standard model simple, because from hard experience with BeOS most developers are not prepared to deal with multiple threads, let alone multiple processes closely interacting.

So yes, it is perfectly acceptable to knowingly put your service in the same process as all of its clients, and take complete advantage of that.  The same goes for activities, receivers, and providers.
 
> It is clear now from reading the OpenBinder docs that the Binder is a
> lot more than a way to optimize out a few context switches. It is
> actually meant to be a whole component architecture in which you never
> have to think about processes again, and the system can dynamically
> choose how to separate address spaces based on whatever policies it
> wants. Android does exactly that. But the Binder isn't quite
> transparent, because implementing a remoteable object is still a lot
> more work than a local object.

Our goal for the version of Binder in Android is actually a fair bit narrower in scope than what OpenBinder is.  A lot of this is because, with the native programming language being Java, many of the higher-level facilities in OpenBinder don't really make sense or are redundant.  So when re-implementing it, we dropped a lot of stuff and just focused on what we needed for Android -- which basically turns it into a powerful IPC mechanism with a basic component model on top.

As far as transparency goes, I don't know if that has ever really been a goal in the way you mean.  Generally for the Binder we put it as "if you follow the model needed to write a remoteable object, then the process it lives in will be transparent."  So there is definitely up-front work needed to get that transparency. :)  This is basically a compromise: we can't really afford the overhead of having a really general remote communication mechanism that allows everything to be remoted (look at Java serialization as an end-point of that, which is way too much overhead), but having multiple processes is a key aspect of Android, and making it relatively easy to deal with that (mostly so far at the system level) is important.

Also over the years we have found that the interface you write for a Binder object is usually not what you want to expose to application developers.  For example, when you define a binder interface, you really need to think about how many calls will be going through it, to reduce the number of IPCs.  That tends to lead towards having a client-side wrapper for it that exposes a simpler programming interface.  In addition, having that client-side wrapper also makes maintenance a lot easier, since you have a place to stick backwards compatibility logic in the client-side code.
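
A small sketch of what such a client-side wrapper might look like, using stand-in types only. `RemoteDisplay` is a hypothetical binder-style interface with one coarse call, and `DisplayClient` is the hypothetical wrapper that application code would see; each invocation of `getSize()` stands for one IPC.

```java
// Hypothetical remote interface: one coarse call instead of one call per
// field. Each invocation stands for one IPC.
interface RemoteDisplay {
    int[] getSize();   // returns {width, height}
}

// Hypothetical client-side wrapper that exposes a simpler API and keeps
// the number of remote calls down by caching the last answer.
class DisplayClient {
    private final RemoteDisplay remote;
    private int[] cachedSize;   // null until the first remote call

    DisplayClient(RemoteDisplay remote) {
        this.remote = remote;
    }

    int getWidth() { return size()[0]; }

    int getHeight() { return size()[1]; }

    // Forces the next getter to ask the remote side again.
    void invalidate() { cachedSize = null; }

    private int[] size() {
        if (cachedSize == null) {
            cachedSize = remote.getSize();
        }
        return cachedSize;
    }
}
```

The wrapper is also the natural place for compatibility shims, since the remote interface can change while `getWidth()` and `getHeight()` stay the same for callers.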
 
> In my own sync app I spent quite a long time separating the meat into
> a service so the user could task-switch away and have the sync
> continue running in the background rather than die. A big chunk of the
> work was doing the IPC goop .... handling connect/disconnect/callback
> registration/handling of dead objects etc.

Yeah, sorry about causing so much work that probably wasn't necessary.  This is why I got the LocalService sample code structured the way it now is as we were getting close to the 1.0 release, to show a simple approach that can be used.  It would be good to have some more clear mentions of this in the SDK docs.
 
> Is it safe to assume services will always run in the same process as
> the activity, if you don't export it via the manifest? If so I'd be
> happy to submit a patch documenting this trick, because I think a
> common use of the Service API is going to be just to handle
> multi-tasking rather than exporting app innards to the sea of components.

Yep, this is a safe assumption.  We'd welcome a patch. :)

harshal dhake

Apr 26, 2020, 1:33:36 PM
to android-platform
Hello Dave,

You mentioned mysterious stack traces involving RefBase::decStrong().
I am facing a similar issue, although it does not happen every time.
I am using an sp<> for HAL-to-HAL communication.
Could you please tell me how to avoid this, or whether there is a debugging technique for finding the root cause?
When I run a backtrace on the tombstones, it always points to the same sp<>.
Any idea how I can avoid it?

Thanks.!

Regards,
Harshal