Issues with Channel connection status

152 views
Skip to first unread message

Kristopher Giesing

unread,
Oct 22, 2012, 2:35:08 AM10/22/12
to google-a...@googlegroups.com
I'm in what should be the final stages of development prior to deploying my app.  But I can't figure out how to use the Channel API effectively.

What I need out of the system is a way to notify a user of certain events.  If the user is connected by a channel, I use that, but if they aren't, I want to send a push notification.  That means I need some way of telling whether or not a user is connected.

I should be able to use the connected/disconnected API calls for this, but I can't tell that they are reliable enough.  After struggling with mysterious failures to send messages, I started just logging every connect and disconnect message for each user, at the moment my app receives the calls.

I've enclosed a session (below the sig since it's long) with the token timeout set to 1 minute.  This is not a production value, but I'm setting it that way to test whether a user's session will survive intact when the token does time out.  So what I should see below is an initial connect, followed by a disconnect/connect pair every minute.  What I get instead is a seemingly completely random scattering.  The only explanation I can come up with for the lack of a pattern below is that there is a highly variable latency for connect and disconnect calls, a latency so high that it completely obscures what should be a per-minute pattern.

Is there any way to tell reliably whether a message sent on a channel either will be received or has been received?  I think what I really want is async http semantics, like the async URL Fetch service provides, but ChannelService.sendMessage returns void.

Thanks,

- Kris

"2012-10-19 05:14:41.074: CONNECT"
"2012-10-19 05:14:51.975: DISCONNECT"
"2012-10-19 05:14:59.005: CONNECT"
"2012-10-19 05:16:02.244: CONNECT"
"2012-10-19 05:16:32.312: DISCONNECT"
"2012-10-19 05:17:06.129: CONNECT"
"2012-10-19 05:17:37.564: DISCONNECT"
"2012-10-19 05:18:10.850: CONNECT"
"2012-10-19 05:18:43.400: DISCONNECT"
"2012-10-19 05:19:15.571: CONNECT"
"2012-10-19 05:20:02.696: DISCONNECT"
"2012-10-19 05:20:20.306: CONNECT"
"2012-10-19 05:21:18.098: DISCONNECT"
"2012-10-19 05:21:24.552: CONNECT"
"2012-10-19 05:21:54.450: DISCONNECT"
"2012-10-19 05:22:28.524: CONNECT"
"2012-10-19 05:22:58.342: DISCONNECT"
"2012-10-19 05:23:35.089: CONNECT"
"2012-10-19 05:24:05.763: DISCONNECT"
"2012-10-19 05:24:41.071: CONNECT"
"2012-10-19 05:25:11.415: DISCONNECT"
"2012-10-19 05:25:46.249: CONNECT"
"2012-10-19 05:26:17.876: DISCONNECT"
"2012-10-19 05:26:51.309: CONNECT"
"2012-10-19 05:27:21.467: DISCONNECT"
"2012-10-19 05:27:56.865: CONNECT"
"2012-10-19 05:28:26.661: DISCONNECT"
"2012-10-19 05:29:01.161: CONNECT"
"2012-10-19 05:29:30.979: DISCONNECT"
"2012-10-19 05:30:06.248: CONNECT"
"2012-10-19 05:30:09.558: DISCONNECT"
"2012-10-19 05:30:36.361: DISCONNECT"
"2012-10-19 05:54:38.777: CONNECT"
"2012-10-19 05:54:58.359: DISCONNECT"
"2012-10-19 05:55:02.256: CONNECT"
"2012-10-19 05:56:05.113: CONNECT"
"2012-10-19 05:56:43.411: DISCONNECT"
"2012-10-19 05:57:08.320: CONNECT"
"2012-10-19 05:57:38.325: DISCONNECT"
"2012-10-19 05:58:12.434: CONNECT"
"2012-10-19 05:58:42.443: DISCONNECT"
"2012-10-19 05:59:16.679: CONNECT"
"2012-10-19 05:59:46.686: DISCONNECT"
"2012-10-19 06:00:21.892: CONNECT"
"2012-10-19 06:01:06.169: DISCONNECT"
"2012-10-19 06:01:25.725: CONNECT"
"2012-10-19 06:02:09.637: DISCONNECT"
"2012-10-19 06:02:32.373: CONNECT"
"2012-10-19 06:03:03.003: DISCONNECT"
"2012-10-19 06:03:37.183: CONNECT"
"2012-10-19 06:04:14.422: DISCONNECT"
"2012-10-19 06:04:44.372: CONNECT"
"2012-10-19 06:05:15.403: DISCONNECT"
"2012-10-19 06:05:52.034: CONNECT"
"2012-10-19 06:06:27.457: DISCONNECT"
"2012-10-19 06:06:59.136: CONNECT"
"2012-10-19 06:07:56.649: DISCONNECT"
"2012-10-19 06:08:04.653: CONNECT"
"2012-10-19 06:09:03.384: DISCONNECT"
"2012-10-19 06:09:09.987: CONNECT"
"2012-10-19 06:09:54.077: DISCONNECT"
"2012-10-19 06:10:09.552: DISCONNECT"
"2012-10-19 06:10:15.505: CONNECT"
"2012-10-19 06:10:30.732: DISCONNECT"
"2012-10-19 06:10:44.468: CONNECT"
"2012-10-19 06:11:50.181: CONNECT"
"2012-10-19 06:12:20.296: DISCONNECT"
"2012-10-19 06:12:55.736: CONNECT"
"2012-10-19 06:13:25.753: DISCONNECT"
"2012-10-19 06:14:04.029: CONNECT"
"2012-10-19 06:15:02.722: DISCONNECT"
"2012-10-19 06:15:13.511: CONNECT"
"2012-10-19 06:15:43.393: DISCONNECT"
"2012-10-19 06:16:16.961: CONNECT"
"2012-10-19 06:16:46.845: DISCONNECT"
"2012-10-19 06:17:21.435: CONNECT"
"2012-10-19 06:17:51.345: DISCONNECT"
"2012-10-19 06:18:25.748: CONNECT"
"2012-10-19 06:18:57.003: DISCONNECT"
"2012-10-19 06:19:33.036: CONNECT"
"2012-10-19 06:20:07.889: DISCONNECT"
"2012-10-19 06:20:38.335: CONNECT"
"2012-10-19 06:21:21.366: DISCONNECT"
"2012-10-19 06:21:42.598: CONNECT"
"2012-10-19 06:22:12.590: DISCONNECT"
"2012-10-19 06:22:50.721: CONNECT"
"2012-10-19 06:23:20.663: DISCONNECT"
"2012-10-19 06:23:55.112: CONNECT"
"2012-10-19 06:24:51.063: DISCONNECT"
"2012-10-19 06:24:59.041: CONNECT"
"2012-10-19 06:25:57.816: DISCONNECT"
"2012-10-19 06:26:04.661: CONNECT"
"2012-10-19 06:27:03.207: DISCONNECT"
"2012-10-19 06:27:08.868: CONNECT"
"2012-10-19 06:28:03.758: DISCONNECT"
"2012-10-19 06:28:12.640: CONNECT"
"2012-10-19 06:29:09.578: DISCONNECT"
"2012-10-19 06:29:16.758: CONNECT"
"2012-10-19 06:30:15.155: DISCONNECT"
"2012-10-19 06:30:20.795: CONNECT"
"2012-10-19 06:31:18.205: DISCONNECT"
"2012-10-19 06:31:24.974: CONNECT"
"2012-10-19 06:32:21.439: DISCONNECT"
"2012-10-19 06:32:29.682: CONNECT"
"2012-10-19 06:33:25.044: DISCONNECT"
"2012-10-19 06:33:34.026: CONNECT"
"2012-10-19 06:34:24.214: DISCONNECT"
"2012-10-19 06:34:38.116: CONNECT"
"2012-10-19 06:35:26.098: DISCONNECT"
"2012-10-19 06:35:44.511: CONNECT"
"2012-10-19 06:36:40.795: DISCONNECT"
"2012-10-19 06:36:51.662: CONNECT"
"2012-10-19 06:37:36.573: DISCONNECT"
"2012-10-19 06:37:57.916: CONNECT"
"2012-10-19 06:38:41.483: DISCONNECT"
"2012-10-19 06:39:03.649: CONNECT"
"2012-10-19 06:39:49.005: DISCONNECT"
"2012-10-19 06:40:10.589: CONNECT"
"2012-10-19 06:40:53.353: DISCONNECT"
"2012-10-19 06:41:16.548: CONNECT"
"2012-10-19 06:41:50.879: DISCONNECT"
"2012-10-19 06:42:23.300: CONNECT"
"2012-10-19 06:43:23.202: DISCONNECT"
"2012-10-19 06:43:30.644: CONNECT"
"2012-10-19 06:44:24.708: DISCONNECT"
"2012-10-19 06:44:35.994: CONNECT"
"2012-10-19 06:45:30.391: DISCONNECT"
"2012-10-19 06:45:43.469: CONNECT"
"2012-10-19 06:46:41.716: DISCONNECT"
"2012-10-19 06:46:49.828: CONNECT"
"2012-10-19 06:47:49.488: DISCONNECT"
"2012-10-19 06:47:57.553: CONNECT"
"2012-10-19 06:48:57.002: DISCONNECT"
"2012-10-19 06:49:03.818: CONNECT"
"2012-10-19 06:50:03.328: DISCONNECT"
"2012-10-19 06:51:27.847: DISCONNECT"
"2012-10-21 22:23:05.992: CONNECT"
"2012-10-21 22:23:38.257: DISCONNECT"
"2012-10-21 22:24:07.254: CONNECT"
"2012-10-21 22:24:09.013: DISCONNECT"
"2012-10-21 22:24:14.479: CONNECT"
"2012-10-21 22:25:26.859: CONNECT"
"2012-10-21 22:25:42.983: DISCONNECT"
"2012-10-21 22:25:54.581: CONNECT"
"2012-10-21 22:26:22.421: DISCONNECT"
"2012-10-21 22:27:00.062: CONNECT"
"2012-10-21 22:27:30.075: DISCONNECT"
"2012-10-21 22:28:05.398: CONNECT"
"2012-10-21 22:29:01.563: DISCONNECT"
"2012-10-21 22:29:09.637: CONNECT"
"2012-10-21 22:30:07.419: DISCONNECT"
"2012-10-21 22:30:14.288: CONNECT"
"2012-10-21 22:31:14.088: DISCONNECT"
"2012-10-21 22:31:20.295: CONNECT"
"2012-10-21 22:32:18.878: DISCONNECT"
"2012-10-21 22:32:25.300: CONNECT"
"2012-10-21 22:33:24.893: DISCONNECT"
"2012-10-21 22:33:30.138: CONNECT"
"2012-10-21 22:34:27.793: DISCONNECT"
"2012-10-21 22:34:35.420: CONNECT"
"2012-10-21 22:34:43.702: DISCONNECT"
"2012-10-21 22:35:35.213: DISCONNECT"
"2012-10-21 22:36:56.076: CONNECT"
"2012-10-21 22:36:57.248: DISCONNECT"
"2012-10-21 22:37:02.652: CONNECT"
"2012-10-21 22:38:06.956: CONNECT"
"2012-10-21 22:39:05.265: DISCONNECT"
"2012-10-21 22:39:11.678: CONNECT"
"2012-10-21 22:39:50.268: DISCONNECT"
"2012-10-21 22:40:17.913: CONNECT"
"2012-10-21 22:40:47.996: DISCONNECT"
"2012-10-21 22:41:22.548: CONNECT"
"2012-10-21 22:41:52.565: DISCONNECT"
"2012-10-21 22:42:27.822: CONNECT"
"2012-10-21 22:43:20.529: DISCONNECT"
"2012-10-21 22:43:31.946: CONNECT"
"2012-10-21 22:44:30.647: DISCONNECT"
"2012-10-21 22:44:36.733: CONNECT"
"2012-10-21 22:45:22.905: DISCONNECT"
"2012-10-21 22:45:41.509: CONNECT"
"2012-10-21 22:46:12.720: DISCONNECT"
"2012-10-21 22:46:46.002: CONNECT"
"2012-10-21 22:47:15.839: DISCONNECT"
"2012-10-21 22:49:10.607: DISCONNECT"
"2012-10-21 22:57:25.766: CONNECT"

PS. here are the stats on connect and disconnect for the period in question.  Note that I get more connects than disconnects (across all users); all users were in fact logged off before I took this snapshot.

Endpoint/requests/mSec/latency

/_ah/channel/connected/      69  34  129 ms
/_ah/channel/disconnected/  67  45  140 ms

Aleksei Rovenski

unread,
Oct 22, 2012, 3:46:43 PM10/22/12
to google-a...@googlegroups.com
My experience with presence notifications is also that they are unreliable.
Channel API has some problems, you can read this topic for more:
https://groups.google.com/forum/?fromgroups=#!searchin/google-appengine/channel$20unreliable/google-appengine/lqaCMEj8IyQ/7eWTcLqJxWsJ

I have created an issue some time ago re presences problem, please star it.
It is as simple as adding a timestamp to presence notification.
http://code.google.com/p/googleappengine/issues/detail?id=8151

PS Btw for about a month now there is a new problem that working channel stops receiving messages w/o any errors for undefined period of time and then may start receiving them again...

Kristopher Giesing

unread,
Oct 23, 2012, 2:10:06 AM10/23/12
to google-a...@googlegroups.com
Yeah, I already starred that one.

So... what's the solution?  Is there one or do I need to roll my own channel replacement?

Aleksei Rovenski

unread,
Oct 23, 2012, 4:21:58 AM10/23/12
to google-a...@googlegroups.com
I don't know of any proper solution I'm afraid except to use other service like Pusher maybe.
In my app there is a workaround, clients contact server quite often, so when there is a false disconnect server returns them special code.
If they are actually connected, they simply inform the server that they are and life goes on. Most of the time this happens behind the scenes.
However this is ugly indeed, when there is a reconnect, there is often a bunch of requests when client and server keep trying to figure out whats going on.
Actually when I look at my channels code, it is basically covered with all kinds of workarounds like this :(

вторник, 23 октября 2012 г., 9:10:06 UTC+3 пользователь Kristopher Giesing написал:

Kristopher Giesing

unread,
Oct 28, 2012, 4:54:35 PM10/28/12
to google-a...@googlegroups.com
I'm going to migrate off the channel API and replace it with a hosted node.js instance running socket.io.  If people are interested I can post the resulting solution somewhere.

- Kris

Kristopher Giesing

unread,
Nov 23, 2012, 9:38:59 PM11/23/12
to google-a...@googlegroups.com
I have an update on plans to migrate off the channel APIs.

I looked at both socket.io and sockjs, and neither one of them turned out to be acceptable.  socket.io's error handling has some serious issues, and sockjs doesn't pass cookies in the connection request (intentionally - they claim it's a security hole if anyone does that, but they're making assumptions about cookie handling that in my case aren't true).

So here's where I ended up:

- GAE instance continues as usual, for now
- Node.js instance runs a low-level web socket implementation.  I used websocket-node, but ws probably would have worked as well.
- Node.js additionally runs a low-level Apple push notification implementation
- On the client, which is a hybrid iOS app, I have a native (Objective-C) implementation of the web socket client, called SocketRocket.  SocketRocket, like socket.io, has some error handling issues, but there's no other real option and I was able to work around them.
- The HTML layer then talks to the native layer over a local-http bridge (intercepted using NSURLProtocol subclassing, which is a standard trick I've used in a bunch of other places)

I used an Objective-C library rather than the UIWebView implementation of WebSocket because I'm supporting older iOS devices that don't have WebSocket support. However, this tradeoff means that I don't get cross-browser support, since many browsers don't have sufficient support for WebSockets to connect to the service (websocket-node only supports the latest RFC versions of the WebSocket protocol, not earlier Hybi versions).

It all works pretty well.  I now understand why the connection state works the way it does in GAE; there is almost certainly a proxy server somewhere in GAE land that's holding the channel connection separate from the GAE app itself.  However, given that architecture there are a bunch of things that the Channel API failed on, which were pretty trivial to deliver in my replacement:

- I can send to multiple clients in one request (fan-out)
- I can send multiple messages in one request (batching)
- I can instantly fall back to Apple push notifications if a client is not connected (and the message format is the same, so the application level in the client doesn't have to care how the message was delivered)
- The request is synchronous so I get a result status back, with individual reporting on each client and each message
-- Note though that the result status isn't bulletproof; it just reports connection up/down at the time Node.js issues the socket write.  Since the socket write is asynchronous, it won't report errors that happen after dispatch.  But it's good enough for my purposes.

Some of the tradeoffs I've made mean that the work I've done isn't necessarily applicable to another project.  So I'm not planning to post this e.g. to github for now.  However, I wanted to post what I've done in case others want to pursue a similar path.  The node.js service is only about 100 lines of code; most of the tricky bits had to do with shunting events around on the client and providing a proper abstraction for SocketRocket's error reporting.

Cheers,

- Kris
Reply all
Reply to author
Forward
0 new messages