[android][java][cloud-speech] how can I know about the channel status?


David Edery

Mar 26, 2017, 12:28:45 PM
to grpc.io
Hi,

I'm using gRPC with the cloud speech API (v1beta1).
In my app, at a certain point, I need to stream the audio to the speech service and get the results as fast as I can.
For that, I need the managed-channel to be up and ready before streaming. The wiring of everything (creating the managed channel, creating the SpeechStub, the response stream observer and the request observer) takes roughly 500ms.
500ms is too much for my app to wait before streaming. This is why I prepare everything in advance and make sure that at the end of a recognition operation the full structure is ready for the next iteration.
However, I've noticed that after a long enough idle wait (I don't know how long exactly; a matter of minutes), if I try to stream the audio, everything acts as if all is well but I don't get any response (no beginning-of-speech event, no transcripts, no error).
I hypothesised that it has to do with the connectivity/idle state of the channel and decided to constantly shut the channel down and reconnect at 1-minute intervals (given, of course, that it's not busy). This solved the problem, but it's a workaround of course.
Is there a way to know the state of the channel? I saw that grpc-java issue #28 should address this with the ManagedChannel.getState/notifyWhenStateChanged APIs (rel 1.2.0), but it's not implemented yet.
I also saw that there's a health check protocol (https://github.com/grpc/grpc/blob/master/doc/health-checking.md) - does this feature work? would it be suitable for my needs?

When is the state API expected to land? I think that going forward this is the way to go from our app's perspective.

Thanks,
David.

Eric Anderson

Mar 27, 2017, 12:19:38 PM
to David Edery, grpc.io
On Sun, Mar 26, 2017 at 9:28 AM, David Edery <da...@intuitionrobotics.com> wrote:
500ms is too much for my app to wait before streaming. This is why I prepare everything in advance and make sure that at the end of a recognition operation the full structure is ready for the next iteration.

ManagedChannel connects lazily (on the first RPC), and I'd expect a good portion of the 500ms is creating the TCP connection and doing the TLS handshake. Eagerly connecting will be provided by the channel state API via getState(true).
 
However, I've noticed that after a long enough idle wait (I don't know how long exactly; a matter of minutes), if I try to stream the audio, everything acts as if all is well but I don't get any response (no beginning-of-speech event, no transcripts, no error).

That sounds like a network failure. Once we support keepalive without any active RPCs (I'm working on this now; it's about a week away; it may make the next release), it could detect the failure. But using that option with Google APIs is unsupported; the frontend doesn't want the additional traffic. At the moment we would suggest using ManagedChannelBuilder.idleTimeout so that the TCP connection is torn down during the inactive period and isn't left in a hung state when you want to do an RPC.
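For example, something along these lines (just a sketch; the one-minute value is arbitrary and the host/port are whatever you already use):

import java.util.concurrent.TimeUnit;

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

// Tear the idle TCP connection down after a minute with no active RPCs so it
// can't be left in a hung state when the next RPC starts.
ManagedChannel channel = ManagedChannelBuilder
    .forAddress("speech.googleapis.com", 443)
    .idleTimeout(1, TimeUnit.MINUTES)
    .build();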

I hypothesised that it has to do with the connectivity/idle state of the channel and decided to constantly shut the channel down and reconnect at 1-minute intervals (given, of course, that it's not busy). This solved the problem, but it's a workaround of course.

I don't think the issue is caused by a gRPC state change; it's generally caused by problems in the network. Using idleTimeout() will trigger gRPC to shut down the connection for you. To avoid the 500ms overhead later, you'd need the channel state API so you can ask the channel to re-connect each time it goes IDLE.

Is there a way to know the state of the channel? I saw that grpc-java issue #28 should address this with the ManagedChannel.getState/notifyWhenStateChanged APIs (rel 1.2.0), but it's not implemented yet.

Nope. Note that the API wouldn't tell you anything in this case, since the problem isn't likely caused by gRPC going idle. But if it was implemented it would provide you a way to "kick" gRPC to eagerly make a TCP connection.

I also saw that there's a health check protocol (https://github.com/grpc/grpc/blob/master/doc/health-checking.md) - does this feature work? would it be suitable for my needs?

That's more for load balancing (avoiding backends that aren't healthy). It wouldn't help you, as I don't think our public APIs provide such a service.

When is the state API expected to land? I think that going forward this is the way to go from our app's perspective.

It's currently scheduled for Q2. That isn't a promise, but gives an idea.

David Edery

Mar 28, 2017, 1:11:41 AM
to grpc.io, da...@intuitionrobotics.com
Thank you for your answer :)


On Monday, March 27, 2017 at 7:19:38 PM UTC+3, Eric Anderson wrote:
On Sun, Mar 26, 2017 at 9:28 AM, David Edery <da...@intuitionrobotics.com> wrote:
500ms is too much for my app to wait before streaming. This is why I prepare everything in advance and make sure that at the end of a recognition operation the full structure is ready for the next iteration.

ManagedChannel connects lazily (on the first RPC), and I'd expect a good portion of the 500ms is creating the TCP connection and doing the TLS handshake. Eagerly connecting will be provided by the channel state API via getState(true).

It doesn't seem like we're spending more time after the wiring, so I guess that something in the wiring flow already causes an eager connection. The flow of the wiring is:
1. Create the channel using ManagedChannelBuilder (without idleTimeout for now)
2. Create the speech client (SpeechGrpc.newStub(channel).withCallCredentials(MoreCallCredentials.from(credentials)))
3. Create a response StreamObserverImpl (implements StreamObserver<StreamingRecognizeResponse>). This is a simple internal object creation, so there's no connectivity there. I can probably skip this since there's no real value in creating a new instance every time, but it won't save much of the 500ms
4. Create a request observer (of type StreamObserver<StreamingRecognizeRequest>) by calling streamingRecognize on the speech client (which is of type SpeechGrpc.SpeechStub)

I didn't get into the details (yet), but I'm sure there's network activity in the flow described above. I know this because I got an exception about network activity when the flow was executed on the main (UI) thread (which doesn't allow network operations).
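For reference, the wiring boils down to roughly this (a sketch; the host/port and the credentials object are placeholders for what the app already uses, and the observer callbacks are elided):

import com.google.auth.Credentials;
import com.google.cloud.speech.v1beta1.SpeechGrpc;
import com.google.cloud.speech.v1beta1.StreamingRecognizeRequest;
import com.google.cloud.speech.v1beta1.StreamingRecognizeResponse;

import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;
import io.grpc.auth.MoreCallCredentials;
import io.grpc.stub.StreamObserver;

StreamObserver<StreamingRecognizeRequest> wireUp(Credentials credentials) {
  // 1. Create the channel (no idleTimeout for now).
  ManagedChannel channel =
      ManagedChannelBuilder.forAddress("speech.googleapis.com", 443).build();

  // 2. Create the speech client with the call credentials.
  SpeechGrpc.SpeechStub speechClient = SpeechGrpc.newStub(channel)
      .withCallCredentials(MoreCallCredentials.from(credentials));

  // 3. Response observer -- a plain object, no connectivity involved.
  StreamObserver<StreamingRecognizeResponse> responseObserver =
      new StreamObserver<StreamingRecognizeResponse>() {
        @Override public void onNext(StreamingRecognizeResponse response) { /* handle results */ }
        @Override public void onError(Throwable t) { /* handle failure */ }
        @Override public void onCompleted() { /* stream closed */ }
      };

  // 4. Request observer -- calling streamingRecognize starts the RPC,
  //    so this is where the I/O happens.
  return speechClient.streamingRecognize(responseObserver);
}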

 
However, I've noticed that after a long enough idle wait (I don't know how long exactly; a matter of minutes), if I try to stream the audio, everything acts as if all is well but I don't get any response (no beginning-of-speech event, no transcripts, no error).

That sounds like a network failure. Once we support keepalive without any active RPCs (I'm working on this now; it's about a week away; it may make the next release), it could detect the failure. But using that option with Google APIs is unsupported; the frontend doesn't want the additional traffic. At the moment we would suggest using ManagedChannelBuilder.idleTimeout so that the TCP connection is torn down during the inactive period and isn't left in a hung state when you want to do an RPC.

"the frontend doesn't want the addition traffic" == RPC calls are ok but anything else would be suspected as DDoS? (depends of course on the frequency of the keep alive)
 

I hypothesised that it has to do with the connectivity/idle state of the channel and decided to constantly shut the channel down and reconnect at 1-minute intervals (given, of course, that it's not busy). This solved the problem, but it's a workaround of course.

I don't think the issue is caused by a gRPC state change; it's generally caused by problems in the network. Using idleTimeout() will trigger gRPC to shut down the connection for you. To avoid the 500ms overhead later, you'd need the channel state API so you can ask the channel to re-connect each time it goes IDLE.

Yes. That is why I'm asking about the state API. It seems that this is the ideal solution for my problem 


Is there a way to know the state of the channel? I saw that grpc-java issue #28 should address this with the ManagedChannel.getState/notifyWhenStateChanged APIs (rel 1.2.0), but it's not implemented yet.

Nope. Note that the API wouldn't tell you anything in this case, since the problem isn't likely caused by gRPC going idle. But if it was implemented it would provide you a way to "kick" gRPC to eagerly make a TCP connection.

A:
So if I understand correctly (and please correct me if I'm wrong), once the state API is available the flow would be something like:
1. Create the channel (as described above) with idleTimeout + a listener on connectivity state changes
2. In case of a connectivity state change, go to #1
3. Prior to using the channel, call getState(true) to eagerly connect it (in case idleTimeout was reached and it's not connected), then do the actual streaming work

B:
Today, in step #1 (which doesn't include idleTimeout), if channel != null && !channel.isShutdown && !channel.isTerminated, I call channel.shutdownNow and immediately create a new ManagedChannel (which means, the way I understand it, that there's a channel in the process of shutting down while I immediately create another channel that is wiring up). Just to validate this point: is this flow OK? (shutting down one channel instance while creating another channel to the same host)

Given the future A and the current B, I assume that I will still need to take care of the channel shutdown at the end of the streaming operation. idleTimeout will not take care of it once the channel has been active, no? From the documentation of idleTimeout: "By default the channel will never go to idle mode after it leaves the initial idle mode". Is this a correct assumption?
Does the above flow (A+B) sound reasonable as a solution to the always-ready channel requirement?


I also saw that there's a health check protocol (https://github.com/grpc/grpc/blob/master/doc/health-checking.md) - does this feature work? would it be suitable for my needs?

That's more for load balancing (avoiding backends that aren't healthy). It wouldn't help you, as I don't think our public APIs provide such a service.

cool
 

When is the state API expected to land? I think that going forward this is the way to go from our app's perspective.

It's currently scheduled for Q2. That isn't a promise, but gives an idea.

Thanks. I'm very much looking forward to it.

Eric Anderson

Mar 31, 2017, 3:49:32 PM
to David Edery, grpc.io
On Mon, Mar 27, 2017 at 10:11 PM, David Edery <da...@intuitionrobotics.com> wrote:
4. Create a request observer (of type StreamObserver<StreamingRecognizeRequest>) by calling streamingRecognize on the speech client (which is of type SpeechGrpc.SpeechStub)

I didn't get into the details (yet), but I'm sure there's network activity in the flow described above. I know this because I got an exception about network activity when the flow was executed on the main (UI) thread (which doesn't allow network operations).

#4 creates an RPC. So that's where the I/O should be.

"the frontend doesn't want the addition traffic" == RPC calls are ok but anything else would be suspected as DDoS? (depends of course on the frequency of the keep alive)

I can't speak authoritatively, but I think it's more about the load and the lack of billing. If you aren't careful, keepalive pings can very easily eat up a significant portion of network/CPU. They are also mostly invisible, so it's very easy not to notice the unnecessary load.

Is there a way to know the state of the channel? I saw that grpc-java issue #28 should address this with the ManagedChannel.getState/notifyWhenStateChanged APIs (rel 1.2.0), but it's not implemented yet.

Nope. Note that the API wouldn't tell you anything in this case, since the problem isn't likely caused by gRPC going idle. But if it was implemented it would provide you a way to "kick" gRPC to eagerly make a TCP connection.

A:
So if I understand correctly (and please correct me if I'm wrong), once the state API is available the flow would be something like:
1. Create the channel (as described above) with idleTimeout + a listener on connectivity state changes
2. In case of a connectivity state change, go to #1
3. Prior to using the channel, call getState(true) to eagerly connect it (in case idleTimeout was reached and it's not connected), then do the actual streaming work

#2 should be calling getState(true). #3 should never be necessary; getState(true) basically does the first half of setting up an RPC, making sure that a connection is available, but then doesn't send an RPC
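To sketch it against the proposed API (not something you can run today, since getState/notifyWhenStateChanged aren't implemented yet; the method name keepReady is just illustrative):

import io.grpc.ConnectivityState;
import io.grpc.ManagedChannel;

// Ask the channel to connect whenever it is found idle, and re-register the
// listener on every state change so a later IDLE also gets kicked.
void keepReady(final ManagedChannel channel) {
  final ConnectivityState current = channel.getState(true /* requestConnection */);
  channel.notifyWhenStateChanged(current, new Runnable() {
    @Override
    public void run() {
      keepReady(channel);
    }
  });
}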

B:
Today, in step #1 (which doesn't include idleTimeout), if channel != null && !channel.isShutdown && !channel.isTerminated, I call channel.shutdownNow and immediately create a new ManagedChannel (which means, the way I understand it, that there's a channel in the process of shutting down while I immediately create another channel that is wiring up). Just to validate this point: is this flow OK? (shutting down one channel instance while creating another channel to the same host)

Shutting down a channel while creating another to the same host is safe. I probably would just check isShutdown; isTerminated can take some time since it needs to release resources. Semi-unrelated, but isTerminated == true implies isShutdown == true.
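In code, the recreate step would then be roughly this (a fragment; channel is your existing ManagedChannel field and the host/port are placeholders):

// Only isShutdown is checked; the old channel finishes terminating in the
// background while the replacement wires up.
if (channel != null && !channel.isShutdown()) {
  channel.shutdownNow();
}
channel = ManagedChannelBuilder
    .forAddress("speech.googleapis.com", 443)
    .build();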

Given the future A and the current B, I assume that I will still need to take care of the channel shutdown at the end of the streaming operation. idleTimeout will not take care of it once the channel has been active, no? From the documentation of idleTimeout: "By default the channel will never go to idle mode after it leaves the initial idle mode". Is this a correct assumption?
Does the above flow (A+B) sound reasonable as a solution to the always-ready channel requirement?

Hmm... that documentation is a bit misleading. I just sent out a PR to improve it.

idleTimeout doesn't shut down the channel, but it does cause it to go idle (i.e., the TCP connection is torn down). The part of the documentation you linked to starts with "By default"; that meant "if you don't call idleTimeout."

David Edery

Apr 4, 2017, 2:14:45 AM
to grpc.io, da...@intuitionrobotics.com


On Friday, March 31, 2017 at 10:49:32 PM UTC+3, Eric Anderson wrote:
On Mon, Mar 27, 2017 at 10:11 PM, David Edery <da...@intuitionrobotics.com> wrote:
4. Create a request observer (of type StreamObserver<StreamingRecognizeRequest>) by calling streamingRecognize on the speech client (which is of type SpeechGrpc.SpeechStub)

I didn't get into the details (yet), but I'm sure there's network activity in the flow described above. I know this because I got an exception about network activity when the flow was executed on the main (UI) thread (which doesn't allow network operations).

#4 creates an RPC. So that's where the I/O should be.

"the frontend doesn't want the addition traffic" == RPC calls are ok but anything else would be suspected as DDoS? (depends of course on the frequency of the keep alive)

I can't speak authoritatively, but I think it's more about the load and the lack of billing. If you aren't careful, keepalive pings can very easily eat up a significant portion of network/CPU. They are also mostly invisible, so it's very easy not to notice the unnecessary load.

Is there a way to know the state of the channel? I saw that grpc-java issue #28 should address this with the ManagedChannel.getState/notifyWhenStateChanged APIs (rel 1.2.0), but it's not implemented yet.

Nope. Note that the API wouldn't tell you anything in this case, since the problem isn't likely caused by gRPC going idle. But if it was implemented it would provide you a way to "kick" gRPC to eagerly make a TCP connection.

A:
So if I understand correctly (and please correct me if I'm wrong), once the state API is available the flow would be something like:
1. Create the channel (as described above) with idleTimeout + a listener on connectivity state changes
2. In case of a connectivity state change, go to #1
3. Prior to using the channel, call getState(true) to eagerly connect it (in case idleTimeout was reached and it's not connected), then do the actual streaming work

#2 should be calling getState(true). #3 should never be necessary; getState(true) basically does the first half of setting up an RPC, making sure that a connection is available, but then doesn't send an RPC

Just to be sure that I understand the flow: for #2, when the connectivity state changes, I don't need to rebuild the whole channel, I just need to call getState(true). Right?


B:
Today, in step #1 (which doesn't include idleTimeout), if channel != null && !channel.isShutdown && !channel.isTerminated, I call channel.shutdownNow and immediately create a new ManagedChannel (which means, the way I understand it, that there's a channel in the process of shutting down while I immediately create another channel that is wiring up). Just to validate this point: is this flow OK? (shutting down one channel instance while creating another channel to the same host)

Shutting down a channel while creating another to the same host is safe. I probably would just check isShutdown; isTerminated can take some time since it needs to release resources. Semi-unrelated, but isTerminated == true implies isShutdown == true.

great. will use only isShutdown
 

Given the future A and the current B, I assume that I will still need to take care of the channel shutdown at the end of the streaming operation. idleTimeout will not take care of it once the channel has been active, no? From the documentation of idleTimeout: "By default the channel will never go to idle mode after it leaves the initial idle mode". Is this a correct assumption?
Does the above flow (A+B) sound reasonable as a solution to the always-ready channel requirement?

Hmm... that documentation is a bit misleading. I just sent out a PR to improve it.

idleTimeout doesn't shut down the channel, but it does cause it to go idle (i.e., the TCP connection is torn down). The part of the documentation you linked to starts with "By default"; that meant "if you don't call idleTimeout."

Thank you for clarifying


There's another, probably unrelated, issue: a channel that hits the streaming limit. If I stream more than 65 seconds of audio using the same channel, I get an exception. I assume that the source of this exception is the speech API itself and not internal gRPC logic (is my assumption correct?). Currently I'm handling this by:
A. Not streaming more than 65 seconds of audio data
B. Once I get the final result from the speech API, I immediately create another channel using the above described flow

If my assumption is correct, I guess that that's the way to avoid the exception. If not, is there a way to re-use the channel by calling some kind of "reset" function? (just like your suggestion above for #2, where the channel is reused by calling getState(true) instead of creating a new channel)


David Edery

Apr 18, 2017, 1:16:57 AM
to grpc.io, da...@intuitionrobotics.com
ping :)

Eric Anderson

Apr 18, 2017, 6:40:37 PM
to David Edery, grpc.io
On Mon, Apr 17, 2017 at 10:16 PM, David Edery <da...@intuitionrobotics.com> wrote:
ping :)

You didn't include me in the to: line of your reply, so it got lost in the noise.
 
On Tuesday, April 4, 2017 at 9:14:45 AM UTC+3, David Edery wrote:
On Friday, March 31, 2017 at 10:49:32 PM UTC+3, Eric Anderson wrote:
On Mon, Mar 27, 2017 at 10:11 PM, David Edery <da...@intuitionrobotics.com> wrote:
A:
So if I understand correctly (and please correct me if I'm wrong), once the state API is available the flow would be something like:
1. Create the channel (as described above) with idleTimeout + a listener on connectivity state changes
2. In case of a connectivity state change, go to #1
3. Prior to using the channel, call getState(true) to eagerly connect it (in case idleTimeout was reached and it's not connected), then do the actual streaming work

#2 should be calling getState(true). #3 should never be necessary; getState(true) basically does the first half of setting up an RPC, making sure that a connection is available, but then doesn't send an RPC

Just to be sure that I understand the flow: for #2, when the connectivity state changes, I don't need to rebuild the whole channel, I just need to call getState(true). Right?

Yes.

There's another, probably unrelated, issue: a channel that hits the streaming limit. If I stream more than 65 seconds of audio using the same channel, I get an exception. I assume that the source of this exception is the speech API itself and not internal gRPC logic (is my assumption correct?). Currently I'm handling this by:
A. Not streaming more than 65 seconds of audio data
B. Once I get the final result from the speech API, I immediately create another channel using the above described flow

I'd have to see the error to say. There's not any hard-coded limit of 65 seconds in grpc, but networks can do strange things at times.

If you have to create a new channel, then that sounds like the network and not the speech API. If it were the speech API, I'd expect a failure, but creating a new RPC on the Channel would work fine.

If my assumption is correct, I guess that that's the way to avoid the exception. If not, is there a way to re-use the channel by calling some kind of "reset" function? (just like your suggestion above for #2, where the channel is reused by calling getState(true) instead of creating a new channel)

Any I/O error the channel experiences should automatically "reset" the channel. There should be no need to trigger something manually.

David Edery

Apr 23, 2017, 10:17:34 AM
to grpc.io, da...@intuitionrobotics.com


On Wednesday, April 19, 2017 at 1:40:37 AM UTC+3, Eric Anderson wrote:
On Mon, Apr 17, 2017 at 10:16 PM, David Edery <da...@intuitionrobotics.com> wrote:
ping :)
 
You didn't include me in the to: line of your reply, so it got lost in the noise.

Sorry. All I did was press "Post Reply" in the Groups web UI. Nothing intentional.
 
 
On Tuesday, April 4, 2017 at 9:14:45 AM UTC+3, David Edery wrote:
On Friday, March 31, 2017 at 10:49:32 PM UTC+3, Eric Anderson wrote:
On Mon, Mar 27, 2017 at 10:11 PM, David Edery <da...@intuitionrobotics.com> wrote:
A:
So if I understand correctly (and please correct me if I'm wrong), once the state API is available the flow would be something like:
1. Create the channel (as described above) with idleTimeout + a listener on connectivity state changes
2. In case of a connectivity state change, go to #1
3. Prior to using the channel, call getState(true) to eagerly connect it (in case idleTimeout was reached and it's not connected), then do the actual streaming work

#2 should be calling getState(true). #3 should never be necessary; getState(true) basically does the first half of setting up an RPC, making sure that a connection is available, but then doesn't send an RPC

Just to be sure that I understand the flow: for #2, when the connectivity state changes, I don't need to rebuild the whole channel, I just need to call getState(true). Right?

Yes.

Great, thank you :)
 

There's another, probably unrelated, issue: a channel that hits the streaming limit. If I stream more than 65 seconds of audio using the same channel, I get an exception. I assume that the source of this exception is the speech API itself and not internal gRPC logic (is my assumption correct?). Currently I'm handling this by:
A. Not streaming more than 65 seconds of audio data
B. Once I get the final result from the speech API, I immediately create another channel using the above described flow

I'd have to see the error to say. There's not any hard-coded limit of 65 seconds in grpc, but networks can do strange things at times.


io.grpc.StatusRuntimeException: OUT_OF_RANGE: Exceeded maximum allowed stream duration of 65 seconds.
    at io.grpc.Status.asRuntimeException(Status.java:545)
    at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:395)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:481)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$600(ClientCallImpl.java:398)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:513)
    at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:52)
    at io.grpc.internal.SerializingExecutor$TaskRunner.run(SerializingExecutor.java:154)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1133)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:607)
    at java.lang.Thread.run(Thread.java:761)

Eric Anderson

Apr 24, 2017, 12:44:32 PM
to David Edery, grpc.io
On Sun, Apr 23, 2017 at 7:17 AM, David Edery <da...@intuitionrobotics.com> wrote:
There's another, probably unrelated, issue: a channel that hits the streaming limit. If I stream more than 65 seconds of audio using the same channel, I get an exception. I assume that the source of this exception is the speech API itself and not internal gRPC logic (is my assumption correct?). Currently I'm handling this by:
A. Not streaming more than 65 seconds of audio data
B. Once I get the final result from the speech API, I immediately create another channel using the above described flow

I'd have to see the error to say. There's not any hard-coded limit of 65 seconds in grpc, but networks can do strange things at times.
io.grpc.StatusRuntimeException: OUT_OF_RANGE: Exceeded maximum allowed stream duration of 65 seconds.

That looks like a speech-api limitation. 65 seconds seems like 1 minute + some fuzz. And yep, the limit is part of the documentation.
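If you want to stay on a single long-lived channel, a rough sketch of restarting the RPC (not the channel) before that limit could look like this (the class, the 60-second cutoff, and the point about re-sending the streaming config are illustrative assumptions, not something from the speech docs):

import com.google.cloud.speech.v1beta1.SpeechGrpc;
import com.google.cloud.speech.v1beta1.StreamingRecognizeRequest;
import com.google.cloud.speech.v1beta1.StreamingRecognizeResponse;
import com.google.protobuf.ByteString;

import io.grpc.stub.StreamObserver;

/** Hypothetical helper: restarts the streaming RPC on the same channel before the ~65 s cap. */
class RecognizeStreamer {
  private static final long MAX_STREAM_MILLIS = 60_000;  // stay safely under the documented limit

  private final SpeechGrpc.SpeechStub speechClient;
  private final StreamObserver<StreamingRecognizeResponse> responseObserver;
  private StreamObserver<StreamingRecognizeRequest> requestObserver;
  private long streamStartMillis;

  RecognizeStreamer(SpeechGrpc.SpeechStub speechClient,
                    StreamObserver<StreamingRecognizeResponse> responseObserver) {
    this.speechClient = speechClient;
    this.responseObserver = responseObserver;
    restartStream();
  }

  private void restartStream() {
    if (requestObserver != null) {
      requestObserver.onCompleted();  // finish the old RPC cleanly
    }
    requestObserver = speechClient.streamingRecognize(responseObserver);
    streamStartMillis = System.currentTimeMillis();
    // The first request on each new RPC must carry the StreamingRecognizeConfig again.
  }

  void sendAudio(byte[] chunk) {
    if (System.currentTimeMillis() - streamStartMillis > MAX_STREAM_MILLIS) {
      restartStream();
    }
    requestObserver.onNext(StreamingRecognizeRequest.newBuilder()
        .setAudioContent(ByteString.copyFrom(chunk))
        .build());
  }
}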

David Edery

Sep 27, 2017, 5:05:58 AM
to grpc.io
Reviving this thread.

Hi,
I see that https://github.com/grpc/grpc-java/issues/2292 was marked as closed a week ago.
Does this conclude the issue of (in short) "understanding the state of the channel and acting accordingly"? (the details are in the thread, of course)

Addressing your last comment in this thread, "It's currently scheduled for Q2" (see below): is this the change you were referring to?

Thanks,
David.

Kun Zhang

Sep 28, 2017, 7:14:29 PM
to grpc.io


On Wednesday, September 27, 2017 at 2:05:58 AM UTC-7, David Edery wrote:
Reviving this thread.

Hi,
I see that https://github.com/grpc/grpc-java/issues/2292 was marked as closed a week ago.
Does this conclude the issue of (in short) "understanding the state of the channel and acting accordingly"? (the details are in the thread, of course)

Addressing your last comment in this thread, "It's currently scheduled for Q2" (see below): is this the change you were referring to?

I think so.