Thank you very much for raising all the concerns and reviewing the code. It helps us revise the library and hopefully improve it.
I don't quite understand the concern here. Can you elaborate? How is the method not doing what is intended? How can we have a race condition?
The Apple APNS protocol is designer for performance and not reliability or "debuggability." Please read the specification of the protocol to understand the code further.
Essentially, Apple doesn't return an acknowledgement for each message. Rather, at the first instance with an invalid message (or any error case), Apple servers drop the connection and return a 6 byte error code. In the life time of the entire connection, APNS servers never write to the connection except for 6-bytes indicating the error and the identifier of the erroneous message and then closes the connection immediately.
Because of such design, checking for the error code at every message sacrifices performance significantly. The library will need to wait to check for an error code for a time-out period before it can assume that the message was sent successfully. Because of the TCP protocol and some OS buffer issues, the time out period will be in the order of tens of microseconds or so. That will take the library throughput from >600 messages/second to <50 messages/second. I opted for an optimistic error checking (just send messages till we get an error and deal with it), rather than a pessimistic design (check for every message for an error).
Regards,- Mahmood
> This is because the read() method will block for the lifetime of the
> entire connection until first 6 bytes are returned?
Correct!
> Doesn't it have some kind of timeout for waiting for input?
Not as far as I know. There are limitations related to being behind
NATs and such, where a connection dies silently, but that's a separate
problem and cannot be solved well unfortunately. The library wil just
detect it and restarts the connection.
> I think that some kind of improvement is to actually check if the read
> returns an error code or something else so the connection can continue
> (and the thread returns to wait on read()) and in the other hand when
> there is an error code returned to check that the socket actually
> closed and did not stay in some other non-stable state (or simply kept
> open...).
Can you elaborate on the proposal and be specific. What do you mean
by "if the read returns [..] or something else" or "connection can
continue". Please note that `sendMessage` is place where I check
whether a connection is closed or not and properly closes it.
Also, with Java API, there is no go mechanism to actually "check that
the socket actually closed"; the only reliable mechanism is to write
to a channel and see whether the write succeeded or not (and that's
not perfect either). So basically, you'll have just to close it,
which is what is done in sendMessage() in the main thread. You may
want to paraphrase the sentence as "force close the socket", rather
than "check".
Feel free to submit a patch to improve it.
- Mahmood