[erlang-questions] R15B02 ssl dropping byte(s)?

sasa

Nov 2, 2012, 1:33:17 PM
to erlang-questions
Hello,

Today I have migrated my production servers to R15B02 (previously they ran on R14).
Generally, everything works fine. However, I occasionally notice some strange behavior.
I'm not sure if Erlang is to blame, but it's a strange coincidence that this started occurring right after the migration, when the system has been in production for almost two years.

In my system, I am constantly fetching data from an external provider via ssl. The code is roughly the following:

loop(Socket) ->
    %% Passive receive of whatever bytes are available, 5 second timeout.
    ssl:recv(Socket, 0, 5000),
    ...
    loop(Socket).

After switching to R15, I have noticed that occasionally a byte gets "lost". This always happens after 5 or 10 seconds, i.e. after one or two receive timeouts, which leads me to suspect that a race condition may be occurring in ssl.
Could it be that the timeout occurs just as the bytes start to arrive, and the first one is discarded?
Again, I didn't notice this behavior on R14.

Best regards,
Sasa

Dmitry Kolesnikov

Nov 2, 2012, 1:49:53 PM
to sasa, erlang-questions
Hello,

I also had an issue with the first byte in ssl.
Please check this thread:
http://erlang.org/pipermail/erlang-questions/2012-August/068632.html

This could be the reason in your case as well.

- Dmitry
> _______________________________________________
> erlang-questions mailing list
> erlang-q...@erlang.org
> http://erlang.org/mailman/listinfo/erlang-questions

sasa

Nov 2, 2012, 2:27:25 PM
to erlang-questions
Thank you, I've checked the thread.
However, if I understand correctly, your problem was fragmentation.
I don't have that kind of problem: I have code which handles fragmentation, and it doesn't bother me if a short message is fragmented. In theory, the message could be fragmented into one-byte micro-messages and I would still recompose it correctly.

My problem arises because, after the receive timeout occurs, one byte is simply missing, i.e. it is not received on the next ssl:recv call.
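
For readers unfamiliar with what "handling fragmentation" means here, the following is a minimal sketch of a reassembly buffer. It is not the code from this thread: the module name `reassemble`, the function `feed/2`, and the newline-delimited framing are all assumptions made purely for illustration.

```erlang
-module(reassemble).
-export([feed/2]).

%% feed(Buffer, Chunk) -> {CompleteMessages, RestBuffer}
%% Hypothetical framing: each message ends with a newline byte.
%% Buffer holds bytes left over from earlier chunks; Chunk is the
%% binary returned by the latest receive.
feed(Buffer, Chunk) ->
    split(<<Buffer/binary, Chunk/binary>>, []).

split(Data, Acc) ->
    %% binary:split/2 splits at the first occurrence of the pattern only.
    case binary:split(Data, <<"\n">>) of
        [Line, Rest] -> split(Rest, [Line | Acc]);
        %% No delimiter left: keep the tail for the next chunk.
        [Rest] -> {lists:reverse(Acc), Rest}
    end.
```

With such a buffer, chunk boundaries do not matter: feeding `<<"he">>` and then `<<"llo\nwo">>` yields the message `<<"hello">>` and leaves `<<"wo">>` buffered. A byte that never arrives at all, as described above, cannot be repaired at this level.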

Ingela Andin

Nov 5, 2012, 5:08:41 AM
to sasa, erlang-questions
Hi!

The only thing that might have changed between R14 and R15 is the
timing, so that this happens to you more often. Using a timeout with
recv can cause you to, in effect, lose data: the timeout does not stop
the recv from being processed by ssl, it only causes recv to return
early. This is much like timing out a POST request to a web server: the
request may still have reached the server and been executed there.
I think that if you do not want to hang in recv you should use active
once instead of passive receive (recv).

Regards Ingela Erlang/OTP team - Ericsson AB
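
A minimal sketch of the active-once approach suggested above, under some assumptions: `Socket` is an already connected ssl socket owned by the calling process, and the module name `active_once_loop`, the `Handle` callback, and the 5 second idle interval are hypothetical choices for illustration. The timeout here lives in the caller's own `receive`, so when it fires there is no pending recv inside ssl that could consume data.

```erlang
-module(active_once_loop).
-export([loop/2, wait/2]).

%% Handle is a hypothetical fun(Data) supplied by the caller.
loop(Socket, Handle) ->
    %% Ask ssl to deliver exactly one data message to this process;
    %% the socket then returns to passive mode until re-armed.
    ok = ssl:setopts(Socket, [{active, once}]),
    wait(Socket, Handle).

wait(Socket, Handle) ->
    receive
        {ssl, Socket, Data} ->
            Handle(Data),
            loop(Socket, Handle);
        {ssl_closed, Socket} ->
            closed;
        {ssl_error, Socket, Reason} ->
            {error, Reason}
    after 5000 ->
        %% Nothing arrived for 5 seconds. Periodic work could go here;
        %% the socket is still armed, so we simply keep waiting.
        wait(Socket, Handle)
    end.
```

Whether `Data` arrives as a binary or a list depends on the socket's mode option, which this sketch does not set.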


sasa

Nov 5, 2012, 9:32:18 AM
to erlang-questions
Thank you for the response.

Obviously, active once should resolve my problem, and I will modify the code accordingly.

In the meantime, I have worked around the issue by raising the ssl:recv timeout to 15 seconds. In my environment this essentially means that the timeout never occurs.
I'd like to stress that in almost 2 years of production, I never experienced this issue with R14.
It is also very strange that I was always losing the first byte of the message and never more, which still leads me to believe that some bug might have been introduced regarding passive sockets.

However, active once seems like a cleaner solution, and I will therefore go ahead with that approach.

Best regards,
Sasa

Ingela Andin

Nov 6, 2012, 1:07:29 PM
to Erlang
Hi!

I think you will be better off using active once; however, see the
comment below ...

2012/11/5, sasa <sas...@gmail.com>:
> Thank you for the response.
>
> Obviously, active once should resolve my problem, and I will modify the
> code accordingly.
>
> In the meantime, I have resolved the issue by raising ssl:recv timeout to
> 15 seconds. In my environment this essentially means that timeout does not
> occur.
> I'd like to stress that in almost 2 years of production, I have never
> experienced this issue with R14.
> It is also very strange that I was always losing the first byte of the message
> and not more. Which still leads me to believe that some bug might have been
> introduced regarding passive sockets.

I agree that we could change the implementation of ssl:recv to handle
the timeout on the server side and so avoid this problem, and that is
how it should (TM) work! We will do that.

Regards Ingela Erlang/OTP team - Ericsson AB