Hitting amqp_abort() in wait_frame_inner function

Rakesh K R

Jun 10, 2020, 9:44:23 AM
to rabbitmq-c-users
Hi,
Can someone help me understand in which scenario I can hit the amqp_abort() call in the code below from the rmq library?

if (AMQP_STATUS_TIMEOUT == res) {
  if (amqp_time_equal(deadline, state->next_recv_heartbeat)) {
    amqp_socket_close(state->socket, AMQP_SC_FORCE);
    return AMQP_STATUS_HEARTBEAT_TIMEOUT;
  } else if (amqp_time_equal(deadline, timeout_deadline)) {
    return AMQP_STATUS_TIMEOUT;
  } else if (amqp_time_equal(deadline, state->next_send_heartbeat)) {
    /* send heartbeat happens before we do recv_with_timeout */
    goto beginrecv;
  } else {
    amqp_abort("Internal error: unable to determine timeout reason");
  }
} else if (AMQP_STATUS_OK != res) {
  return res;
}

Alan Antonuk

Jun 10, 2020, 3:11:09 PM
to Rakesh K R, rabbitmq-c-users
That's one of those "this state should never be reached" kind of asserts. It usually means that there's a bug in rabbitmq-c somewhere (or an invalid assumption).

If you have code that reaches this, it would be useful to determine what the values of deadline, timeout_deadline, and the state struct look like.

-Alan



--
You received this message because you are subscribed to the Google Groups "rabbitmq-c-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-c-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/rabbitmq-c-users/35a40eb1-d651-41c1-be19-0ef0d806cf3co%40googlegroups.com.

Rakesh K R

Jun 15, 2020, 4:31:18 AM
to rabbitmq-c-users
Alan,
Instead of aborting, can't this API just return a specific return code in this scenario, so that the user application can reconnect to the RMQ server?
I feel client libraries should not abort this way.
Let me know your thoughts on this.



Alan Antonuk

Jun 15, 2020, 1:28:19 PM
to Rakesh K R, rabbitmq-c-users
This type of assert means that the library is in some very bad state and there is no sensible recovery (e.g., reconnecting may get into an even more invalid state), so crashing is the most sensible thing to do.

That being said, if you see this type of crash being hit, it means there's a bug in the library, and it should be fixed.

-Alan


Rakesh K R

Jun 21, 2020, 6:43:13 AM
to rabbitmq-c-users
Alan,
I have put some debug prints in the library to dump states. Let me know if you can find something.

Logs:
Result: -13, deadline: 2933148232333068, timeout_deadline: 2933388090115816
state: 0, channel_max: 65535, frame_max: 131072, heartbeat:60, next_recv_heartbeat:2933253177351812, next_send_heartbeat: 2933194756338051
inbound_offset: 0, target_size: 7, sock_inbound_offset: 8, sock_inbound_limit: 8
first_queued_frame is null
last_queued_frame is null
most_recent_api_result ------>
reply_type: 1, reply.id: 3932171, library_error: 0
most_recent_api_result <------
handshake_timeout->tv_sec: 12, tv_use: 0, internal_handshake_timeout->tv_sec: 12, tv_usec: 0
Internal error: unable to determine timeout reason


Alan Antonuk

Jun 21, 2020, 4:10:30 PM
to Rakesh K R, rabbitmq-c-users
Ok, it seems like you have identified a bug. Do you have a pared-down example that reproduces the bug, and/or a stack trace from when this assert is hit?

-Alan


Rakesh K R

Jun 23, 2020, 8:35:14 AM
to rabbitmq-c-users
Alan,
I don't have a specific example that hits this issue; it is random in nature.
I was able to get the stack trace and the state information just before abort() is called.
Let me know if you have any workaround for this issue.

I am invoking the consume API like this: ret = amqp_consume_message(conn, &envelope, NULL, 0);

Logs:
------->
Result: -13, deadline: 70744264525108, timeout_deadline: -1
state: 0, channel_max: 65535, frame_max: 131072, heartbeat:60, next_recv_heartbeat:70845054847582, next_send_heartbeat: 70785421816292
inbound_offset: 0, target_size: 7, sock_inbound_offset: 8, sock_inbound_limit: 8
first_queued_frame is null
last_queued_frame is null
most_recent_api_result ------>
reply_type: 1, reply.id: 3932171, library_error: 0
most_recent_api_result <------
handshake_timeout->tv_sec: 12, tv_use: 0, internal_handshake_timeout->tv_sec: 12, tv_usec: 0
<--------
Internal error: unable to determine timeout reason

7 stack frames.
/acp_ag_ae() [0x70fe7b]
/acp_ag_ae(amqp_simple_wait_frame_noblock+0x97) [0x711137]
/acp_ag_ae(amqp_consume_message+0x9f) [0x7153af]
/acp_ag_ae(ag_ae_rmq_poll+0x83) [0x5d10ae]
/acp_ag_ae(main+0xa62) [0x5bd3e9]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7faf23fe8f45]
/acp_ag_ae() [0x57bb39]

Alan Antonuk

Jun 23, 2020, 11:48:21 AM
to Rakesh K R, rabbitmq-c-users
What platform are you running this on? (OS, Version, and hardware platform)

FYI: a bug in a related part of the code was fixed recently (https://github.com/alanxz/rabbitmq-c/commit/c5cf965ed6d0233468fdea69e8a314f6d46b52c8). While this doesn't completely match the bug you're describing, it may fix it. If you haven't already, please try it with this newer version.

-Alan


Rakesh K R

Jun 24, 2020, 3:54:17 AM
to rabbitmq-c-users
Alan,
I am running this app on Ubuntu 14 with client library version 0.9 (release build).
Just FYI, my app is deployed in a cloud managed by k8s orchestration.

With the existing library, I tried a timeout value of NULL and then a non-zero value of {3,1}. The result is the same.
Anyway, I will try the version with the fix.

bigbig zhang

Nov 22, 2024, 6:36:47 AM
to rabbitmq-c-users
I encountered the same issue. The OS is Ubuntu 22.04, and the version of rabbitmq-c is v0.14.0. Is there a solution available now?