Exabgp and juniper Hold Timer Expired

Raphael Mazelier

Aug 26, 2014, 10:07:10 AM
to exabgp...@googlegroups.com
Hello all,

I use ExaBGP version 3.2.9 with a Juniper MX to anycast my DNS.
I recently reviewed my router logs and saw that the sessions were being dropped roughly every hour.

The exabgp log :

Aug 26 15:55:27 ig1-resolver-02 Tue, 26 Aug 2014 15:55:27 | INFO | 49124 | network | Peer 158.58.177.193 ASN 39605 out loop reset notification sent (4,0) Hold timer expired / Unspecific.
Aug 26 15:56:29 ig1-resolver-02 Tue, 26 Aug 2014 15:56:29 | INFO | 49124 | network | Connected to peer neighbor 158.58.177.193 local-ip 158.58.177.194 local-as 65000 peer-as 39605 router-id 158.58.177.194 family-allowed in-open (out)


The log on the juniper side :

Aug 26 13:55:27 cr1.pa3.par rpd[1397]: bgp_recv: peer 158.58.177.194 (External AS 65000): received unexpected EOF
Aug 26 13:55:27 cr1.pa3.par rpd[1397]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer 158.58.177.194 (External AS 65000) changed state from Established to Idle (event Closed) (instance master)
Aug 26 13:56:31 cr1.pa3.par rpd[1397]: bgp_parse_open_options: peer 158.58.177.194+45998 (proto): unsupported AF 2 SAFI 133
Aug 26 13:56:31 cr1.pa3.par rpd[1397]: bgp_parse_open_options: peer 158.58.177.194+45998 (proto): unsupported AF 2 SAFI 134
Aug 26 13:56:31 cr1.pa3.par rpd[1397]: bgp_process_caps: mismatch NLRI with 158.58.177.194 (External AS 65000): peer: <inet-unicast inet-multicast inet-vpn-unicast inet6-unicast inet-labeled-unicast inet6-vpn-unicast inet-flow inet-vpn-flow>(25239) us: <inet-unicast>(1)
Aug 26 13:56:32 cr1.pa3.par rpd[1397]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer 158.58.177.194 (External AS 65000) changed state from OpenConfirm to Established (event RecvKeepAlive) (instance master)

The exabgp.conf :

neighbor 158.58.177.193 {
description "will announce a route";
router-id 158.58.177.194;
local-address 158.58.177.194;
local-as 65000;
peer-as 39605;

static {
route 158.58.177.208/32 next-hop 158.58.177.194 med 150;
route 158.58.177.209/32 next-hop 158.58.177.194 med 100;
route 10.3.255.1/32 next-hop 158.58.177.194 med 150;
route 10.3.255.2/32 next-hop 158.58.177.194 med 100;
}
}

The juniper conf :

group ipv4-anycast-cust {
import no-routes;
family inet {
unicast;
}
export no-routes;
neighbor 158.58.177.194 {
import v4-RESOLVER;
peer-as 65000;
}
}

Any ideas ?

Regards,

Thomas Mangin

Aug 26, 2014, 1:03:33 PM
to exabgp...@googlegroups.com
Hello Raphael,

3.4.2 fixes an issue where ExaBGP could get stuck on one session (when you have more than one peer). The bug report looked a lot like yours.
Could you please upgrade to the latest git master or 3.4.2 and tell me if the problem is still present?
If this is not the same issue and the problem still exists with the latest version, could you please run ExaBGP with more debug output? The simplest way is to use "-d". Please send me the full logs off-list.

Sincerely,

Thomas

Raphael Mazelier

Aug 27, 2014, 9:01:44 AM
to exabgp...@googlegroups.com
Thank you Thomas, I will try 3.4.2 and let you know the result.

Regards,

Raphael Mazelier

Aug 27, 2014, 9:45:33 AM
to exabgp...@googlegroups.com
And no, the problem is still there:

Aug 27 14:58:20 ig1-resolver-02 Wed, 27 Aug 2014 14:58:20 | INFO     | 47892  | network       | Peer        10.3.4.1 ASN 39605   out loop, peer reset, message [notification sent (4,0)] error[Hold timer expired / Unspecific]
Aug 27 14:58:20 ig1-resolver-02 Wed, 27 Aug 2014 14:58:20 | INFO     | 47892  | network       | Peer  158.58.177.193 ASN 39605   out loop, peer reset, message [notification sent (4,0)] error[Hold timer expired / Unspecific]
Aug 27 14:58:24 ig1-resolver-02 Wed, 27 Aug 2014 14:58:24 | INFO     | 47892  | network       | Connected to peer neighbor 10.3.4.1 local-ip 10.3.4.53 local-as 65000 peer-as 39605 router-id 10.3.4.53 family-allowed in-open (out)
Aug 27 14:58:24 ig1-resolver-02 Wed, 27 Aug 2014 14:58:24 | INFO     | 47892  | network       | Connected to peer neighbor 158.58.177.193 local-ip 158.58.177.194 local-as 65000 peer-as 39605 router-id 158.58.177.194 family-allowed in-open (out)

Just a clarification: I had stripped down my config file. I have two peers, each in a different VRF on the Juniper side, so the whole configuration file is:

neighbor 10.3.4.1 {

        description "will announce a route";
        router-id 10.3.4.53;
        local-address 10.3.4.53;

        local-as 65000;
        peer-as 39605;

        static {
                route 10.3.255.1/32 next-hop 10.3.4.53 med 150;
                route 10.3.255.2/32 next-hop 10.3.4.53 med 100;
                route 158.58.177.208/32 next-hop 10.3.4.53 med 150;
                route 158.58.177.209/32 next-hop 10.3.4.53 med 100;

        }
}
neighbor 158.58.177.193 {
        description "will announce a route";
        router-id 158.58.177.194;
        local-address 158.58.177.194;
        local-as 65000;
        peer-as 39605;

        static {
                route 158.58.177.208/32 next-hop 158.58.177.194 med 150;
                route 158.58.177.209/32 next-hop 158.58.177.194 med 100;
                route 10.3.255.1/32 next-hop 158.58.177.194 med 150;
                route 10.3.255.2/32 next-hop 158.58.177.194 med 100;
        }
}








Thomas Mangin

Aug 27, 2014, 10:01:46 AM
to exabgp...@googlegroups.com
Raphael,

Sorry, and <insert swear word of the day>. Could you please run ExaBGP with -d and send me the full logs when it happens again?
It is likely to be one of those horrible one-liner bugs ...

Interestingly, it is ExaBGP which seems to be sending the NOTIFICATION message, and its timer which expired.

Thomas


Raphael Mazelier

Aug 27, 2014, 12:55:53 PM
to exabgp...@googlegroups.com
Hello,

Please find the debug log below. It is not clear whether it was ExaBGP or the Juniper that dropped the session first (see the NOTIFICATION lines):

Aug 27 18:43:23 ig1-resolver-02 Wed, 27 Aug 2014 18:43:23 | INFO     | 50337  | message       | Peer  158.58.177.193 ASN 39605   << KEEPALIVE
Aug 27 18:43:47 ig1-resolver-02 Wed, 27 Aug 2014 18:43:47 | INFO     | 50337  | message       | Peer        10.3.4.1 ASN 39605   >> KEEPALIVE
Aug 27 18:43:47 ig1-resolver-02 Wed, 27 Aug 2014 18:43:47 | INFO     | 50337  | message       | Peer  158.58.177.193 ASN 39605   >> KEEPALIVE
Aug 27 18:43:53 ig1-resolver-02 Wed, 27 Aug 2014 18:43:53 | INFO     | 50337  | message       | Peer        10.3.4.1 ASN 39605   << KEEPALIVE
Aug 27 18:43:53 ig1-resolver-02 Wed, 27 Aug 2014 18:43:53 | INFO     | 50337  | message       | Peer  158.58.177.193 ASN 39605   << KEEPALIVE
Aug 27 18:45:59 ig1-resolver-02 Wed, 27 Aug 2014 18:45:59 | INFO     | 50337  | message       | Peer        10.3.4.1 ASN 39605   >> NOTIFICATION (4,0,"")
Aug 27 18:45:59 ig1-resolver-02 Wed, 27 Aug 2014 18:45:59 | INFO     | 50337  | message       | Peer  158.58.177.193 ASN 39605   >> NOTIFICATION (4,0,"")
Aug 27 18:45:59 ig1-resolver-02 Wed, 27 Aug 2014 18:45:59 | INFO     | 50337  | network       | Peer        10.3.4.1 ASN 39605   out loop, peer reset, message [notification sent (4,0)] error[Hold timer expired / Unspecific]
Aug 27 18:45:59 ig1-resolver-02 Wed, 27 Aug 2014 18:45:59 | INFO     | 50337  | network       | Peer  158.58.177.193 ASN 39605   out loop, peer reset, message [notification sent (4,0)] error[Hold timer expired / Unspecific]
Aug 27 18:46:09 ig1-resolver-02 Wed, 27 Aug 2014 18:46:09 | INFO     | 50337  | message       | Peer        10.3.4.1 ASN 39605   >> OPEN version=4 asn=65000 hold_time=180 router_id=10.3.4.53 capabilities=[Multiprotocol(ipv4 unicast,ipv4 multicast,ipv4 nlri-mpls,ipv4 mpls-vpn,ipv4 flow,ipv4 flow-vpn,ipv6 unicast,ipv6 mpls-vpn,ipv6 flow,ipv6 flow-vpn,l2vpn vpls), ASN4(65000)]
Aug 27 18:46:09 ig1-resolver-02 Wed, 27 Aug 2014 18:46:09 | INFO     | 50337  | message       | Peer  158.58.177.193 ASN 39605   >> OPEN version=4 asn=65000 hold_time=180 router_id=158.58.177.194 capabilities=[Multiprotocol(ipv4 unicast,ipv4 multicast,ipv4 nlri-mpls,ipv4 mpls-vpn,ipv4 flow,ipv4 flow-vpn,ipv6 unicast,ipv6 mpls-vpn,ipv6 flow,ipv6 flow-vpn,l2vpn vpls), ASN4(65000)]
Aug 27 18:46:09 ig1-resolver-02 Wed, 27 Aug 2014 18:46:09 | INFO     | 50337  | message       | Peer        10.3.4.1 ASN 39605   << OPEN
Aug 27 18:46:09 ig1-resolver-02 Wed, 27 Aug 2014 18:46:09 | INFO     | 50337  | message       | Peer        10.3.4.1 ASN 39605   << OPEN version=4 asn=39605 hold_time=90 router_id=10.0.1.1 capabilities=[Multiprotocol(ipv4 unicast), Route Refresh, Graceful Restart Flags 0x0 Time 120 , ASN4(39605), Route Refresh]
Aug 27 18:46:09 ig1-resolver-02 Wed, 27 Aug 2014 18:46:09 | INFO     | 50337  | message       | Peer        10.3.4.1 ASN 39605   >> KEEPALIVE (OPENCONFIRM)
Aug 27 18:46:09 ig1-resolver-02 Wed, 27 Aug 2014 18:46:09 | INFO     | 50337  | message       | Peer        10.3.4.1 ASN 39605   << KEEPALIVE
Aug 27 18:46:09 ig1-resolver-02 Wed, 27 Aug 2014 18:46:09 | INFO     | 50337  | network       | Connected to peer neighbor 10.3.4.1 local-ip 10.3.4.53 local-as 65000 peer-as 39605 router-id 10.3.4.53 family-allowed in-open (out)
Aug 27 18:46:09 ig1-resolver-02 Wed, 27 Aug 2014 18:46:09 | INFO     | 50337  | message       | Peer        10.3.4.1 ASN 39605   >> 4 UPDATE(s)
Aug 27 18:46:09 ig1-resolver-02 Wed, 27 Aug 2014 18:46:09 | INFO     | 50337  | message       | Peer  158.58.177.193 ASN 39605   << OPEN
Aug 27 18:46:09 ig1-resolver-02 Wed, 27 Aug 2014 18:46:09 | INFO     | 50337  | message       | Peer  158.58.177.193 ASN 39605   << OPEN version=4 asn=39605 hold_time=90 router_id=158.58.177.1 capabilities=[Multiprotocol(ipv4 unicast), Route Refresh, Graceful Restart Flags 0x0 Time 120 , ASN4(39605), Route Refresh]
Aug 27 18:46:09 ig1-resolver-02 Wed, 27 Aug 2014 18:46:09 | INFO     | 50337  | message       | Peer  158.58.177.193 ASN 39605   >> KEEPALIVE (OPENCONFIRM)
Aug 27 18:46:09 ig1-resolver-02 Wed, 27 Aug 2014 18:46:09 | INFO     | 50337  | message       | Peer  158.58.177.193 ASN 39605   << KEEPALIVE
Aug 27 18:46:09 ig1-resolver-02 Wed, 27 Aug 2014 18:46:09 | INFO     | 50337  | network       | Connected to peer neighbor 158.58.177.193 local-ip 158.58.177.194 local-as 65000 peer-as 39605 router-id 158.58.177.194 family-allowed in-open (out)
Aug 27 18:46:09 ig1-resolver-02 Wed, 27 Aug 2014 18:46:09 | INFO     | 50337  | message       | Peer  158.58.177.193 ASN 39605   >> 4 UPDATE(s)
Aug 27 18:46:09 ig1-resolver-02 Wed, 27 Aug 2014 18:46:09 | INFO     | 50337  | message       | Peer        10.3.4.1 ASN 39605   << KEEPALIVE
Aug 27 18:46:09 ig1-resolver-02 Wed, 27 Aug 2014 18:46:09 | INFO     | 50337  | message       | peer 10.3.4.1 ASN 39605   >> EOR(s)
Aug 27 18:46:09 ig1-resolver-02 Wed, 27 Aug 2014 18:46:09 | INFO     | 50337  | message       | Peer  158.58.177.193 ASN 39605   << KEEPALIVE
Aug 27 18:46:09 ig1-resolver-02 Wed, 27 Aug 2014 18:46:09 | INFO     | 50337  | message       | peer 158.58.177.193 ASN 39605   >> EOR(s)
Aug 27 18:46:36 ig1-resolver-02 Wed, 27 Aug 2014 18:46:36 | INFO     | 50337  | message       | Peer        10.3.4.1 ASN 39605   << KEEPALIVE
Aug 27 18:46:38 ig1-resolver-02 Wed, 27 Aug 2014 18:46:38 | INFO     | 50337  | message       | Peer  158.58.177.193 ASN 39605   << KEEPALIVE

and juniper side :

Aug 27 16:45:59  cr1.pa3.par rpd[1397]: bgp_read_v4_message:10656: NOTIFICATION received from 10.3.4.53 (External AS 65000): code 4 (Hold Timer Expired Error), socket buffer sndcc: 0 rcvcc: 0 TCP state: 4, snd_una: 2877949999 snd_nxt: 2877949999 snd_wnd: 14608 rcv_nxt: 2315758233 rcv_adv: 2315774596, hold timer out 90s, hold timer remain 1:13.314968s
Aug 27 16:45:59  cr1.pa3.par rpd[1397]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer 158.58.177.194 (External AS 65000) changed state from Established to Idle (event RecvNotify) (instance master)
Aug 27 16:45:59  cr1.pa3.par rpd[1397]: bgp_read_v4_message:10656: NOTIFICATION received from 158.58.177.194 (External AS 65000): code 4 (Hold Timer Expired Error), socket buffer sndcc: 0 rcvcc: 0 TCP state: 5, snd_una: 2445021333 snd_nxt: 2445021333 snd_wnd: 14608 rcv_nxt: 93474176 rcv_adv: 93490539, hold timer out 90s, hold timer remain 1:13.312421s
Aug 27 16:46:09  cr1.pa3.par rpd[1397]: bgp_parse_open_options: peer 10.3.4.53+37186 (proto): unsupported AF 2 SAFI 133
Aug 27 16:46:09  cr1.pa3.par rpd[1397]: bgp_parse_open_options: peer 10.3.4.53+37186 (proto): unsupported AF 2 SAFI 134
Aug 27 16:46:09  cr1.pa3.par rpd[1397]: bgp_process_caps: mismatch NLRI with 10.3.4.53 (External AS 65000): peer: <inet-unicast inet-multicast inet-vpn-unicast inet6-unicast l2vpn inet-labeled-unicast inet6-vpn-unicast inet-flow inet-vpn-flow>(25303) us: <inet-unicast>(1)
Aug 27 16:46:09  cr1.pa3.par rpd[1397]: bgp_parse_open_options: peer 158.58.177.194+34482 (proto): unsupported AF 2 SAFI 133
Aug 27 16:46:09  cr1.pa3.par rpd[1397]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer 158.58.177.194 (External AS 65000) changed state from OpenConfirm to Established (event RecvKeepAlive) (instance master)
Aug 27 16:46:09  cr1.pa3.par rpd[1397]: bgp_parse_open_options: peer 158.58.177.194+34482 (proto): unsupported AF 2 SAFI 134
Aug 27 16:46:09  cr1.pa3.par rpd[1397]: bgp_process_caps: mismatch NLRI with 158.58.177.194 (External AS 65000): peer: <inet-unicast inet-multicast inet-vpn-unicast inet6-unicast l2vpn inet-labeled-unicast inet6-vpn-unicast inet-flow inet-vpn-flow>(25303) us: <inet-unicast>(1)

Hope it helps.




Thomas Mangin

Aug 27, 2014, 1:16:12 PM
to exabgp...@googlegroups.com
Hi,

Thank you, it should help narrow down the search.

The OPEN messages say that the received hold_time is 90 seconds and the sent one is 180. The lower is 90, so we should see a KEEPALIVE every 30 seconds.
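
The arithmetic behind that statement can be sketched in a few lines of Python. This is only an illustration of the usual BGP convention (the lower of the two hold times is used, and keepalives go out at one third of it); `negotiate_timers` is a hypothetical helper name and is not taken from the ExaBGP source.

```python
def negotiate_timers(local_hold_time, peer_hold_time):
    """Return (hold_time, keepalive_interval) in seconds.

    Hypothetical helper: the session uses the lower of the two advertised
    hold times, and keepalives are conventionally sent every hold_time / 3.
    """
    hold_time = min(local_hold_time, peer_hold_time)
    keepalive_interval = hold_time // 3
    return hold_time, keepalive_interval


# Values from the OPEN messages in this thread: ExaBGP sent 180, Juniper sent 90.
hold_time, keepalive_interval = negotiate_timers(180, 90)
print(hold_time, keepalive_interval)
```

With the values from the logs this gives a 90 second hold time and a 30 second keepalive interval, which matches the "hold timer out 90s" reported on the Juniper side.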

> Aug 27 18:43:23 ig1-resolver-02 Wed, 27 Aug 2014 18:43:23 | INFO | 50337 | message | Peer 158.58.177.193 ASN 39605 << KEEPALIVE

received 43:23

> Aug 27 18:43:47 ig1-resolver-02 Wed, 27 Aug 2014 18:43:47 | INFO | 50337 | message | Peer 10.3.4.1 ASN 39605 >> KEEPALIVE
> Aug 27 18:43:47 ig1-resolver-02 Wed, 27 Aug 2014 18:43:47 | INFO | 50337 | message | Peer 158.58.177.193 ASN 39605 >> KEEPALIVE

sent 43:47

> Aug 27 18:43:53 ig1-resolver-02 Wed, 27 Aug 2014 18:43:53 | INFO | 50337 | message | Peer 10.3.4.1 ASN 39605 << KEEPALIVE
> Aug 27 18:43:53 ig1-resolver-02 Wed, 27 Aug 2014 18:43:53 | INFO | 50337 | message | Peer 158.58.177.193 ASN 39605 << KEEPALIVE

So we expected to receive a KEEPALIVE at 43:23 + 30 -> 43:53, and here it is!

> Aug 27 18:45:59 ig1-resolver-02 Wed, 27 Aug 2014 18:45:59 | INFO | 50337 | message | Peer 10.3.4.1 ASN 39605 >> NOTIFICATION (4,0,"")
> Aug 27 18:45:59 ig1-resolver-02 Wed, 27 Aug 2014 18:45:59 | INFO | 50337 | message | Peer 158.58.177.193 ASN 39605 >> NOTIFICATION (4,0,"")

Then nothing until this message.

43:53 + 30 -> 44:23
44:23 + 30 -> 44:53
44:53 + 30 -> 45:23

So why the hell does it take until 45:59 to send the notification that the KEEPALIVE is missing? Also, during that time we have NOT sent or received any keepalive of our own.
Could you please turn on the tracking of the keepalive timer? If it does not turn on, can you please enable it "by hand"?
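
To put a number on that gap, here is a small Python sketch using only the timestamps quoted from the log. The helper name `seconds_between` is made up for this example; the 90 second hold time is the negotiated value discussed earlier in this message.

```python
from datetime import datetime

HOLD_TIME = 90  # seconds, negotiated value from the OPEN exchange


def seconds_between(start, end, fmt="%H:%M:%S"):
    """Whole seconds between two same-day HH:MM:SS timestamps (hypothetical helper)."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


last_keepalive_received = "18:43:53"  # last "<< KEEPALIVE" in the log
notification_sent = "18:45:59"        # ">> NOTIFICATION (4,0)" in the log

gap = seconds_between(last_keepalive_received, notification_sent)
overshoot = gap - HOLD_TIME
print(gap, overshoot)
```

The log shows 126 seconds of silence, so the notification went out 36 seconds after the point where a 90 second hold timer would have expired (45:23), with no keepalive logged in either direction in between.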

> Aug 27 18:45:59 ig1-resolver-02 Wed, 27 Aug 2014 18:45:59 | INFO | 50337 | network | Peer 10.3.4.1 ASN 39605 out loop, peer reset, message [notification sent (4,0)] error[Hold timer expired / Unspecific]
> Aug 27 18:45:59 ig1-resolver-02 Wed, 27 Aug 2014 18:45:59 | INFO | 50337 | network | Peer 158.58.177.193 ASN 39605 out loop, peer reset, message [notification sent (4,0)] error[Hold timer expired / Unspecific]

This says that we are closing the session due to the hold timer.
The session is then successfully re-established.

> and juniper side :

Do you not have any earlier logs? I cannot see the keepalive exchange.

> Aug 27 16:45:59 cr1.pa3.par rpd[1397]: bgp_read_v4_message:10656: NOTIFICATION received from 10.3.4.53 (External AS 65000): code 4 (Hold Timer Expired Error), socket buffer sndcc: 0 rcvcc: 0 TCP state: 4, snd_una: 2877949999 snd_nxt: 2877949999 snd_wnd: 14608 rcv_nxt: 2315758233 rcv_adv: 2315774596, hold timer out 90s, hold timer remain 1:13.314968s
> Aug 27 16:45:59 cr1.pa3.par rpd[1397]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer 158.58.177.194 (External AS 65000) changed state from Established to Idle (event RecvNotify) (instance master)

> Aug 27 16:45:59 cr1.pa3.par rpd[1397]: bgp_read_v4_message:10656: NOTIFICATION received from 158.58.177.194 (External AS 65000): code 4 (Hold Timer Expired Error), socket buffer sndcc: 0 rcvcc: 0 TCP state: 5, snd_una: 2445021333 snd_nxt: 2445021333 snd_wnd: 14608 rcv_nxt: 93474176 rcv_adv: 93490539, hold timer out 90s, hold timer remain 1:13.312421s


ExaBGP asks for the session to be torn down.

> Aug 27 16:46:09 cr1.pa3.par rpd[1397]: bgp_parse_open_options: peer 10.3.4.53+37186 (proto): unsupported AF 2 SAFI 133
> Aug 27 16:46:09 cr1.pa3.par rpd[1397]: bgp_parse_open_options: peer 10.3.4.53+37186 (proto): unsupported AF 2 SAFI 134
> Aug 27 16:46:09 cr1.pa3.par rpd[1397]: bgp_process_caps: mismatch NLRI with 10.3.4.53 (External AS 65000): peer: <inet-unicast inet-multicast inet-vpn-unicast inet6-unicast l2vpn inet-labeled-unicast inet6-vpn-unicast inet-flow inet-vpn-flow>(25303) us: <inet-unicast>(1)

New session established. ExaBGP announces some families the Juniper does not have configured, which is all fine.

> Aug 27 16:46:09 cr1.pa3.par rpd[1397]: bgp_parse_open_options: peer 158.58.177.194+34482 (proto): unsupported AF 2 SAFI 133
> Aug 27 16:46:09 cr1.pa3.par rpd[1397]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer 158.58.177.194 (External AS 65000) changed state from OpenConfirm to Established (event RecvKeepAlive) (instance master)
> Aug 27 16:46:09 cr1.pa3.par rpd[1397]: bgp_parse_open_options: peer 158.58.177.194+34482 (proto): unsupported AF 2 SAFI 134
> Aug 27 16:46:09 cr1.pa3.par rpd[1397]: bgp_process_caps: mismatch NLRI with 158.58.177.194 (External AS 65000): peer: <inet-unicast inet-multicast inet-vpn-unicast inet6-unicast l2vpn inet-labeled-unicast inet6-vpn-unicast inet-flow inet-vpn-flow>(25303) us: <inet-unicast>(1)

Session established.

Thomas

Raphael Mazelier

Sep 2, 2014, 12:18:03 PM
to exabgp...@googlegroups.com
With the help of Thomas we were able to investigate and solve the problem.
The issue was on the virtual machine, which had a very unstable clock.

So be sure to check your NTP settings before complaining about ExaBGP :)




Thomas Mangin

Sep 2, 2014, 1:00:20 PM
to exabgp...@googlegroups.com
Hello,

ExaBGP has a second-based loop and any clock drift will affect it, but in most cases it will be harmless. Each second ExaBGP looks after every peer and its I/O events, then sleeps for the remainder of the second if nothing needs to be done. ExaBGP already bounds the sleeping time between 0 and 1 seconds, so the main reactor does not really suffer from the issue. However, when it comes to checking the validity of the keepalive, it is possible for the code to believe the hold timer expired. This will then cause ExaBGP to close the session and re-establish it. Should you have graceful restart set up, the routes sent should not flap; otherwise a short flap period will be noticed.

There is little which can be done to prevent this issue. ExaBGP could check on every loop whether the time difference (last recorded time + 1 second - current clock time) is absurd (over a few seconds), but then the best course of action is not obvious: should it correct the last saved time to be what is expected? And if so, how many times should that be allowed, and how often?
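
For illustration only, the per-loop check described above could look roughly like the Python sketch below. It is not ExaBGP code: `checked_elapsed`, `replay` and the 5 second threshold are assumptions made up for this example, and the sketch deliberately leaves the open questions (how many corrections to allow, and how often) unanswered.

```python
MAX_PLAUSIBLE_STEP = 5.0  # seconds; hypothetical threshold for an "absurd" step


def checked_elapsed(last_time, now, expected_step=1.0, max_step=MAX_PLAUSIBLE_STEP):
    """Return (elapsed_to_use, jumped) for one loop iteration.

    If the clock moved backwards or further than max_step, assume the clock
    jumped and fall back to the step the loop was expected to take.
    """
    elapsed = now - last_time
    if elapsed < 0 or elapsed > max_step:
        return expected_step, True
    return elapsed, False


def replay(readings):
    """Feed successive clock readings through the check; return (believed, jumps)."""
    believed = 0.0
    jumps = 0
    for previous, current in zip(readings, readings[1:]):
        elapsed, jumped = checked_elapsed(previous, current)
        believed += elapsed
        jumps += int(jumped)
    return believed, jumps


# Simulated clock readings, one per loop, with a 36 second jump in the middle.
readings = [100.0, 101.0, 102.0, 138.0, 139.0]
believed, jumps = replay(readings)
print(believed, jumps)
```

The weakness is the one already pointed out: from inside the process, a clock that jumped and a process that genuinely did not run for 36 seconds look the same, so a correction like this can also hide a real stall.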

I am not aware of any portable way to get the relative time in seconds between two system calls in Python. AFAIK, jiffies are not constant. This issue has been acknowledged by the Python community, which is currently considering PEP 418 to address it: http://legacy.python.org/dev/peps/pep-0418/ but I have not looked into it much.
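
For readers on Python 3.3 or later, PEP 418 is available as `time.monotonic()`, a clock that cannot go backwards and is not affected by system clock updates. The sketch below shows a hold timer built on it. `HoldTimer` is a hypothetical class written for this example, not part of ExaBGP, and the thread does not establish whether a monotonic clock would have behaved better on this particular virtual machine.

```python
import time


class HoldTimer:
    """Hypothetical hold timer measured with a monotonic clock."""

    def __init__(self, hold_time, clock=time.monotonic):
        self.hold_time = hold_time
        self._clock = clock          # injectable so the timer can be tested
        self._last_seen = clock()

    def message_received(self):
        """Call whenever a KEEPALIVE or UPDATE arrives from the peer."""
        self._last_seen = self._clock()

    def expired(self):
        """True once more than hold_time seconds passed without a message."""
        return self._clock() - self._last_seen > self.hold_time


# Usage with a fake clock so the behaviour can be checked without waiting.
fake_now = [0.0]
timer = HoldTimer(90, clock=lambda: fake_now[0])

fake_now[0] = 89.0
still_alive = not timer.expired()

fake_now[0] = 91.0
has_expired = timer.expired()

timer.message_received()
reset_ok = not timer.expired()
print(still_alive, has_expired, reset_ok)
```

Passing the clock in as a parameter is a design choice made for this sketch: it keeps the timer testable and makes it explicit which clock the expiry decision depends on.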

Thomas
