My new job sucked a little time away from me at the start and now I am getting back on board with some nice new blog posts.

(This hot little blog post is featured on the @packetpushers website right now: http://packetpushers.net/slow-down-its-in-the-details-a-story-of-bgp-peering-trauma )

Much like any exam (and, from what I hear, the CCIE in particular), it's all about what is in the details. After a few short discussions and quite a few more emails with our ISP and account managers, we wanted to establish a BGP peering amongst sites. This is a private BGP peering between our sites, leveraging the ISP's internal network. Nothing overly complex. Just a simple BGP peering. The long email chain, which all parties had, outlined both the ISP's assigned ASNs and our own.

In the outage window we attempted to peer and saw the error below.

BGP: 192.168.10.54 open failed: Connection refused by remote host, open active delayed 14931ms  (35000ms max, 60% jitter)

Initially, thinking it was an ACL blocking port 179 or something along those lines, we looked into it. After checking our end thoroughly and confirming our configs were fine, we looked back to the ISP. (I bet I am not the first to have to do this 😉)
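For the "is something blocking port 179?" question, a quick probe from a host on the same path can help narrow things down. The snippet below is only a rough sketch in Python: `probe_tcp` is a made-up helper name, and the reading of the result is a rule of thumb rather than a diagnosis. A refusal usually means something answered and rejected the connection, while a timeout more often points at a filter silently dropping packets. Keep in mind that a router will normally refuse BGP connections from any address that is not a configured neighbor, so a refusal seen from a workstation does not prove much on its own.

```python
import socket

BGP_PORT = 179  # well-known TCP port used by BGP


def probe_tcp(host: str, port: int = BGP_PORT, timeout: float = 3.0) -> str:
    """Try one TCP connection and classify the outcome.

    Returns 'open', 'refused', 'timeout' or 'unreachable'.
    Hypothetical helper for illustration only.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"          # three-way handshake completed
    except ConnectionRefusedError:
        return "refused"           # the far end (or a device in between) sent a reset
    except socket.timeout:
        return "timeout"           # no answer at all within the timeout
    except OSError:
        return "unreachable"       # e.g. no route to host, network down
```

Calling `probe_tcp("192.168.10.54")` would test the peer address from the error message above.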

It turned out that the ISP's engineer had configured our ASN on his router and hadn't read the ISP's own supplied config diagram. The moment he changed his ASN, our peering came up and we had connectivity.
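This kind of mix-up is easy to catch on paper before the outage window. Here is a minimal sketch of that cross-check in Python. The ASNs 65001 and 65002 are invented private-range values (the real ones from our email chain are not shown here), and `BgpSide` and `asn_mismatches` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BgpSide:
    name: str
    local_as: int   # AS number the router itself runs under
    remote_as: int  # AS number it expects its neighbor to use


def asn_mismatches(a: BgpSide, b: BgpSide) -> list[str]:
    """Return a description of every ASN disagreement between two peers."""
    problems = []
    if a.remote_as != b.local_as:
        problems.append(
            f"{a.name} expects AS {a.remote_as}, but {b.name} runs AS {b.local_as}"
        )
    if b.remote_as != a.local_as:
        problems.append(
            f"{b.name} expects AS {b.remote_as}, but {a.name} runs AS {a.local_as}"
        )
    return problems


# Assumed example values: we are AS 65001, the ISP should be AS 65002.
customer = BgpSide("customer", local_as=65001, remote_as=65002)
isp_wrong = BgpSide("isp", local_as=65001, remote_as=65001)  # our ASN used by mistake
isp_fixed = BgpSide("isp", local_as=65002, remote_as=65001)

print(asn_mismatches(customer, isp_wrong))
print(asn_mismatches(customer, isp_fixed))
```

With the mistaken values the function reports one disagreement; with the corrected values the list comes back empty.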

Other causes that can produce this issue are listed below:

  • The neighbor statement is incorrect.
  • No routes to the neighbor address exist, or the default route (0.0.0.0/0) is being used to reach the peer.
  • The update-source command is missing under BGP.
  • A typing error resulted in the wrong IP address in the neighbor statement or the wrong autonomous system number.
  • Unicast is broken due to one of the following reasons:
         – Wrong virtual circuit (VC) mapping in an Asynchronous Transfer Mode (ATM)
           or Frame Relay environment in a highly redundant network.
         – Access list is blocking the unicast or TCP packet.
         – Network Address Translation (NAT) is running on the router
           and is translating the unicast packet.
         – Layer 2 is down.
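
The "default route" item in the list above is the sneaky one, because pings to the peer can still work. A rough way to check it offline is a longest-prefix match against the prefixes in your routing table. The sketch below uses Python's standard `ipaddress` module. The function names and the sample prefixes are made up for illustration, and it only looks at prefixes, not at next hops or route sources.

```python
import ipaddress


def best_route(neighbor: str, prefixes: list[str]):
    """Longest-prefix match: return the most specific network covering neighbor."""
    addr = ipaddress.ip_address(neighbor)
    networks = [ipaddress.ip_network(p) for p in prefixes]
    covering = [n for n in networks if addr in n]
    return max(covering, key=lambda n: n.prefixlen, default=None)


def neighbor_route_status(neighbor: str, prefixes: list[str]) -> str:
    """Say whether the peer is reached via a specific route, the default, or nothing."""
    route = best_route(neighbor, prefixes)
    if route is None:
        return "no route"
    if route.prefixlen == 0:
        return "default route only"
    return f"specific route {route}"


# Hypothetical routing table contents for the peer in the error message.
print(neighbor_route_status("192.168.10.54", ["0.0.0.0/0", "10.0.0.0/8"]))
print(neighbor_route_status("192.168.10.54", ["0.0.0.0/0", "192.168.10.52/30"]))
```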

My learnings from this:

It goes to show the importance of reading all the information before running in with half a picture. I am of the strong belief that knowledge is power and allows you to make sound decisions. Slowing down and reading everyone's input in this case would have meant a much smoother migration and things working the first time.
