Hello. The main BGP session with our uplink has started dropping several times a day.
XX.XX.160.81 - main session, uplink address (XX.XX.160.82 is our IP address)
XX.XX.130.249 - backup, uplink address (XX.XX.130.250 is our IP address)
XX.XX.204.0/22 - our subnet (announced to the uplink)
XX480 - our AS
We receive only 0.0.0.0/0 from the uplink.
I enabled debug ip bgp all. The log shows:
Code:
*Dec 15 13:47:27: %7: XX.XX.160.81-Outgoing [FSM] Keepalive-Timer Expiry
*Dec 15 13:47:27: %7: XX.XX.160.81-Outgoing [FSM] State: Established Event: 11
*Dec 15 13:47:27: %7: XX.XX.160.81-Outgoing [ENCODE] Msg-Hdr: Type 4
*Dec 15 13:47:27: %7: XX.XX.160.81-Outgoing [ENCODE] Keepalive: 146 KAlive msg(s) sent
*Dec 15 13:47:31: %7: XX.XX.160.81-Outgoing [FSM] Hold-Timer Expiry
*Dec 15 13:47:31: %7: XX.XX.160.81-Outgoing [FSM] State: Established Event: 10
*Dec 15 13:47:31: %BGP-5-ADJCHANGE: Neighbor XX.XX.160.81 Down Hold Timer Expired.
*Dec 15 13:47:31: %7: XX.XX.160.81-Outgoing [ENCODE] Msg-Hdr: Type 3
*Dec 15 13:47:31: %BGP-3-NOTIFICATION: Sent to neighbor XX.XX.160.81 4/0 (Hold Timer Expired/Unspecified Error Subcode) 0 bytes.
*Dec 15 13:47:31: %7: XX.XX.160.81-Outgoing [FSM] State Change: Established(6)->Idle(1)
*Dec 15 13:47:31: %7: [NH-1:1] Delete nexthop[XX.XX.160.81/32] from cache[lock:1->0]
The full log for the 10 minutes around the drop (13:40 to 13:50) is in the file.
The uplink says the problem is on our router: it is not processing keepalive packets.
On my side I monitor CPU and memory load and see no problems; everything is normal, as before. This setup ran for 2 years without issues.
On their side the error looks like this:
Code:
bgp_read_v4_message:10151: NOTIFICATION received from XX.XX.160.82 (External AS XX480): code 4 (Hold Timer Expired Error), socket buffer sndcc: 57 rcvcc: 0 TCP state: 4, snd_una: 1774554643 snd_nxt: 1774554681 snd_wnd: 4096 rcv_nxt: 636004 rcv_adv: 652367, hold timer out 90s, hold timer remain 25.011062s
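The counters in the uplink's message can be read a bit more closely. Below is a rough Python sketch (not part of the original logs) that pulls the numbers out of that line and relates them to the 19-byte size of a BGP KEEPALIVE. The function names are made up for illustration, and the reading of `sndcc` as "bytes still held in the uplink's send buffer" is an assumption about their platform.

```python
import re

# Uplink-side log line from the post, joined back into one line.
LINE = (
    "NOTIFICATION received from XX.XX.160.82 (External AS XX480): "
    "code 4 (Hold Timer Expired Error), socket buffer sndcc: 57 rcvcc: 0 "
    "TCP state: 4, snd_una: 1774554643 snd_nxt: 1774554681 snd_wnd: 4096 "
    "rcv_nxt: 636004 rcv_adv: 652367, hold timer out 90s, "
    "hold timer remain 25.011062s"
)

KEEPALIVE_LEN = 19  # a BGP KEEPALIVE is only the 19-byte message header


def parse_counters(line):
    """Return every 'name: integer' pair found in the line as a dict."""
    return {name: int(value) for name, value in re.findall(r"(\w+): (\d+)", line)}


def summarize(line):
    """Derive a few figures from the counters (hypothetical helper)."""
    c = parse_counters(line)
    m = re.search(r"hold timer out (\d+)s, hold timer remain ([\d.]+)s", line)
    hold, remain = int(m.group(1)), float(m.group(2))
    unacked = c["snd_nxt"] - c["snd_una"]  # bytes sent but not yet ACKed
    return {
        "unacked_bytes": unacked,
        "unacked_keepalives": unacked / KEEPALIVE_LEN,
        # Assumption: sndcc is the byte count held in the send buffer.
        "buffered_keepalives": c["sndcc"] / KEEPALIVE_LEN,
        "seconds_since_uplink_heard_us": hold - remain,
    }


if __name__ == "__main__":
    for key, value in summarize(LINE).items():
        print(f"{key}: {value}")
```

Under those assumptions, the uplink had two keepalives' worth of data (38 bytes) sent but not acknowledged by TCP, three keepalives' worth (57 bytes) still buffered, and had last heard from our side roughly 65 seconds earlier. That would point at segments being lost or not acknowledged somewhere between the two routers rather than at the BGP process itself, but it is only one possible reading of these counters.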
sh ip bgp
Code:
[b]G#sh ip bgp[/b]
BGP table version is 13, local router ID is XX.XX.160.82
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
S Stale
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 0.0.0.0/0 XX.XX.160.81 0 200 0 XX55 i
* XX.XX.130.249 0 100 0 XX55 i
*> XX.XX.204.0/22 0.0.0.0 0 32768 i
Total number of prefixes 2
[b]G#sh ip bgp summary[/b]
BGP router identifier XX.XX.160.82, local AS number XX480
BGP table version is 13
3 BGP AS-PATH entries
1 BGP Community entries
2 BGP Prefix entries (Maximum-prefix:4294967295)
Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
XX.XX.130.249 4 XX55 3565 3982 8 0 0 1d04h49m 1
XX.XX.160.81 4 XX55 393 441 12 0 0 03:09:42 1
Total number of neighbors 2
[b]G#sh ip bgp neighbors[/b]
BGP neighbor is XX.XX.130.249, remote AS XX55, local AS XX480, external link
Description: UPLINK-Slave
BGP version 4, remote router ID XX.XX.212.43
BGP state = Established, up for 1d04h49m
Last read 1d04h49m, hold time is 90, keepalive interval is 30 seconds
Neighbor capabilities:
Route refresh: advertised and received (old and new)
Four-octets ASN Capability: advertised and received
Address family IPv4 Unicast: advertised and received
Graceful Restart Capability: received
Remote Restart timer is 120 seconds
Address families preserved by peer:
IPv4 Unicast (was preserved)
Received 3566 messages, 0 notifications, 0 in queue
open message:1 update message:1 keepalive message:3564
refresh message:0 dynamic cap:0 notifications:0
Sent 3982 messages, 0 notifications, 0 in queue
open message:1 update message:1 keepalive message:3980
refresh message:0 dynamic cap:0 notifications:0
Route refresh request: received 0, sent 0
Minimum time between advertisement runs is 30 seconds
For address family: IPv4 Unicast
BGP table version 13, neighbor version 8
Index 1, Offset 0, Mask 0x2
Inbound soft reconfiguration allowed
Community attribute sent to this neighbor (both)
Inbound path policy configured
Outbound path policy configured
Route map for incoming advertisements is *map-UPLINK-Slave-in
Route map for outgoing advertisements is *map-UPLINK-Slave-out
1 accepted prefixes
1 announced prefixes
Connections established 7; dropped 6
Local host: XX.XX.130.250, Local port: 1112
Foreign host: XX.XX.130.249, Foreign port: 179
Nexthop: XX.XX.130.250
Nexthop global: ::
Nexthop local: ::
BGP connection: non shared network
Last Reset: 1d04h49m, due to BGP Notification sent
Notification Error Message: (Hold Timer Expired/Unspecified Error Subcode)
BGP neighbor is XX.XX.160.81, remote AS XX55, local AS XX480, external link
Description: UPLINK-Main
BGP version 4, remote router ID XX.XX.212.43
BGP state = Established, up for 03:09:46
Last read 03:09:43, hold time is 90, keepalive interval is 30 seconds
Neighbor capabilities:
Route refresh: advertised and received (old and new)
Four-octets ASN Capability: advertised and received
Address family IPv4 Unicast: advertised and received
Graceful Restart Capability: received
Remote Restart timer is 120 seconds
Address families preserved by peer:
IPv4 Unicast (was preserved)
Received 393 messages, 0 notifications, 0 in queue
open message:1 update message:1 keepalive message:391
refresh message:0 dynamic cap:0 notifications:0
Sent 441 messages, 0 notifications, 0 in queue
open message:1 update message:1 keepalive message:439
refresh message:0 dynamic cap:0 notifications:0
Route refresh request: received 0, sent 0
Minimum time between advertisement runs is 30 seconds
For address family: IPv4 Unicast
BGP table version 13, neighbor version 12
Index 2, Offset 0, Mask 0x4
Inbound soft reconfiguration allowed
Community attribute sent to this neighbor (both)
Inbound path policy configured
Outbound path policy configured
Route map for incoming advertisements is *map-UPLINK-in
Route map for outgoing advertisements is *map-UPLINK-out
1 accepted prefixes
1 announced prefixes
Connections established 10; dropped 9
Local host: XX.XX.160.82, Local port: 179
Foreign host: XX.XX.160.81, Foreign port: 52155
Nexthop: XX.XX.160.82
Nexthop global: ::
Nexthop local: ::
BGP connection: non shared network
Last Reset: 03:09:46, due to BGP Notification sent
Notification Error Message: (Cease/Unspecified Error Subcode)
G#
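To compare the two sessions side by side, the reset counters can be pulled out of the `sh ip bgp neighbors` output above. This is a minimal Python sketch; the excerpt contains only lines copied from that output, and the function name `neighbor_resets` is hypothetical.

```python
import re

# Excerpt of the "sh ip bgp neighbors" output from the post
# (only the lines this sketch looks at).
OUTPUT = """\
BGP neighbor is XX.XX.130.249, remote AS XX55, local AS XX480, external link
Description: UPLINK-Slave
Connections established 7; dropped 6
Last Reset: 1d04h49m, due to BGP Notification sent
Notification Error Message: (Hold Timer Expired/Unspecified Error Subcode)
BGP neighbor is XX.XX.160.81, remote AS XX55, local AS XX480, external link
Description: UPLINK-Main
Connections established 10; dropped 9
Last Reset: 03:09:46, due to BGP Notification sent
Notification Error Message: (Cease/Unspecified Error Subcode)
"""

# Field name -> pattern with one capture group for the value.
FIELDS = {
    "description": re.compile(r"Description: (.+)"),
    "dropped": re.compile(r"Connections established \d+; dropped (\d+)"),
    "last_reset": re.compile(r"Last Reset: ([^,]+),"),
    "reason": re.compile(r"Notification Error Message: \((.+)\)"),
}


def neighbor_resets(text):
    """Return one dict per neighbor with its drop count and last reset reason."""
    results = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        start = re.match(r"BGP neighbor is (\S+),", line)
        if start:
            current = {"neighbor": start.group(1)}
            results.append(current)
            continue
        if current is None:
            continue  # skip anything before the first neighbor block
        for name, pattern in FIELDS.items():
            m = pattern.match(line)
            if m:
                value = m.group(1)
                current[name] = int(value) if name == "dropped" else value
    return results


if __name__ == "__main__":
    for n in neighbor_resets(OUTPUT):
        print(n["description"], n["neighbor"], n["dropped"], n["reason"])
```

On the output above this gives 6 drops for UPLINK-Slave (last reset: Hold Timer Expired) and 9 drops for UPLINK-Main (last reset: Cease), so the backup session may be affected too, not only the main one.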
G#sh version
Code:
System description : DGS-3610-26G Gigabit Ethernet Switch
System start time : 2013-02-13 7:8:10
System uptime : 292:17:6:43
System hardware version : A1.0
System software version : v10.4(3T59) Release(136633)
System BOOT version : 10.3 Release(70398)
System CTRL version : 10.3 Release(70398)
Device information:
Device-1
Hardware version : A1.0
Software version : v10.4(3T59) Release(136633)
BOOT version : 10.3 Release(70398)
CTRL version : 10.3 Release(70398)