The BGP Protocol

The Border Gateway Protocol (BGP) is an exterior routing protocol used for exchanging routing information between autonomous systems. BGP is used for exchange of routing information between multiple transit autonomous systems as well as between transit and stub autonomous systems. BGP is related to EGP but operates with more capability, greater flexibility, and less required bandwidth. BGP uses path attributes to provide more information about each route, and in particular to maintain an AS path, which includes the AS number of each autonomous system the route has transited, providing information sufficient to prevent routing loops in an arbitrary topology. Path attributes may also be used to distinguish between groups of routes to determine administrative preferences, allowing greater flexibility in determining route preference to achieve a variety of administrative ends.

BGP supports two basic types of sessions between neighbours, internal (sometimes referred to as IBGP) and external. Internal sessions are run between routers in the same autonomous system, while external sessions run between routers in different autonomous systems. When sending routes to an external peer the local AS number is prepended to the AS path, hence routes received from an external peer are guaranteed to have the AS number of that peer at the start of the path. Routes received from an internal neighbour will not in general have the local AS number prepended to the AS path, and hence will in general have the same AS path that the route had when the originating internal neighbour received the route from an external peer. Routes with no AS numbers in the path may be legitimately received from internal neighbours; these indicate that the received route should be considered internal to your own AS.
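
The prepending rule can be illustrated with a minimal Python sketch. It is not taken from the GateD sources: the function name, the "external"/"internal" labels and the AS number 64512 are hypothetical, and the AS path is modelled as a plain list of AS numbers.

```python
def path_sent_to_peer(as_path, local_as, session):
    """Return the AS path carried in an advertisement to a peer.

    session is "external" or "internal" (hypothetical labels).
    """
    if session == "external":
        # The local AS number is prepended for external peers.
        return [local_as] + list(as_path)
    if session == "internal":
        # Internal peers receive the path unchanged.
        return list(as_path)
    raise ValueError("unknown session type: %r" % (session,))


# A route originated inside the local AS starts with an empty path.
to_internal = path_sent_to_peer([], 64512, "internal")
to_external = path_sent_to_peer([], 64512, "external")
```

Under these assumptions an internal neighbour sees an empty path for a locally originated route, while an external peer always sees the advertising AS first.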

The BGP implementation supports three versions of the BGP protocol, versions 2, 3 and 4. BGP versions 2 and 3 are quite similar in capability and function. They will only propagate classed network routes, and the AS path is a simple array of AS numbers. BGP 4 will propagate fully general address-and-mask routes, and the AS path has some structure to represent the results of aggregating dissimilar routes.
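
As an illustration of the difference, the following Python sketch tests whether a route could be carried by versions 2 and 3. It assumes that a "classed network route" is one whose mask equals the natural class A, B or C mask of its network address; the function names are hypothetical.

```python
import ipaddress


def natural_prefix_len(address):
    """Classful (natural) prefix length of an IPv4 address, or None for class D/E."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:
        return 8    # class A
    if first_octet < 192:
        return 16   # class B
    if first_octet < 224:
        return 24   # class C
    return None     # class D and E have no natural network mask


def is_classed_route(prefix):
    """True when the prefix length equals the natural mask of the network address."""
    network = ipaddress.IPv4Network(prefix)
    return network.prefixlen == natural_prefix_len(str(network.network_address))
```

Under this reading, 10.0.0.0/8 is a classed route, while 10.1.0.0/16 (a subnet) and 192.168.0.0/16 (a supernet) would need the address-and-mask form of version 4.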

External BGP sessions may or may not include a single metric, which BGP calls the Multi-Exit Discriminator (MED), in the path attributes. For BGP versions 2 and 3 this metric is a 16-bit unsigned integer; for BGP version 4 it is a 32-bit unsigned integer. In either case smaller values of the metric are to be preferred. Currently this metric is only used to break ties between routes with equal preference from the same neighbour AS. Internal BGP sessions carry at least one metric in the path attributes, which BGP calls the LocalPref. The size of this metric is identical to that of the MED. For BGP versions 2 and 3 this metric is considered better when its value is smaller; for version 4 it is better when it is larger. BGP version 4 sessions may optionally carry a second metric on internal sessions, this being an internal version of the Multi-Exit Discriminator. The use of these metrics is dependent on the type of internal protocol processing which is specified.
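
The comparison rules for the two metrics can be summarized in a short Python sketch. The function names are hypothetical and the code only restates the rules given above; it does not reflect how GateD stores or compares routes internally.

```python
SUPPORTED_VERSIONS = (2, 3, 4)


def metric_max(version):
    """Largest value the metric field can hold: 16 bits for v2/v3, 32 bits for v4."""
    if version not in SUPPORTED_VERSIONS:
        raise ValueError("unsupported BGP version: %r" % (version,))
    return 0xFFFF if version in (2, 3) else 0xFFFFFFFF


def med_is_better(a, b):
    """Smaller Multi-Exit Discriminator values are preferred in every version."""
    return a < b


def localpref_is_better(a, b, version):
    """Versions 2 and 3 prefer the smaller LocalPref; version 4 prefers the larger."""
    if version not in SUPPORTED_VERSIONS:
        raise ValueError("unsupported BGP version: %r" % (version,))
    return a < b if version in (2, 3) else a > b
```

Note that the same pair of LocalPref values is ranked in opposite order depending on the version in use on the session.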

BGP collapses routes with similar path attributes into a single update for advertisement. Routes that are received in a single update will be readvertised in a single update. The churn caused by the loss of a neighbor will be minimized and the initial advertisement sent during peer establishment will be maximally compressed. BGP does not read information from the kernel message-by-message, but fills the input buffer. It processes all complete messages in the buffer before reading again. BGP also does multiple reads to clear all incoming data queued on the socket. This feature may cause other protocols to be blocked for prolonged intervals by a busy peer connection.
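
The buffer handling described above can be sketched in Python as follows. This is an illustration, not GateD's code: it assumes the BGP-4 message header layout (a 16-byte marker, a 2-byte length in network byte order that counts the header itself, and a 1-byte type), and the function name is hypothetical.

```python
import struct

HEADER_LEN = 19  # 16-byte marker + 2-byte length + 1-byte type


def split_complete_messages(buffer):
    """Return (messages, leftover) for a bytes buffer holding BGP messages.

    Every complete message is extracted; a trailing partial message is
    returned as leftover so it can be completed by the next read.
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER_LEN:
        # The length field follows the 16-byte marker.
        (length,) = struct.unpack_from("!H", buffer, offset + 16)
        if length < HEADER_LEN:
            raise ValueError("bad BGP message length: %d" % length)
        if len(buffer) - offset < length:
            break  # incomplete message; wait for more data
        messages.append(buffer[offset:offset + length])
        offset += length
    return messages, buffer[offset:]
```

Processing every complete message before reading again keeps the number of read calls low, which is also why a busy peer can hold the attention of the process for a long time.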

All unreachable messages are collected into a single message and sent prior to reachable routes during a flash update. For these unreachable announcements, the next hop is set to the local address on the connection, no metric is sent and the path origin is set to incomplete. On external connections the AS path in unreachable announcements is set to the local AS; on internal connections the AS path is set to zero length.

The BGP implementation expects external peers to be directly attached to a shared subnet, and expects those peers to advertise next hops which are host addresses on that subnet (though this constraint can be relaxed by configuration for testing). For groups of internal peers, however, there are several alternatives, selected by specifying the group type. Type internal groups expect all peers to be directly attached to a shared subnet so that, like external peers, the next hops received in BGP advertisements may be used directly for forwarding. Type routing groups instead will determine the immediate next hops for routes by using the next hop received with a route from a peer as a forwarding address, and using this to look up an immediate next hop in an IGP's routes. Such groups support distant peers, but need to be informed of the IGP whose routes they are using to determine immediate next hops. Finally, type igp groups expect routes from the group peers to not be used for forwarding at all. Instead they expect that copies of the BGP routes received will also be received via an IGP, and that the BGP routes will only be used to determine the path attributes associated with the IGP routes. Such groups also support distant peers, and also need to be informed of the IGP they are running with.
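
The next hop resolution performed for type routing groups can be pictured with the Python sketch below. The text only says that the forwarding address is looked up in the IGP's routes; the use of a longest-prefix match, the dictionary standing in for the IGP routing table, and the function name are all assumptions made for illustration.

```python
import ipaddress


def resolve_immediate_next_hop(bgp_next_hop, igp_routes):
    """Find the immediate next hop for a BGP forwarding address.

    igp_routes maps prefix strings to immediate next hop strings
    (a hypothetical stand-in for the IGP's routing table).
    Returns None when no IGP route covers the forwarding address.
    """
    address = ipaddress.ip_address(bgp_next_hop)
    best = None
    for prefix, next_hop in igp_routes.items():
        network = ipaddress.ip_network(prefix)
        if address in network:
            # Keep the most specific covering route seen so far.
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, next_hop)
    return best[1] if best else None
```

A result of None corresponds to a BGP route whose forwarding address the IGP cannot reach, in which case no immediate next hop can be determined.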

For internal BGP group types (and for test groups), where possible a single outgoing message is built for all group peers based on the common policy. A copy of the message is sent to every peer in the group, with possible adjustments to the next hop field as appropriate to each peer. This minimizes the computational load of running large numbers of peers in these types of groups. BGP allows unconfigured peers to connect if an appropriate group has been configured with an allow clause.


The BGP Statement

    bgp yes | no | on | off
    [ {
        preference preference ;
        defaultmetric metric ;
        traceoptions trace_options ;
        group type ( external peeras autonomous_system )
            | ( internal peeras autonomous_system )
            | ( igp peeras autonomous_system proto proto )
            | ( routing peeras autonomous_system proto proto
                    interface interface_list )
            | ( test peeras autonomous_system )
        {
            allow {
                network
                network mask mask
                network masklen number
                all
                host host
            } ;
            peer host
                [ metricout metric ]
                [ localas autonomous_system ]
                [ nogendefault ]
                [ gateway gateway ]
                [ preference preference ]
                [ preference2 preference ]
                [ lcladdr local_address ]
                [ holdtime time ]
                [ version number ]
                [ passive ]
                [ sendbuffer number ]
                [ recvbuffer number ]
                [ indelay time ]
                [ outdelay time ]
                [ keep [ all | none ] ]
                [ analretentive ]
                [ noauthcheck ]
                [ noaggregatorid ]
                [ keepalivesalways ]
                [ v3asloopokay ]
                [ nov4asloop ]
                [ logupdown ]
                [ ttl ttl ]
                [ traceoptions trace_options ]
                ;
        } ;
    } ] ;


The bgp statement enables or disables BGP. By default BGP is disabled. By default no metric is sent when announcing routes via BGP.

preference preference
Sets the preference for routes learned from BGP. The default preference is 170. This preference may be overridden by a preference specified on the group or peer statements or by import policy.
defaultmetric metric
Defines the metric used when advertising routes via BGP. If not specified, no metric is propagated. This metric may be overridden by a metric specified on the neighbor or group statements or in export policy.
traceoptions trace_options
Specifies the tracing options for BGP. By default these are inherited from the global trace options. These values may be overridden on a group or neighbor basis. (See Trace Statements and the BGP specific tracing options below.)

Groups

BGP peers are grouped by type and the autonomous system of the peers. Any number of groups may be specified, but each must have a unique combination of type and peer autonomous system. There are five possible group types:
group type external peeras autonomous_system
In the classic external BGP group, full policy checking is applied to all incoming and outgoing advertisements. The external neighbors must be directly reachable through one of the machine's local interfaces. By default no metric is included in external advertisements, and the next hop is computed with respect to the shared interface.

group type internal peeras autonomous_system
An internal group operating where there is no IP-level IGP, for example an SMDS network or MILNET. All neighbors in this group are required to be directly reachable via a single interface. All next hop information is computed with respect to this interface. Import and export policy may be applied to group advertisements. Routes received from external BGP or EGP neighbors are by default readvertised with the received metric.

group type igp peeras autonomous_system proto proto
An internal group that runs in association with an interior protocol. The IGP group examines routes which the IGP is exporting and sends an advertisement only if the path attributes could not be entirely represented in the IGP tag mechanism. Only the AS path, path origin, and transitive optional attributes are sent with routes. No metric is sent, and the next hop is set to the local address used by the connection. Received internal BGP routes are not used or readvertised. Instead, the AS path information is attached to the corresponding IGP route and the latter is used for readvertisement. Since internal IGP peers are sent only a subset of the routes which the IGP is exporting, the export policy used is the IGP's. There is no need to implement the "don't readvertise routes from peers in the same group" constraint since the advertised routes are routes that the IGP already exports.

group type routing peeras autonomous_system proto proto interface interface_list
An internal group which uses the routes of an interior protocol to resolve forwarding addresses. A type routing group propagates external routes between routers which are not directly connected, and computes immediate next hops for these routes by using the BGP next hop which arrived with the route as a forwarding address to be resolved via an internal protocol's routing information. In essence, internal BGP is used to carry AS external routes, while the IGP is expected to only carry AS internal routes, and the latter is used to find immediate next hops for the former.

The proto names the interior protocol to be used to resolve BGP route next hops, and may be the name of any IGP in the configuration. By default the next hop in BGP routes advertised to type routing peers will be set to the local address on the BGP connection to those peers, as it is assumed a route to this address will be propagated via the IGP. The interface_list can optionally provide a list of interfaces whose routes are carried via the IGP and for which third party next hops may be used instead.

group type test peeras autonomous_system
An extension to external BGP which implements a fixed policy using test peers. Fixed policy and special case code make test peers relatively inexpensive to maintain. Test peers do not need to be on a directly attached network. If GateD and the peer are on the same (directly attached) subnet, the advertised next hop is computed with respect to that network, otherwise the next hop is the local machine's current next hop. All routing information advertised by and received from a test peer is discarded, and all BGP advertisable routes are sent back to the test peer. Metrics from EGP- and BGP-derived routes are forwarded in the advertisement, otherwise no metric is included.
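
The uniqueness rule stated at the start of this section (each group must have a unique combination of type and peer autonomous system) can be checked as in the Python sketch below. The function name and the tuple representation of a group are hypothetical; GateD's own parser is not shown here.

```python
from collections import Counter

# Group types as they appear in the group clause of the bgp statement.
GROUP_TYPES = {"external", "internal", "igp", "routing", "test"}


def duplicate_groups(groups):
    """Return the (type, peeras) pairs that are configured more than once.

    groups is an iterable of (group_type, peer_as) tuples.
    """
    groups = list(groups)
    for group_type, _peer_as in groups:
        if group_type not in GROUP_TYPES:
            raise ValueError("unknown group type: %r" % (group_type,))
    counts = Counter(groups)
    return sorted(pair for pair, count in counts.items() if count > 1)
```

Two groups may share a peer AS as long as their types differ, so an external and a test group for the same AS would not be reported.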

Group parameters

The BGP statement has group clauses and peer subclauses. Any number of peer subclauses may be specified within a group. A group clause usually defines default parameters for a group of peers; these parameters apply to all subsidiary peer subclauses. Any parameters from the peer subclause may be specified on the group clause to provide defaults for the whole group (which may be overridden for individual peers).
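
The inheritance of group parameters by peers can be modelled with a few lines of Python. The dictionaries and the function name are hypothetical; holdtime and passive are parameter names from the bgp statement, while the values are chosen only for the example.

```python
def effective_peer_parameters(group_params, peer_params):
    """Combine group defaults with the parameters given on one peer subclause."""
    merged = dict(group_params)   # start from the group defaults
    merged.update(peer_params)    # values given on the peer take precedence
    return merged


group_defaults = {"holdtime": 180, "passive": True}
peer_overrides = {"holdtime": 90}
effective = effective_peer_parameters(group_defaults, peer_overrides)
```

In this example the peer keeps the passive setting from its group but uses its own holdtime.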

Specifying peers

Within a group, BGP peers may be configured in one of two ways. They may be explicitly configured with a peer statement, or implicitly configured with the allow statement. Both are described here:
allow
The allow clause allows peer connections from any address in the specified range of network and mask pairs. All parameters for these peers must be configured on the group clause. The internal peer structures are created when an incoming open request is received and destroyed when the connection is broken. For more detail on specifying the network/mask pairs, see the section on Route Filtering.
peer host
A peer clause configures an individual peer. Each peer inherits all parameters specified on a group as defaults. Those defaults may be overridden by parameters explicitly specified on the peer subclause.

Within each group clause, individual peers can be specified or a group of potential peers can be specified using allow. Allow is used to specify a set of address masks. If GateD receives a BGP connection request from any address in the set specified, it will accept it and set up a peer relationship.
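
The acceptance test applied to an incoming connection can be sketched in Python as follows. The function name is hypothetical, and expressing the allow list as prefix strings is an assumption of the example: a network/masklen pair maps directly to a prefix, a host entry would correspond to a /32 prefix, and all to 0.0.0.0/0.

```python
import ipaddress


def peer_allowed(source_address, allowed_networks):
    """True when source_address falls inside any of the allowed networks."""
    address = ipaddress.ip_address(source_address)
    return any(address in ipaddress.ip_network(network)
               for network in allowed_networks)
```

A connection request from an address outside every listed network is not matched by the allow clause; it would have to match an explicitly configured peer instead.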

Peer parameters

The BGP peer subclause allows the following parameters, which can also be specified on the group clause. All are optional.

metricout metric
If specified, this metric is used as the primary metric on all routes sent to the specified peer(s). This metric overrides the default metric, a metric specified on the group and any metric specified by export policy.
localas autonomous_system
Identifies the autonomous system which GateD is representing to this group of peers. The default is that which has been set globally in the autonomoussystem statement.
nogendefault
Prevents GateD from generating a default route when BGP receives a valid update from its neighbor. The default route is only generated when the gendefault option is enabled.
gateway gateway
If a network is not shared with a peer, gateway specifies a router on an attached network to be used as the next hop router for routes received from this neighbor. This parameter is not needed in most cases.
preference preference
Specifies the preference used for routes learned from these peers. This can differ from the default BGP preference set in the bgp statement, so that GateD can prefer routes from one peer, or group of peers, over others. This preference may be explicitly overridden by import policy.
preference2 preference
In the case of a preference tie, the second preference, preference2, may be used to break the tie. The default value is 0.
lcladdr local_address
Specifies the address to be used on the local end of the TCP connection with the peer. For external peers the local address must be on an interface which is shared with the peer or with the peer's gateway when the gateway parameter is used. A session with an external peer will only be opened when an interface with the appropriate local address (through which the peer or gateway address is directly reachable) is operating. For other types of peers, a peer session will be maintained when any interface with the specified local address is operating. In either case incoming connections will only be recognized as matching a configured peer if they are addressed to the configured local address.
holdtime time
Specifies the BGP holdtime value to use when negotiating the connection with this peer, in seconds. According to BGP, if GateD does not receive a keepalive, update, or notification message within the period specified in the Hold Time field of the BGP Open message, then the BGP connection will be closed. The value must be either 0 (no keepalives will be sent) or at least 3.
version version
Specifies the version of the BGP protocol to use with this peer. If not specified, the highest supported version is used first and version negotiation is attempted. If it is specified, only the specified version will be offered during negotiation. Currently supported versions are 2, 3 and 4.
passive
Specifies that active OPENs to this peer should not be attempted. GateD should wait for the peer to issue an open. By default all explicitly configured peers are active; they periodically send OPEN messages until the peer responds.
sendbuffer buffer_size
recvbuffer buffer_size
Control the amount of send and receive buffering asked of the kernel. The maximum supported is 65535 bytes although many kernels have a lower limit. By default, GateD configures the maximum supported. These parameters are not needed on normally functioning systems.
indelay time
outdelay time
Used to dampen route fluctuations. Indelay is the amount of time a route learned from a BGP peer must be stable before it is accepted into the GateD routing database. Outdelay is the amount of time a route must be present in the GateD routing database before it is exported to BGP. The default value for each is 0, meaning that these features are disabled.
keep all
Used to retain routes learned from a peer even if the routes' AS paths contain one of our exported AS numbers.
analretentive
Causes GateD to issue warning messages when receiving questionable BGP updates such as duplicate routes and/or deletions of non-existing routes. Normally these events are silently ignored.
noauthcheck
Normally GateD verifies that incoming packets have an authentication field of all ones. This option may be used to allow communication with an implementation that uses some other form of authentication.
noaggregatorid
Causes GateD to specify the routerid in the aggregator attribute as zero (instead of its routerid) in order to prevent different routers in an AS from creating aggregate routes with different AS paths.
keepalivesalways
Causes GateD to always send keepalives, even when an update could have correctly substituted for one. This allows interoperability with routers that do not completely obey the protocol specifications on this point.
v3asloopokay
By default GateD will not advertise routes whose AS path is looped (i.e. with an AS appearing more than once in the path) to version 3 external peers. Setting this flag removes this constraint. Ignored when set on internal groups or peers.
nov4asloop
Prevents routes with looped AS paths from being advertised to version 4 external peers. This can be useful to avoid advertising such routes to peers which would incorrectly forward the routes on to version 3 neighbours.
logupdown
Causes a message to be logged via the syslog mechanism whenever a BGP peer enters or leaves the ESTABLISHED state.
ttl ttl
By default, GateD sets the IP TTL for local peers to one and the TTL for non-local peers to 255. This option is provided mainly for communicating with improperly functioning routers that ignore packets sent with a TTL of one. Not all kernels allow the TTL to be specified for TCP connections.
traceoptions trace_options
Specifies the tracing options for this BGP neighbor. By default these are inherited from group or BGP global trace options. (See Trace Statements and the BGP specific tracing options below.)
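
The interaction of the v3asloopokay and nov4asloop parameters described above can be summarized in a Python sketch. It is illustrative only: the function names are hypothetical, the AS path is treated as a flat list of AS numbers (ignoring the additional structure of version 4 paths), and the behaviour for version 2 peers is not described in this document, so the sketch declines to decide that case.

```python
def has_as_loop(as_path):
    """True when some AS number appears more than once in the path."""
    return len(set(as_path)) != len(as_path)


def may_advertise(as_path, peer_version, v3asloopokay=False, nov4asloop=False):
    """Decide whether a route may be advertised to an external peer."""
    if not has_as_loop(as_path):
        return True
    if peer_version == 3:
        # Looped paths are withheld from version 3 peers unless v3asloopokay is set.
        return v3asloopokay
    if peer_version == 4:
        # Looped paths go to version 4 peers unless nov4asloop is set.
        return not nov4asloop
    # Assumption: other versions are not covered by the text above.
    raise ValueError("loop handling not described for version %r" % (peer_version,))
```

Both flags apply to external peers only; on internal groups or peers v3asloopokay is documented as ignored.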

Tracing options

Note that the state option works with BGP, but does not provide true state transition information.

Packet tracing options (which may be modified with detail, send and recv):

packets
All BGP packets
open
BGP OPEN packets which are used to establish a peer relationship.
update
BGP UPDATE packets which are used to pass network reachability information.
keepalive
BGP KEEPALIVE packets which are used to verify peer reachability.

Last updated 1994/05/26 02:24:30.

gated@gated.cornell.edu