Creating ExternalIPs in OpenShift with BGP and MetalLB

Posted by Mark DeNeve on Tuesday, September 30, 2025

Introduction

In my previous article, Creating ExternalIPs in OpenShift with MetalLB, we used MetalLB and the MetalLB Operator to create LoadBalancer IP addresses using Layer 2 (L2) mode in an OpenShift cluster. This mode uses a gossip protocol to detect failed nodes, and ARP to announce which node is serving the IP address assigned by MetalLB. This allows an OpenShift cluster to create LoadBalancer IP addresses on the same subnet as the cluster nodes by using additional IPs from that subnet.

In addition to the Layer 2 mode, MetalLB supports another mode known as BGP Mode:

  • BGP Mode: In BGP mode, each node in your cluster establishes a BGP peering session with your network routers, and uses that peering session to advertise the IPs of external cluster services.

BGP Mode puts the responsibility of detecting a failed node on the network hardware, thus eliminating the need for the Gossip protocol and ARPs.

One other major difference between the Layer 2 and BGP configurations is that in L2 mode you must use IP addresses from the same subnet as your host nodes. With BGP it is the exact opposite: you need to assign IPs from a different subnet that is used exclusively by MetalLB. You will also need an upstream router that speaks BGP and can be configured to handle BGP peering. This blog post uses Mikrotik hardware as the upstream router. Commands and configurations for the Mikrotik router are shown both in the UI and on the command line.

Network Setup

Since this post will be very network-centric, we need to discuss a little about the network setup in the lab. In the lab we have the following network configuration:

  • HostSubnet: 172.16.25.0/24 - This is the network on which the Control Plane and Worker nodes exist. IPs are handed out via DHCP.
  • ClusterNetwork: 10.128.0.0/14 - The IP address blocks for pods.
  • MachineNetwork: 10.0.0.0/16 - The IP address blocks for machines.
  • ServiceNetwork: 172.30.0.0/16 - The IP address block for services.
  • ExternalIP: 172.16.130.0/24 - We are using MetalLB in “BGP mode”. The upstream router has been allocated 172.16.128.0/17 for all BGP traffic. We will take one /24 slice out of this much larger allocation.

NOTE: Caution must be taken to ensure that the HostSubnet, ClusterNetwork, MachineNetwork, ServiceNetwork, and ExternalIP ranges do not conflict with each other or other networks in your environment.
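The overlap check in the note above can be sketched with Python's standard ipaddress module. This is a minimal sketch, not part of the lab setup: the find_overlaps helper is a name made up for illustration, and the ranges are simply the ones listed above.

```python
import ipaddress
from itertools import combinations

# The ranges listed in the Network Setup section.
networks = {
    "HostSubnet": "172.16.25.0/24",
    "ClusterNetwork": "10.128.0.0/14",
    "MachineNetwork": "10.0.0.0/16",
    "ServiceNetwork": "172.30.0.0/16",
    "ExternalIP": "172.16.130.0/24",
}


def find_overlaps(named_cidrs):
    """Return every pair of names whose CIDR blocks overlap (hypothetical helper)."""
    parsed = {name: ipaddress.ip_network(cidr) for name, cidr in named_cidrs.items()}
    return [
        (a, b)
        for a, b in combinations(parsed, 2)
        if parsed[a].overlaps(parsed[b])
    ]


overlaps = find_overlaps(networks)

# The ExternalIP slice should sit inside the router's BGP allocation.
bgp_allocation = ipaddress.ip_network("172.16.128.0/17")
external_in_allocation = ipaddress.ip_network(networks["ExternalIP"]).subnet_of(bgp_allocation)

print("Overlapping ranges:", overlaps)
print("ExternalIP inside BGP allocation:", external_in_allocation)
```

Add any other networks from your environment to the dictionary before running it; an empty list of overlaps is the result you want.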

Configuring Mikrotik Router for BGP Peering

In my lab I use a mixture of Ubiquiti and Mikrotik hardware. Since this blog post focuses on using BGP to create our service IPs, I will give details on how I have configured my lab router to accept the BGP routes. You will need to configure your specific network equipment to work with MetalLB. As we will be using BGP for managing routes, we need to gather some data from our upstream Mikrotik router. This can be done via the Web UI (“Webfig”) or via the command line. The first thing we are going to do is set up a “BGP Template”. The BGP Template will be used in our next step when we create a set of “BGP Connections”.

Creating Mikrotik BGP Template

The BGP Template contains all BGP protocol-related configuration options. To work with MetalLB, we need to set the following options:

  • name: metallb
  • AS: 65003
  • AFI: ensure that IPv4, and IPv6 are checked
  • Router ID: Set this to the gateway on the Mikrotik that will be the “first hop” for your OpenShift cluster
  • Extra -> Use BFD: Set this to True, as this will speed up failover detection

These settings will look like this in the UI:

bgp template

If you are setting these settings on the command line it will look like this:

/routing bgp template
add afi=ip,ipv6 as=65003 disabled=no name=metallb router-id=172.16.25.1 routing-table=main use-bfd=yes

Note that you will need to update the command above using the IP address of your router if it is not 172.16.25.1.

Creating Mikrotik BGP Connections

With our metallb BGP template created, we can now configure the BGP connections. A BGP connection must be created for EACH NODE in your cluster that will be acting as a router for the IP address space 172.16.130.0/24 that we allocated earlier. The lab environment I am using for this post is a “Compact Cluster”, or three-node cluster. As part of our setup (and eventual MetalLB deployment) we will make all three nodes speakers, so we need to create a connection in the Mikrotik router for each node. In my lab the three nodes have the IPs 172.16.25.140, 172.16.25.141, and 172.16.25.142. We will create an individual connection for each of these nodes:

/routing bgp connection
add afi=ip,ipv6 as=65003 disabled=no local.address=172.16.25.1 .role=ebgp name=compact-master-0 remote.address=172.16.25.142 .as=65230 router-id=172.16.25.1 routing-table=main templates=metallb use-bfd=yes
add afi=ip,ipv6 as=65003 disabled=no local.address=172.16.25.1 .role=ebgp name=compact-master-1 remote.address=172.16.25.141 .as=65230 router-id=172.16.25.1 routing-table=main templates=metallb use-bfd=yes
add afi=ip,ipv6 as=65003 disabled=no local.address=172.16.25.1 .role=ebgp name=compact-master-2 remote.address=172.16.25.140 .as=65230 router-id=172.16.25.1 routing-table=main templates=metallb use-bfd=yes

In the Webfig UI this looks like this:

bgp connection configuration
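Since each node needs its own connection entry, the commands can be generated rather than typed by hand. The following is a small sketch that renders one add line per node using the same values as the commands above; the NODES mapping and the connection_command helper are made up for illustration, so adjust the names and addresses for your own cluster.

```python
# Values taken from the connection commands shown above.
ROUTER_IP = "172.16.25.1"
ROUTER_AS = 65003
CLUSTER_AS = 65230

# Node name -> node IP, as used in this lab (adjust for your own nodes).
NODES = {
    "compact-master-0": "172.16.25.142",
    "compact-master-1": "172.16.25.141",
    "compact-master-2": "172.16.25.140",
}


def connection_command(name, node_ip):
    """Render one RouterOS 'add' line for a single node (hypothetical helper)."""
    return (
        f"add afi=ip,ipv6 as={ROUTER_AS} disabled=no "
        f"local.address={ROUTER_IP} .role=ebgp name={name} "
        f"remote.address={node_ip} .as={CLUSTER_AS} "
        f"router-id={ROUTER_IP} routing-table=main templates=metallb use-bfd=yes"
    )


commands = [connection_command(name, ip) for name, ip in NODES.items()]

print("/routing bgp connection")
for command in commands:
    print(command)
```

The output can be pasted into the Mikrotik terminal. Review it before applying, as the script only does string templating and does not validate anything against the router.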

With our Mikrotik router configured for BGP, we can now move on to configuring our OpenShift cluster to take advantage of BGP routing with MetalLB.

Prerequisites

  • OpenShift Cluster 4.19 or later
  • Cluster Admin privileges on an OpenShift Cluster
  • oc command
  • iPerf client installed on a test machine

Operator Install

The MetalLB Operator is the easiest way to get MetalLB installed in your cluster. We will first create a namespace for it, then navigate to OperatorHub in the OpenShift UI to install the operator.

Start by creating a file called metallb-namespace.yaml with the following contents:

apiVersion: v1
kind: Namespace
metadata:
  name: metallb-system
  labels:
    # Label values must be strings, so "true" is quoted
    openshift.io/cluster-monitoring: "true"

Apply the namespace with oc apply -f metallb-namespace.yaml and then follow the instructions below to install the operator.

  1. Log in to the OpenShift Console
  2. Select Operators -> OperatorHub
  3. Search “MetalLB” operator
  4. Select MetalLB Operator
  5. Click Install
  6. For the installed namespace, choose “metallb-system”
  7. Accept all defaults, click Install

Once the Operator is installed, select “View Operator” and ensure that the Status shows Succeeded before proceeding.

Configure MetalLB

With the MetalLB Operator installed, we will create an instance of the MetalLB controller and then configure an address pool for it to use. The controller handles the deployment of the MetalLB components.

Create an instance of MetalLB

Start by creating a file called metallb-controller.yaml with the following contents:

apiVersion: metallb.io/v1beta1
kind: MetalLB
metadata:
  name: metallb
  namespace: metallb-system

With the yaml file created, we will apply the yaml and create our MetalLB instance:

$ oc login
$ oc project metallb-system
$ oc create -f metallb-controller.yaml

We now need to give the MetalLB controller a group of IPs to work with. For this, we will create an address pool.

Create an address pool

We will be creating an IPAddressPool that is advertised over BGP in this blog post. If you are looking for how to do this with a Layer 2 IP address pool, be sure to check my previous blog post, Creating ExternalIPs in OpenShift with MetalLB. As identified in the Network Setup section, we will be using IPs from the range 172.16.130.0/24. Create a file called 172-16-130-0-address-pool-bgp.yml with the following contents, making sure to update the IP address range for your needs.

---
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: 172-16-130-0-address-pool-bgp
  namespace: metallb-system
spec:
  addresses:
  - 172.16.130.0/24
  autoAssign: true
  avoidBuggyIPs: true

NOTE: You can add multiple IP address ranges by listing several entries under addresses, each either a CIDR block or a <IP start>-<IP end> range.
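A malformed entry in the addresses list is easy to miss, so it can be worth checking the entries before applying the pool. Below is a minimal sketch that accepts either a CIDR block or a start-end range and counts the addresses in it. The pool_size helper and the example range are made up for illustration; MetalLB performs its own validation, which this does not replace.

```python
import ipaddress


def pool_size(entry):
    """Return the number of addresses in a CIDR or 'start-end' entry (hypothetical helper).

    Raises ValueError if the entry cannot be parsed.
    """
    if "-" in entry:
        start_text, end_text = (part.strip() for part in entry.split("-", 1))
        start = ipaddress.ip_address(start_text)
        end = ipaddress.ip_address(end_text)
        if end < start:
            raise ValueError(f"range end precedes range start: {entry}")
        return int(end) - int(start) + 1
    return ipaddress.ip_network(entry.strip()).num_addresses


# The pool used in this post, plus an illustrative range in the same subnet.
entries = ["172.16.130.0/24", "172.16.130.10-172.16.130.19"]
for entry in entries:
    print(entry, "->", pool_size(entry), "addresses")
```

An entry with a missing octet, for example, raises a ValueError here instead of being discovered later when the pool is applied.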

We can now apply this address pool to our cluster:

$ oc create -f 172-16-130-0-address-pool-bgp.yml

We can validate the configuration using the oc describe command:

$ oc describe ipaddresspool -n metallb-system
Name:         172-16-130-0-address-pool-bgp
Namespace:    metallb-system
API Version:  metallb.io/v1beta1
Kind:         IPAddressPool
Spec:
  Addresses:
    172.16.130.0/24
  Auto Assign:  true
Events:         <none>

SUCCESS, we have now given MetalLB an address pool to work with. We now need to complete the setup by configuring the OpenShift side of the BGP connection with our northbound router.

Create BFD Profile

We will start by creating a BFD profile, which defines our settings for BFD (Bidirectional Forwarding Detection). BFD is used to detect faults between routers and switches, and MetalLB can take advantage of it to detect failed nodes more quickly. We will now add a BFD profile to our cluster that will assist the Mikrotik router in detecting failed speaker nodes. Create a file called mikrotik-bfd-profile.yml with the following contents:

apiVersion: metallb.io/v1beta1
kind: BFDProfile
metadata:
  name: mikrotik-bfd-profile
  namespace: metallb-system
spec:
  receiveInterval: 380
  transmitInterval: 270

NOTE: Additional BFDProfile settings can be found in the MetalLB API reference Docs.

We can now apply this BFDProfile to our cluster:

$ oc create -f mikrotik-bfd-profile.yml
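To get a feel for what these intervals mean for failover, the BFD detection time can be estimated. In asynchronous mode (RFC 5880, section 6.8.4), the detection time on one side is the other side's detect multiplier times the larger of the local receive interval and the remote transmit interval. The sketch below applies that formula to the profile above. Note the assumptions: the cluster detect multiplier of 3 is what I believe the MetalLB default to be, and the router values are purely hypothetical placeholders, so check both against your own equipment.

```python
def detection_time_ms(remote_detect_multiplier, local_required_min_rx_ms, remote_desired_min_tx_ms):
    """Asynchronous-mode BFD detection time at the local system (RFC 5880, section 6.8.4)."""
    agreed_interval_ms = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    return remote_detect_multiplier * agreed_interval_ms


# Values from the BFDProfile above.
CLUSTER_RX_MS = 380
CLUSTER_TX_MS = 270
CLUSTER_DETECT_MULT = 3  # assumption: not set in the profile, believed to default to 3

# Hypothetical router-side values, for illustration only.
ROUTER_RX_MS = 200
ROUTER_TX_MS = 200
ROUTER_DETECT_MULT = 5

# How long the router waits before declaring a node down.
router_detects_node_ms = detection_time_ms(CLUSTER_DETECT_MULT, ROUTER_RX_MS, CLUSTER_TX_MS)

# How long a node waits before declaring the router down.
node_detects_router_ms = detection_time_ms(ROUTER_DETECT_MULT, CLUSTER_RX_MS, ROUTER_TX_MS)

print("Router detects failed node after (ms):", router_detects_node_ms)
print("Node detects failed router after (ms):", node_detects_router_ms)
```

With these example numbers, either side notices a failure in roughly one to two seconds, compared with the three minute BGP hold time we would otherwise depend on.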

Configure BGP Peering

Create a file called mikrotik-peer-config.yml with the following contents, making sure to update the myASN, peerASN and peerAddress with the information that we created in the section Configuring Mikrotik Router for BGP Peering.

apiVersion: metallb.io/v1beta2
kind: BGPPeer
metadata:
  name: mikrotik-peer-config
  namespace: metallb-system
spec:
  bfdProfile: mikrotik-bfd-profile
  myASN: 65230
  peerASN: 65003
  peerAddress: 172.16.25.1

We can now apply this BGPPeer to our cluster:

$ oc create -f mikrotik-peer-config.yml
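The AS numbers have to agree on both ends: the router's remote AS must equal the cluster's myASN, and the router's own AS must equal peerASN. A mismatch will keep the session from establishing. The sketch below compares the two sides and also checks that the numbers fall in the private-use ranges defined by RFC 6996. The peering_problems helper is made up for illustration, and the router values are the ones from the Mikrotik connection commands earlier in this post.

```python
# Private-use AS number ranges from RFC 6996.
PRIVATE_16_BIT = range(64512, 65535)            # 64512 to 65534 inclusive
PRIVATE_32_BIT = range(4200000000, 4294967295)  # 4200000000 to 4294967294 inclusive


def is_private_asn(asn):
    return asn in PRIVATE_16_BIT or asn in PRIVATE_32_BIT


def peering_problems(router_local_as, router_remote_as, my_asn, peer_asn):
    """Compare router-side and BGPPeer-side AS numbers (hypothetical helper)."""
    problems = []
    if my_asn != router_remote_as:
        problems.append(f"myASN {my_asn} does not match the router's remote AS {router_remote_as}")
    if peer_asn != router_local_as:
        problems.append(f"peerASN {peer_asn} does not match the router's local AS {router_local_as}")
    if my_asn == peer_asn:
        problems.append("myASN equals peerASN, which is iBGP rather than the eBGP role set on the router")
    for asn in (my_asn, peer_asn):
        if not is_private_asn(asn):
            problems.append(f"AS {asn} is outside the private-use ranges")
    return problems


# Router side: as=65003 with remote .as=65230, from the connection commands.
print(peering_problems(router_local_as=65003, router_remote_as=65230, my_asn=65230, peer_asn=65003))

# A mistyped myASN (hypothetical value) is reported as a mismatch.
print(peering_problems(router_local_as=65003, router_remote_as=65230, my_asn=65231, peer_asn=65003))
```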

Configure BGP Advertisement

Finally, we need to tell MetalLB which address pools to advertise to which peers, and which nodes will communicate with our upstream router. This is accomplished by creating a BGPAdvertisement. We can use nodeSelectors to limit the BGP connections to certain nodes, or we can leave the nodeSelectors section off, and all nodes will be configured to communicate with the upstream router.

apiVersion: metallb.io/v1beta1
kind: BGPAdvertisement
metadata:
  name: bgpadvertisement-1
  namespace: metallb-system
spec:
  ipAddressPools:
    - 172-16-130-0-address-pool-bgp
  peers:
    - mikrotik-peer-config
  aggregationLength: 32
  aggregationLengthV6: 128
  localPref: 100
  nodeSelectors:
    - matchLabels:
        kubernetes.io/hostname: NodeA
    - matchLabels:
        kubernetes.io/hostname: NodeB

NOTE: If you have configured ALL nodes in your cluster as BGP connections, you do not need to specify nodeSelectors. HOWEVER, if you have only configured a subset of your nodes to connect to your northbound router, you must be sure to specify those nodes here.
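The aggregationLength setting controls the prefix length of the routes that are announced. With a value of 32, each service IP is announced as its own host route. The sketch below shows the effect of different lengths on a single service IP using Python's ipaddress module; the advertised_prefix helper is a name made up for illustration and only mimics the prefix arithmetic, not MetalLB itself.

```python
import ipaddress


def advertised_prefix(service_ip, aggregation_length):
    """Return the prefix that covers service_ip at the given length (hypothetical helper)."""
    # strict=False masks off the host bits instead of raising an error
    return ipaddress.ip_network(f"{service_ip}/{aggregation_length}", strict=False)


print(advertised_prefix("172.16.130.2", 32))
print(advertised_prefix("172.16.130.2", 24))
```

A shorter aggregation length reduces the number of routes the router has to carry, at the cost of announcing addresses that may not have a service behind them.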

With the BGPAdvertisement configured, we can go back into our Mikrotik router and see if the connections are established.

[admin@MikroTik] > routing/bgp/session/print 
Flags: E - established 
 0 E name="compact-master-2-1" 
     remote.address=172.16.25.140 .as=65230 .id=172.16.25.140 .capabilities=mp,rr,em,gr,as4,ap,err,llgr,fqdn .afi=ip,ipv6 .messages=167 .bytes=3450 .gr-time=120 .eor=ip 
     local.address=172.16.25.1 .as=65003 .id=172.16.25.1 .cluster-id=172.16.25.1 .capabilities=mp,rr,gr,as4 .afi=ip,ipv6 .messages=167 .bytes=3504 .eor="" 
     output.procid=20 
     input.procid=20 ebgp 
     hold-time=3m keepalive-time=1m use-bfd=yes uptime=2h40m18s530ms last-started=2025-09-23 13:06:53 last-stopped=2025-09-23 13:06:52 prefix-count=2 

 1 E name="compact-master-1-1" 
     remote.address=172.16.25.141 .as=65230 .id=172.16.25.141 .capabilities=mp,rr,em,gr,as4,ap,err,llgr,fqdn .afi=ip,ipv6 .messages=152 .bytes=2944 .gr-time=120 .eor=ip 
     local.address=172.16.25.1 .as=65003 .id=172.16.25.1 .cluster-id=172.16.25.1 .capabilities=mp,rr,gr,as4 .afi=ip,ipv6 .messages=151 .bytes=2908 .eor="" 
     output.procid=22 
     input.procid=22 ebgp 
     hold-time=3m keepalive-time=1m use-bfd=yes uptime=2h29m7s930ms last-started=2025-09-23 13:18:03 last-stopped=2025-09-23 13:18:02 prefix-count=0 

 2 E name="compact-master-0-1" 
     remote.address=172.16.25.142 .as=65230 .id=172.16.25.142 .capabilities=mp,rr,em,gr,as4,ap,err,llgr,fqdn .afi=ip,ipv6 .messages=179 .bytes=3678 .gr-time=120 .eor=ip 
     local.address=172.16.25.1 .as=65003 .id=172.16.25.1 .cluster-id=172.16.25.1 .capabilities=mp,rr,gr,as4 .afi=ip,ipv6 .messages=184 .bytes=4067 .eor="" 
     output.procid=21 
     input.procid=21 ebgp 
     hold-time=3m keepalive-time=1m use-bfd=yes uptime=2h52m37s740ms last-started=2025-09-23 12:54:33 last-stopped=2025-09-23 12:54:32 prefix-count=2 

Or we can view it in the Webfig UI:

bgp connection configuration

Note that in both the examples above, you can see that a session has been established for each node in our cluster, and each node that we configured in our Mikrotik router.

Deploy a test Application

We will deploy a simple application and then create a new service of type: LoadBalancer to test it out. Typically we would deploy something like a simple HTTP web application, but these types of applications can be hosted through the OpenShift Router, so let’s host a simple application that does not work through the traditional OpenShift routes. We will use iPerf to test our new MetalLB configuration. You will need to have a local copy of the iPerf client to run this test.

We will use the networkstatic/iperf3 container image in our deployment below. If you would like to use a different container image, swap out the image name. Create a file called iperf-deployment.yml with the following contents:

kind: Deployment
apiVersion: apps/v1
metadata:
  name: iperf
  namespace: iperf
  labels:
    app: iperf
spec:
  replicas: 1
  selector:
    matchLabels:
      app: iperf
  template:
    metadata:
      labels:
        app: iperf
    spec:
      containers:
        - name: iperf
          command:
            - iperf3
          image: 'networkstatic/iperf3'
          ports:
            - name: tcp
              containerPort: 5201
              protocol: TCP
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
          args:
            - '-s'
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      securityContext: {}
      schedulerName: default-scheduler

We can now deploy our test iPerf application into our cluster. We will create a new project called “iperf” and deploy our iPerf pod into this project:

$ oc new-project iperf
$ oc create -f iperf-deployment.yml

Create a LoadBalancer SVC

Let’s look at the routing table on our router before we create the new LoadBalancer Service:

[admin@MikroTik] > /ip/route/print 
Flags: D - DYNAMIC; A - ACTIVE; c - CONNECT, s - STATIC, b - BGP
Columns: DST-ADDRESS, GATEWAY, DISTANCE
#     DST-ADDRESS      GATEWAY        DISTANCE
0  As 0.0.0.0/0        172.16.2.1            1
  DAc 172.16.2.0/24    1Gb-ether1            0
  DAc 172.16.10.0/24   VLAN10                0
1  As 172.16.11.0/24   172.16.10.11          1
  DAc 172.16.15.0/24   VLAN15                0
  DAc 172.16.20.0/24   VLAN20                0
  DAc 172.16.25.0/24   VLAN25                0
  DAc 172.16.35.0/24   VLAN35                0
2  As 192.168.5.0/24   172.16.2.1            1

You can see there are no entries for our target subnet of 172.16.130.0/24. Now let’s create a LoadBalancer Service leveraging MetalLB.

It is time to put our new MetalLB LoadBalancer to use. To do this, we need to create a new service of type “LoadBalancer” and point it to our iPerf pod. Create a file called iperf-svc.yml and put the following configuration in that file.

apiVersion: v1
kind: Service
metadata:
  name: iperf-lb
  namespace: iperf
spec:
  selector:
    app: iperf
  ports:
    - port: 5201
      targetPort: 5201
      protocol: TCP
      name: tcp
    - port: 5201
      targetPort: 5201
      protocol: UDP
      name: udp
  type: LoadBalancer

With our file created, we can now apply this service to our cluster.

$ oc create -f iperf-svc.yml
service/iperf-lb created

With the service created, we need to see what EXTERNAL-IP address was assigned to the service.

$ oc get svc
NAME          TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)                         AGE
iperf-lb      LoadBalancer   172.30.211.85   172.16.130.2   5201:32540/TCP,5201:32540/UDP   16m

Now let’s check the router and see what is in its routing table:

[admin@MikroTik] > /ip/route/print 
Flags: D - DYNAMIC; A - ACTIVE; c - CONNECT, s - STATIC, b - BGP
Columns: DST-ADDRESS, GATEWAY, DISTANCE
#     DST-ADDRESS      GATEWAY        DISTANCE
0  As 0.0.0.0/0        172.16.2.1            1
  DAc 172.16.2.0/24    1Gb-ether1            0
  DAc 172.16.10.0/24   VLAN10                0
1  As 172.16.11.0/24   172.16.10.11          1
  DAc 172.16.15.0/24   VLAN15                0
  DAc 172.16.20.0/24   VLAN20                0
  DAc 172.16.25.0/24   VLAN25                0
  DAc 172.16.35.0/24   VLAN35                0
  DAb 172.16.130.2/32  172.16.25.140        20
  D b 172.16.130.2/32  172.16.25.141        20
  D b 172.16.130.2/32  172.16.25.142        20

Note the last three lines above, which show three paths to “172.16.130.2”, one through each of the nodes in our cluster. Keep this in mind as you start to investigate the use of MetalLB in your cluster: every IP assigned by MetalLB is published as a route through every node configured as a speaker. The number of routes can grow very quickly, so be aware of this and plan for it as part of your overall network design.
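To put a number on that growth, here is a rough sketch of the arithmetic. It assumes, as in the routing table above, that every speaker announces every assigned IP as a /32. The helper names are made up for illustration, and the avoid_buggy_ips option simply skips addresses ending in .0 and .255.

```python
import ipaddress


def usable_pool_addresses(cidr, avoid_buggy_ips=True):
    """Count assignable addresses in a pool (hypothetical helper, meant for small IPv4 pools)."""
    addresses = list(ipaddress.ip_network(cidr))
    if avoid_buggy_ips:
        # Skip addresses whose last octet is 0 or 255
        addresses = [a for a in addresses if int(a) & 0xFF not in (0, 255)]
    return len(addresses)


def bgp_route_count(service_ips, speakers):
    """One route per service IP per speaker node."""
    return service_ips * speakers


SPEAKERS = 3  # the three nodes of the compact cluster

current_routes = bgp_route_count(1, SPEAKERS)
full_pool_routes = bgp_route_count(usable_pool_addresses("172.16.130.0/24"), SPEAKERS)

print("Routes with one service:", current_routes)
print("Routes with the whole pool in use:", full_pool_routes)
```

A single /24 pool on a three-node cluster can therefore add several hundred routes to the upstream router, and the count scales linearly with both the number of pools and the number of speakers.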

We can now run our iperf test using the command iperf3 -c 172.16.130.2.

$ iperf3 -c 172.16.130.2
Connecting to host 172.16.130.2, port 5201
[  5] local 192.168.6.22 port 49690 connected to 172.16.130.2 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  36.4 MBytes   305 Mbits/sec   28    898 KBytes       
[  5]   1.00-2.00   sec  34.9 MBytes   292 Mbits/sec    1    961 KBytes       
[  5]   2.00-3.00   sec  31.4 MBytes   263 Mbits/sec    0    985 KBytes       
[  5]   3.00-4.00   sec  31.4 MBytes   263 Mbits/sec    0   1007 KBytes       
[  5]   4.00-5.00   sec  37.4 MBytes   313 Mbits/sec   17   1.01 MBytes       
[  5]   5.00-6.00   sec  27.2 MBytes   229 Mbits/sec    4    777 KBytes       
[  5]   6.00-7.00   sec  29.4 MBytes   246 Mbits/sec    0    864 KBytes       
[  5]   7.00-8.00   sec  33.6 MBytes   283 Mbits/sec    0    931 KBytes       
[  5]   8.00-9.00   sec  30.1 MBytes   253 Mbits/sec    0    978 KBytes       
[  5]   9.00-10.00  sec  33.2 MBytes   279 Mbits/sec   23    737 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   325 MBytes   273 Mbits/sec   73            sender
[  5]   0.00-10.01  sec   322 MBytes   270 Mbits/sec                  receiver
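As a sanity check on the summary lines, the bitrate can be recomputed from the transfer size and the interval. This sketch assumes that iperf3 reports transfer sizes in units of 1024 x 1024 bytes and bitrates in units of 1,000,000 bits per second. The reported transfer is itself rounded, so small differences are expected.

```python
def mbits_per_sec(mbytes_transferred, seconds):
    """Convert an iperf3 transfer size (MBytes, 1024-based) and duration to Mbits/sec (1000-based)."""
    bits = mbytes_transferred * 1024 * 1024 * 8
    return bits / seconds / 1_000_000


sender_rate = mbits_per_sec(325, 10.00)
receiver_rate = mbits_per_sec(322, 10.01)

print(f"sender:   {sender_rate:.0f} Mbits/sec")
print(f"receiver: {receiver_rate:.0f} Mbits/sec")
```

Both values line up with the 273 and 270 Mbits/sec shown in the TCP summary above.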

We can also run this in UDP mode since we created both a TCP and a UDP service port when we defined our Kubernetes service:

$ iperf3 -c 172.16.130.2 --udp -b 0
Connecting to host 172.16.130.2, port 5201
[  5] local 192.168.6.22 port 50009 connected to 172.16.130.2 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec  64.9 MBytes   544 Mbits/sec  50465  
[  5]   1.00-2.00   sec  70.8 MBytes   594 Mbits/sec  55060  
[  5]   2.00-3.00   sec  67.0 MBytes   562 Mbits/sec  52137  
[  5]   3.00-4.00   sec  59.4 MBytes   499 Mbits/sec  46230  
[  5]   4.00-5.00   sec  65.8 MBytes   552 Mbits/sec  51196  
[  5]   5.00-6.00   sec  60.3 MBytes   506 Mbits/sec  46887  
[  5]   6.00-7.00   sec  68.9 MBytes   578 Mbits/sec  53607  
[  5]   7.00-8.00   sec  61.3 MBytes   514 Mbits/sec  47704  
[  5]   8.00-9.00   sec  62.4 MBytes   524 Mbits/sec  48535  
[  5]   9.00-10.00  sec  63.6 MBytes   533 Mbits/sec  49509  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec   644 MBytes   541 Mbits/sec  0.000 ms  0/501330 (0%)  sender
[  5]   0.00-10.04  sec   311 MBytes   260 Mbits/sec  0.108 ms  0/493424 (0%)  receiver

SUCCESS! We have connected to the iPerf service, and run a successful test using both the TCP and UDP protocols.

References

https://www.redhat.com/en/blog/metallb-in-bgp-mode