This document details the supported process for enabling Jumbo MTU (Maximum Transmission Unit) on Azure Red Hat OpenShift (ARO) clusters. Enabling Jumbo MTU in ARO is strictly limited to intra-cluster network traffic, specifically covering pod-to-pod, pod-to-service, and node-to-node communication that utilizes the OVN overlay network. It is important to note that this configuration does not impact outbound or external network traffic, which continues to adhere to standard Azure networking MTU limits.
Overview
Changing the MTU may be beneficial for workloads that generate large volumes of east–west traffic within the cluster. Common use cases include high-throughput data processing pipelines, distributed databases, large-scale logging and monitoring systems, AI/ML training workloads, and storage-intensive applications. These are the workloads where reducing packet fragmentation can significantly improve throughput and CPU efficiency. Customers should consider enabling Jumbo MTU when network performance within the cluster is a known bottleneck or when running workloads that benefit from larger packet sizes.
Azure Red Hat OpenShift supports increasing the cluster network MTU and machine-level NIC MTU when the underlying Azure VM hardware uses the Microsoft Azure Network Adapter (MANA) driver. While Azure exposes a maximum NIC MTU of 9,000 bytes, the maximum configurable cluster network MTU is 8,900 bytes to account for 100 bytes of OVN overlay overhead.
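As a quick sanity check, the arithmetic behind these limits can be sketched as follows (illustrative Python, not part of the cluster procedure):

```python
# Illustrative arithmetic for the MTU limits described above (not part of
# the cluster procedure).
NIC_MAX_MTU = 9000          # maximum NIC MTU Azure exposes with MANA
OVN_OVERLAY_OVERHEAD = 100  # bytes reserved for OVN overlay encapsulation

cluster_network_mtu = NIC_MAX_MTU - OVN_OVERLAY_OVERHEAD
print(cluster_network_mtu)  # 8900 -- the maximum configurable cluster MTU
```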
Prerequisites
Jumbo MTU can only be enabled when every node in the cluster, including all control-plane and worker nodes, is running on Azure Virtual Machines (VMs) that support the Microsoft Azure Network Adapter (MANA) driver.
- An ARO cluster running OpenShift 4.19 or higher.
- NICs with Accelerated Networking enabled.
- All cluster nodes must use VM sizes that support the MANA driver, such as the Dv6 series (for example, D4as_v6, D8s_v6, D16s_v6).
Note
If the cluster has a mix of VMs where some support MANA and some do not, delay the MTU migration until all nodes support MANA. Otherwise, the cluster continues to use the default MTU of 1,500, which may lead to fragmentation or connectivity issues for traffic between nodes with different MTU capabilities.
Validate the cluster nodes have the MANA driver
Validate that the MANA driver is in use by running the following commands on all nodes in the cluster (control plane and worker):
Get the node names in your cluster using this command:

```
oc get nodes -o name
```

For each node, check which driver the `enP*` interface uses. As the interface name may vary, first get the interface name:

```
oc debug node/<NODE_NAME> -- chroot /host ip link | grep enP
```

You see an output like:

```
3: enP30xxxxx: <BROADCAST,MULTICAST,...,UP,LOWER_UP> mtu 1500 qdisc mq master eth0 state UP mode DEFAULT group default qlen 1000
    altname enP30xxxxxxx
```

Use the interface name returned (like `enP30xxxxx`) in the following command:

```
oc debug node/<NODE_NAME> -- /bin/sh -c 'chroot /host ethtool -i <INTERFACE_NAME> | grep driver'
```

You should see an output that shows that the MANA driver is in use.
Important
If the driver is mlx4, mlx5, hv_netvsc, or anything other than MANA, the node does not support changing the MTU to 9,000.
Change the Maximum Transmission Unit (MTU)
Before changing the MTU, ensure that the cluster and all operators are healthy. Also ensure that all machine config pools are in a stable, fully updated, and healthy state. Plan for rolling reboots, as MTU changes trigger multiple MachineConfigPool rollouts.
To begin the MTU migration, specify the migration configuration by entering the following command. The Machine Config Operator performs a rolling reboot of the nodes in the cluster in preparation for the MTU change.
```
oc patch Network.operator.openshift.io cluster --type=merge --patch \
  '{"spec": { "migration": { "mtu": { "network": { "from": 1400, "to": 8900 }, "machine": { "to": 9000 } } } } }'
```

Monitor the rollout status by running the following command:

```
oc get machineconfigpool
```

Wait for all MachineConfigPool groups (master and worker) to reach a stable state, indicated by the status values `UPDATED=true`, `UPDATING=false`, and `DEGRADED=false`. This takes some time to complete, and the amount of time also depends on the size of the cluster. It should look similar to the following output:

```
NAME     CONFIG                  UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-xxxxx   True      False      False      3              3                   3                     0                      2d4h
worker   rendered-worker-xxxxx   True      False      False      3              3                   3                     0                      2d4h
```

Verify MachineConfig rollout and MTU migration injection.
After initiating the MTU migration in the previous step, verify that the Machine Config Operator (MCO) has successfully rendered and applied the updated MachineConfig to all nodes. Confirm that each node has transitioned to the expected rendered MachineConfig and that the configuration state is stable by running:
```
oc describe node | egrep "hostname|machineconfig"
```

An example of the expected output is shown:

```
kubernetes.io/hostname=master-0
[...]
machineconfiguration.openshift.io/currentConfig: rendered-master-xxxx
machineconfiguration.openshift.io/desiredConfig: rendered-master-xxxx
[...]
machineconfiguration.openshift.io/state: Done
```

Ensure that the value of `machineconfiguration.openshift.io/state` is `Done` and that the value of the `machineconfiguration.openshift.io/currentConfig` field is equal to the value of the `machineconfiguration.openshift.io/desiredConfig` field.

Verify the presence of the MTU migration script.
During MTU migration, the Cluster Network Operator (CNO) injects a temporary systemd unit into the rendered MachineConfig. This unit runs the `mtu-migration.sh` script, which safely orchestrates the MTU transition across nodes and prevents network disruption during rolling reboots.

To validate, inspect the MachineConfig referenced in the previous step (for example, `rendered-master-xxxx` or `rendered-worker-xxxx`):

```
oc get machineconfig <CONFIG_NAME> -o yaml | grep mtu-migration.sh
```

Where `<CONFIG_NAME>` specifies the name of the machine config from the `machineconfiguration.openshift.io/currentConfig` field. The expected output should include the entry `ExecStart=/usr/local/bin/mtu-migration.sh`.

Note

The migration script is present only in the rendered MachineConfig generated by the MCO, not in user-created MachineConfigs. Always verify the specific `rendered-*` MachineConfig shown on the node.

Apply the new hardware MTU value.
After verifying that all previous steps are successful, create the following two MachineConfig files and apply them to the cluster.
Create the master MachineConfig (`99-master-mtu.yaml`) file:

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: 99-master-mtu
spec:
  config:
    ignition:
      version: 3.5.0
    storage:
      files:
        - contents:
            compression: ""
            source: data:,%5Bconnection%5D%0Amatch-device%3Dinterface-name%3Aeth0%0Aethernet.mtu%3D9000
          mode: 420
          path: /etc/NetworkManager/conf.d/99-eth0-mtu.conf
```

Apply the MachineConfig by running the following command:

```
oc create -f 99-master-mtu.yaml
```

Create the worker MachineConfig (`99-worker-mtu.yaml`) file:

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-mtu
spec:
  config:
    ignition:
      version: 3.5.0
    storage:
      files:
        - contents:
            compression: ""
            source: data:,%5Bconnection%5D%0Amatch-device%3Dinterface-name%3Aeth0%0Aethernet.mtu%3D9000
          mode: 420
          path: /etc/NetworkManager/conf.d/99-eth0-mtu.conf
```

Apply the MachineConfig by running the following command:

```
oc create -f 99-worker-mtu.yaml
```

Monitor the rollout status by running the following command:
```
oc get machineconfigpool
```

Wait for all MachineConfigPool groups (master and worker) to reach a stable state, indicated by the status values `UPDATED=true`, `UPDATING=false`, and `DEGRADED=false`. This takes some time to complete, depending on the size of the cluster.
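For reference, the `source` field in the MachineConfig files above is a URL-encoded `data:` URI. Decoding it (a quick sketch using Python's standard library) shows the NetworkManager configuration that ends up at `/etc/NetworkManager/conf.d/99-eth0-mtu.conf` on each node:

```python
from urllib.parse import unquote

# URL-encoded payload from the MachineConfig "source" field (after "data:,")
encoded = "%5Bconnection%5D%0Amatch-device%3Dinterface-name%3Aeth0%0Aethernet.mtu%3D9000"

print(unquote(encoded))
# [connection]
# match-device=interface-name:eth0
# ethernet.mtu=9000
```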
Reverify MachineConfig rollout.
Verify that the Machine Config Operator (MCO) has successfully rendered and applied the updated MachineConfig to all nodes. Confirm that each node has transitioned to the expected rendered MachineConfig and that the configuration state is stable by running:
```
oc describe node | egrep "hostname|machineconfig"
```

An example of the expected output is shown:

```
kubernetes.io/hostname=master-0
[...]
machineconfiguration.openshift.io/currentConfig: rendered-master-xxxx
machineconfiguration.openshift.io/desiredConfig: rendered-master-xxxx
[...]
machineconfiguration.openshift.io/state: Done
```

Ensure that the value of `machineconfiguration.openshift.io/state` is `Done` and that the value of the `machineconfiguration.openshift.io/currentConfig` field is equal to the value of the `machineconfiguration.openshift.io/desiredConfig` field.

Finalize the MTU migration.
Clear the migration specification and apply the final MTU value by running the following command:
```
oc patch Network.operator.openshift.io cluster --type=merge --patch \
  '{"spec": { "migration": null, "defaultNetwork": { "ovnKubernetesConfig": { "mtu": 8900 } } }}'
```

Monitor the final rollout by checking the MachineConfigPool status, which should show `UPDATED=true`, `UPDATING=false`, and `DEGRADED=false`:

```
oc get machineconfigpool
```
Verify the change is complete
To verify that the Jumbo MTU configuration is successfully applied across the ARO cluster, begin by checking the cluster-wide network MTU. Inspect the network configuration and confirm it reports `Cluster Network MTU: 8900`, which indicates that the OVN overlay network is updated:

```
oc describe network.config cluster | grep "Cluster Network MTU"
```

Verify the MTU at the node level by examining the primary network interface within a debug session. The interface should report an MTU of 9,000; OpenShift deliberately configures the overlay MTU lower (8,900) to accommodate the OVN overlay encapsulation and avoid packet fragmentation.
```
oc debug node/<NODE_NAME> -- chroot /host ip -d link show eth0
```

For end-to-end pod connectivity validation, use a ping payload size of 8,872 bytes. This value makes the resulting IPv4 packet exactly 8,900 bytes (8,872-byte payload + 8-byte ICMP header + 20-byte IP header). The ping should succeed without fragmentation, indicating that the Jumbo MTU is consistently applied throughout the data path.
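The payload arithmetic can be double-checked with a short sketch (illustrative Python, assuming IPv4 with no IP options):

```python
# Sizing an unfragmented ICMP echo for an 8,900-byte overlay MTU (IPv4,
# no IP options). Illustrative only.
OVERLAY_MTU = 8900
ICMP_HEADER = 8   # ICMP echo header bytes
IPV4_HEADER = 20  # IPv4 header bytes

payload = OVERLAY_MTU - ICMP_HEADER - IPV4_HEADER
print(payload)  # 8872 -- the value to pass to `ping -s`
```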
You may select any pod to ping. For example:
```
oc get pods -n openshift-monitoring -o wide
```

You see an output like:

```
NAME                                 READY   STATUS    RESTARTS   AGE     IP            NODE                                NOMINATED NODE   READINESS GATES
...
metrics-server-xxxxxxxxxx-xxxxx      1/1     Running   0          2d16h   10.129.2.13   myarocluster-worker-westus2-xxxxx   <none>           <none>
metrics-server-xxxxxxxxxx-xxxxx      1/1     Running   0          2d16h   10.131.0.7    myarocluster-worker-westus2-xxxxx   <none>           <none>
monitoring-plugin-xxxxxxxxxx-xxxxx   1/1     Running   0          2d16h   10.129.2.8    myarocluster-worker-westus2-xxxxx   <none>           <none>
monitoring-plugin-xxxxxxxxxx-xxxxx   1/1     Running   0          2d16h   10.131.0.16   myarocluster-worker-westus2-xxxxx   <none>           <none>
node-exporter-xxxxx                  2/2     Running   10         3d      10.0.0.10     myarocluster-master-0               <none>           <none>
...
```

For example, select a `metrics-server` pod. Send an ICMP packet with a payload size of 8,872 bytes using the following command:
```
oc debug node/<NODE_NAME> -- chroot /host ping -M do -s 8872 <POD_IP>
```

You should see a result like:

```
PING 10.129.2.13 (10.129.2.13) 8872(8900) bytes of data.
8880 bytes from 10.129.2.13: icmp_seq=1 ttl=62 time=5.26 ms
8880 bytes from 10.129.2.13: icmp_seq=2 ttl=62 time=0.408 ms
8880 bytes from 10.129.2.13: icmp_seq=3 ttl=62 time=0.198 ms
```
Troubleshooting
MachineConfigPool is stuck in an UPDATING state
If the MachineConfigPool becomes stuck in the UPDATING state during the MTU migration, begin troubleshooting by reviewing the Machine Config Operator (MCO) logs:

```
oc logs -n openshift-machine-config-operator deploy/machine-config-operator
```

The logs provide insight into what may be preventing the rollout from completing. Common causes of an MCO stall include incorrectly formatted or improperly indented YAML in the applied MachineConfig, conflicts caused by multiple MachineConfigs attempting to modify the same files or settings, or nodes failing to reboot automatically after receiving updated configurations. Reviewing and correcting these issues typically allows the MachineConfigPool to resume progress and complete the update successfully.

MTU not updated on node NIC
If the MTU does not appear to be applied correctly at the node level, begin troubleshooting by first identifying the physical network interface on the node (for example, `eth0` or `enp*`), as interface names may vary across environments. Verify the NetworkManager configuration file that sets the interface MTU by reviewing `/etc/NetworkManager/conf.d/99-eth0-mtu.conf` to ensure the expected MTU value is present. Next, confirm the MTU applied to the physical interface by inspecting its link settings rather than an OVS bridge such as `br-ex`. If the physical interface still reports an MTU of 1,500, verify that the NIC driver in use is MANA, ensure that the underlying Azure VM size supports Jumbo MTU, and check whether the MachineConfigPool is still progressing (`UPDATING=true`), which may indicate that the rollout has not yet completed.

Missing MANA driver
Each NIC should list MANA as the active driver. If it is missing, the node's VM instance type may not support the Microsoft Azure Network Adapter (MANA). In such cases, you must resize the node to a Dv6-series or another MANA-supported Azure VM size. After resizing, the node may need to be rebuilt for the change to take effect, typically by draining and deleting the node so that ARO automatically recreates it with the correct VM configuration.
OVN pods showing MTU mismatch
If OVN pods report an MTU mismatch, begin troubleshooting by reviewing the OVN node logs:

```
oc logs -n openshift-ovn-kubernetes ds/ovnkube-node --all-containers=true | grep mtu
```

This helps identify any discrepancies between the MTU values applied at the OVN layer and those configured through the MachineConfigs. If mismatches appear in the logs, ensure that the Cluster Network Operator (CNO) MTU settings align with the values defined in the MachineConfig files, as inconsistencies between these components can prevent the MTU migration from completing correctly.

Connectivity issues after migration
Connectivity issues after the MTU migration typically indicate that MTU values were not fully propagated across the cluster. To diagnose this, first verify that each node reports a NIC MTU of 9,000, the expected hardware-level MTU on Azure. Next, ensure that the OVN overlay network is configured with an MTU of 8,900, which aligns with the cluster-wide network settings. It is also critical to confirm that the Azure NIC is using the MANA driver, as non-MANA drivers do not support Jumbo MTU. If these values appear correct yet connectivity problems persist, further validate the end-to-end path MTU by running:

```
tracepath <destination>
```

This helps identify where packet fragmentation or MTU drops may be occurring in the network path.
Next steps
- View OpenShift documentation for Changing the MTU for the cluster network.