Open vSwitch on Linux: The Complete Guide to Flow Tables and KVM Integration

Linux bridging works fine — until it doesn’t. The moment you need per-flow QoS, VXLAN tunnels between hypervisors, or anything resembling a real network policy, the kernel bridge hits a wall. That’s where Open vSwitch steps in.

OVS is a production-grade virtual switch that speaks OpenFlow, supports GRE/VXLAN/Geneve tunneling, integrates with SDN controllers, and plugs natively into KVM’s tap device model. It’s what OpenStack, oVirt, and most serious private clouds run under the hood. This guide gets you from zero to a working OVS+KVM setup with real flow table rules — no hand-waving.

The official project lives at https://github.com/openvswitch/ovs. The docs are dense but accurate; bookmark them.


Why Not Just Use a Linux Bridge?

Nothing wrong with brctl for a home lab. But consider what you lose:

  • No per-flow matching. A kernel bridge forwards by MAC table, full stop. OVS lets you match on any combination of L2–L4 headers and act on it.
  • No built-in tunneling. VXLAN on a plain bridge requires a vxlan device glued on with duct tape. OVS treats tunnels as first-class ports.
  • No QoS per-port or per-queue. OVS exposes policing and queuing primitives you can actually configure.
  • No controller plane. When you outgrow static config and want a programmatic control plane (OpenDaylight, ONOS, your own Python controller), OVS speaks OpenFlow natively.

If your workload is two VMs on a laptop — stick with a bridge. If you’re building anything that needs to scale or be automated, OVS earns its complexity.


Architecture in 60 Seconds

OVS has three moving parts you need to keep straight:

ovsdb-server: the configuration database. Stores the switch configuration: bridges, ports, interfaces, tunnels, QoS and controller settings. Persists to disk. OpenFlow rules are not stored here; they live in ovs-vswitchd’s memory, which matters for persistence later. Think of it as the source of truth for configuration.

ovs-vswitchd: the switching daemon. Reads config from ovsdb, holds the OpenFlow tables, programs the kernel datapath, and handles the slow path for flows that miss the kernel cache.

Kernel datapath (the openvswitch kernel module): the fast path. Once a flow is installed, packets are forwarded entirely in kernel space without touching vswitchd.

The key insight: ovs-vswitchd handles the first packet of a new flow (slow path), installs a kernel flow entry, and the rest of the packets in that flow are forwarded by the kernel without another round trip to userspace. This is why OVS looks bad in benchmarks dominated by new-flow setup (every new flow pays the upcall cost) but fine in practice (flows are long-lived, so that cost is amortized).


Installation

Debian/Ubuntu

apt update
apt install -y openvswitch-switch openvswitch-common

That’s it. The package starts both daemons; the openvswitch kernel module ships with the stock distro kernel and is loaded on demand. Verify:

systemctl status ovsdb-server ovs-vswitchd
ovs-vsctl show

You should see an almost empty ovs-vsctl show output: just the database UUID and the ovs_version line. No bridges yet.

RHEL/Rocky/AlmaLinux

dnf install -y centos-release-nfv-openvswitch
dnf install -y openvswitch3.1
systemctl enable --now openvswitch

The version number in the package name matters; check what’s current in your repo with dnf search openvswitch. On RHEL-family packages, openvswitch is the unit to enable: it pulls in ovsdb-server and ovs-vswitchd for you.

From Source (when you need a bleeding-edge feature)

apt install -y build-essential libssl-dev python3-dev python3-pip \
               autoconf automake libtool libcap-ng-dev libunbound-dev

git clone https://github.com/openvswitch/ovs.git
cd ovs
./boot.sh
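./configure
make -j$(nproc)
make install

OVS 3.0 and later no longer ship an out-of-tree kernel module, so there is no --with-linux option or make modules_install step; the userspace build uses the openvswitch module from your running kernel. Building from source is fine for development. In production, use distro packages: you want security backports without rebuilding.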


Creating Your First Bridge

# Create a bridge named br0
ovs-vsctl add-br br0

# Verify
ovs-vsctl show

Output should look like:

<database UUID>
    Bridge br0
        Port br0
            Interface br0
                type: internal
    ovs_version: "3.1.0"

The br0 port with type internal is the bridge’s own local port, not a loopback: OVS creates it automatically and it shows up as a Linux network interface of the same name. You can assign an IP to it:

ip addr add 192.168.100.1/24 dev br0
ip link set br0 up

Now add a physical uplink. If eth1 is a dedicated NIC for VM traffic:

ovs-vsctl add-port br0 eth1
ip link set eth1 up

Gotcha: Don’t add your management NIC to OVS unless you know exactly what you’re doing. The moment you attach a NIC that carries your SSH session to a bridge without correct flow rules, you lose access. Always use a dedicated NIC for OVS or configure it from the console.
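
The bridge and uplink steps above can be wrapped in a small script that is safe to re-run. This is a sketch under assumptions, not a canonical setup: DRY_RUN and the run helper are names made up for this example, while --may-exist is a standard ovs-vsctl option that makes add-br and add-port do nothing if the bridge or port already exists. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch: idempotent bridge + uplink setup.
# DRY_RUN and run() are illustrative helpers, not part of OVS.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"      # print the command instead of executing it
    else
        "$@"
    fi
}

setup_bridge() {
    bridge="$1"; uplink="$2"
    run ovs-vsctl --may-exist add-br "$bridge"
    run ovs-vsctl --may-exist add-port "$bridge" "$uplink"
    run ip link set "$uplink" up
}

setup_bridge br0 eth1
```

Run it once as-is to review the commands, then again with DRY_RUN=0 (as root, from the console rather than over the NIC you are about to attach) to apply them.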


Flow Tables: Where OVS Gets Interesting

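Every packet that enters OVS goes through flow tables. A flow table is a prioritized list of match-action rules. Among the rules that match a packet, the one with the highest priority wins (higher number = higher priority). If two overlapping rules share a priority, which one applies is undefined, so give overlapping rules distinct priorities.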

The Default Behavior

By default, OVS has exactly one flow rule in table 0:

ovs-ofctl dump-flows br0
cookie=0x0, duration=42.5s, table=0, n_packets=0, n_bytes=0,
priority=0 actions=NORMAL

priority=0 actions=NORMAL means "for anything that doesn’t match a more specific rule, act like a normal learning switch." This is why OVS works out of the box without you writing any flows.

Writing Real Flow Rules

The tool for flow management is ovs-ofctl. Let’s do something practical: block all ICMP between two VMs while allowing everything else.

First, identify the ports:

ovs-vsctl show
# Note the port names, e.g. vnet0 (VM1), vnet1 (VM2)

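Block ICMP coming from VM1:

# Drop every ICMP packet that enters the bridge from vnet0, whatever its destination.
# To block only VM1 -> VM2, add a destination match (nw_dst=<VM2 IP> or dl_dst=<VM2 MAC>).
ovs-ofctl add-flow br0 \
  "priority=100,in_port=vnet0,icmp,actions=drop"

Everything else already falls through to the default priority=0 NORMAL rule. An explicit catch-all is optional, but it makes the intent visible in dump-flows: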

ovs-ofctl add-flow br0 \
  "priority=1,actions=NORMAL"

Check what’s installed:

ovs-ofctl dump-flows br0
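
To see at a glance whether the drop rule is actually taking hits, the dump-flows output can be boiled down to one counter per rule. The helper below is a sketch: flow_hits is a made-up name, the sample lines are invented counters in the dump-flows format, and it assumes the match and the actions are the last two whitespace-separated fields of each line.

```shell
#!/bin/sh
# Sketch: summarize `ovs-ofctl dump-flows` output as "<n_packets> <match> <actions>".
# flow_hits is an illustrative helper name, not an OVS tool.
flow_hits() {
    awk '/n_packets=/ {
        pk = ""
        for (i = 1; i <= NF; i++)
            if ($i ~ /^n_packets=/) { pk = $i; gsub(/[^0-9]/, "", pk) }
        # The last two whitespace-separated fields are the match and the actions.
        print pk, $(NF-1), $NF
    }'
}

# Sample input with made-up counters; on a live host use:
#   ovs-ofctl dump-flows br0 | flow_hits | sort -rn
flow_hits <<'EOF' | sort -rn
 cookie=0x0, duration=97.1s, table=0, n_packets=4, n_bytes=392, priority=100,icmp,in_port=vnet0 actions=drop
 cookie=0x0, duration=90.3s, table=0, n_packets=120, n_bytes=11760, priority=1 actions=NORMAL
EOF
```

If the counter on the drop rule stays at zero while you ping from VM1, the match is wrong (usually the port name), not the action.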

A More Useful Example: VLAN Tagging Per VM

You have two VMs that should live on different VLANs:

# Tag traffic from vnet0 with VLAN 10
ovs-ofctl add-flow br0 \
  "priority=100,in_port=vnet0,actions=mod_vlan_vid:10,NORMAL"

# Tag traffic from vnet1 with VLAN 20
ovs-ofctl add-flow br0 \
  "priority=100,in_port=vnet1,actions=mod_vlan_vid:20,NORMAL"

Or use OVS port configuration (simpler for static VLAN assignment):

# Set vnet0 as an access port on VLAN 10
ovs-vsctl set port vnet0 tag=10

# Set vnet1 as an access port on VLAN 20
ovs-vsctl set port vnet1 tag=20

The tag= method is cleaner for straightforward access port scenarios. Use flow rules when you need conditional logic.

Multi-Table Pipelines

Real deployments use multiple tables for pipeline stages — security policy in table 1, routing decisions in table 2, QoS in table 3. Here’s the pattern:

# Table 0: dispatch — send all traffic to table 1 for ACL check
ovs-ofctl add-flow br0 "table=0,priority=0,actions=resubmit(,1)"

# Table 1: ACL — drop SSH from a specific source
ovs-ofctl add-flow br0 \
  "table=1,priority=100,ip,nw_src=10.0.0.50,tcp,tp_dst=22,actions=drop"

# Table 1: ACL — allow everything else, continue to table 2
ovs-ofctl add-flow br0 \
  "table=1,priority=0,actions=resubmit(,2)"

# Table 2: forwarding — normal L2 learning
ovs-ofctl add-flow br0 \
  "table=2,priority=0,actions=NORMAL"

Gotcha: put actions=NORMAL in the last stage. NORMAL makes the forwarding decision and sends the packet out on the spot, so nothing you resubmit to afterwards can change where that packet went. If you need both MAC learning and further per-packet processing, you have to implement learning explicitly with learn() actions, which gets hairy fast. For most homelab/small production setups, NORMAL in the last table is fine.
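
Once the ACL grows past a couple of rules, it is easier to generate the pipeline than to type it. The sketch below prints the same three-table layout as a flow file, with one SSH drop rule per blocked source. pipeline_flows is a hypothetical helper and 10.0.0.51 is a made-up second address; the match is written as tcp,nw_src=... because tcp already implies ip.

```shell
#!/bin/sh
# Sketch: emit the table 0/1/2 pipeline as a flow file,
# with one SSH drop rule per blocked source address.
# pipeline_flows is an illustrative helper, not an OVS tool.
pipeline_flows() {
    echo "table=0,priority=0,actions=resubmit(,1)"
    for src in "$@"; do
        echo "table=1,priority=100,tcp,nw_src=$src,tp_dst=22,actions=drop"
    done
    echo "table=1,priority=0,actions=resubmit(,2)"
    echo "table=2,priority=0,actions=NORMAL"
}

# 10.0.0.51 is a made-up second source, added only to show the loop.
pipeline_flows 10.0.0.50 10.0.0.51

# On a host with OVS running, the same output can be loaded in one go:
#   pipeline_flows 10.0.0.50 10.0.0.51 > br0.flows && ovs-ofctl add-flows br0 br0.flows
```

Keeping the generator in version control gives you a reviewable record of the policy, which the live flow table does not.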


KVM Integration

This is where things get operationally solid. KVM VMs use TAP devices for network access. The workflow is:

  1. QEMU, or libvirt acting for it, creates a tap device (vnet0, vnet1, etc.)
  2. You plug that tap device into OVS as a port
  3. Packets from the VM appear on the OVS bridge and get processed by flow tables

Manual Integration (for understanding)

# Create a tap device manually
ip tuntap add dev tap0 mode tap
ip link set tap0 up

# Add it to OVS
ovs-vsctl add-port br0 tap0

# Start a VM using that tap device
qemu-system-x86_64 \
  -m 2048 \
  -drive file=/var/lib/libvirt/images/vm1.qcow2,format=qcow2 \
  -netdev tap,id=net0,ifname=tap0,script=no,downscript=no \
  -device virtio-net-pci,netdev=net0,mac=52:54:00:AA:BB:01 \
  -nographic

The script=no,downscript=no tells QEMU not to run /etc/qemu-ifup and /etc/qemu-ifdown — those are for traditional bridge setup and would fight with OVS.

libvirt Integration (for production)

Libvirt has native OVS support. Define a network in XML:

<!-- ovs-network.xml (keep it outside /etc/libvirt; libvirt stores its own copy on net-define) -->
<network>
  <name>ovs-network</name>
  <forward mode='bridge'/>
  <bridge name='br0'/>
  <virtualport type='openvswitch'/>
</network>

Import and start it:

virsh net-define ovs-network.xml
virsh net-start ovs-network
virsh net-autostart ovs-network

Then in your VM definition, use this network:

<interface type='network'>
  <mac address='52:54:00:AA:BB:01'/>
  <source network='ovs-network'/>
  <model type='virtio'/>
  <virtualport type='openvswitch'>
    <!-- Optional: pass custom OVS port metadata -->
    <parameters interfaceid='YOUR-UUID-HERE'/>
  </virtualport>
</interface>

When libvirt starts the VM, it creates the tap device (named vnetN by default) and runs ovs-vsctl add-port on br0 for it automatically. You can verify:

virsh start vm1
ovs-vsctl show
# You should see the new port appear under br0

Gotcha: If you previously configured the default libvirt network (which uses NAT through virbr0), VMs attached to it won’t magically move to OVS. You need to explicitly change the network in the VM’s XML. Edit with virsh edit vm1 and swap the <source network='default'/> to <source network='ovs-network'/>.
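
After switching a VM over, it helps to confirm which tap device it got and that the device really landed on br0. The sketch below parses virsh domiflist output into "name MAC" pairs. vm_ifaces is a made-up helper, the sample table is illustrative, and the column layout (two header lines, interface name first, MAC fifth) is an assumption about your libvirt version.

```shell
#!/bin/sh
# Sketch: list "<interface> <mac>" for a VM from `virsh domiflist` output.
# vm_ifaces is an illustrative helper; the column layout is assumed.
vm_ifaces() {
    # Skip the header and separator lines, keep rows with all five columns.
    awk 'NR > 2 && NF >= 5 { print $1, $5 }'
}

# Illustrative sample; on a live host use:
#   virsh domiflist vm1 | vm_ifaces
vm_ifaces <<'EOF'
 Interface   Type      Source        Model    MAC
---------------------------------------------------------------
 vnet0       network   ovs-network   virtio   52:54:00:aa:bb:01
EOF

# Each reported name can then be checked against OVS:
#   ovs-vsctl port-to-br vnet0     # prints the bridge the port belongs to
```

If port-to-br reports an error for the interface, the VM is still attached to something other than OVS.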

Setting Per-VM Port Properties from libvirt

You can push OVS port metadata from the VM XML directly. libvirt records the interfaceid as external_ids:iface-id on the OVS Interface record, which is the key SDN controllers typically use for policy lookup:

<virtualport type='openvswitch'>
  <parameters interfaceid='a8c18e4f-7e52-4a3c-bf76-de1d5e7e6a0b'/>
</virtualport>

Or set them after the fact via ovs-vsctl:

# Tag a port with VM-level metadata
ovs-vsctl set Interface vnet0 \
  external-ids:vm-id=vm1 \
  external-ids:owner=team-infra

These external IDs don’t affect packet forwarding — they’re metadata for automation and controllers.


VXLAN Tunneling Between Two Hypervisors

This is where OVS pays for itself. Connecting VMs across two physical hosts without VLAN stretching:

On host1 (192.168.1.10):

ovs-vsctl add-port br0 vxlan0 -- \
  set interface vxlan0 type=vxlan \
  options:remote_ip=192.168.1.11 \
  options:key=100 \
  options:dst_port=4789

On host2 (192.168.1.11):

ovs-vsctl add-port br0 vxlan0 -- \
  set interface vxlan0 type=vxlan \
  options:remote_ip=192.168.1.10 \
  options:key=100 \
  options:dst_port=4789

VMs on both hosts connected to br0 now share L2 — they can ping each other by IP without any router in between. The VXLAN UDP encapsulation happens in the OVS kernel datapath, so it’s fast.
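
Since the two commands differ only in remote_ip, one script can serve both hosts. The sketch below prints the right command for whichever host address you pass in. peer_of and vxlan_port_cmd are hypothetical helpers, the addresses are the two hosts from the text, and --may-exist is added so a re-run does not fail on an existing port.

```shell
#!/bin/sh
# Sketch: print the tunnel command for whichever of the two hosts we are on.
# peer_of and vxlan_port_cmd are illustrative helpers, not OVS tools.
peer_of() {
    case "$1" in
        192.168.1.10) echo 192.168.1.11 ;;
        192.168.1.11) echo 192.168.1.10 ;;
        *) echo "unknown host: $1" >&2; return 1 ;;
    esac
}

vxlan_port_cmd() {
    local_ip="$1"; vni="$2"
    remote_ip="$(peer_of "$local_ip")" || return 1
    echo "ovs-vsctl --may-exist add-port br0 vxlan0 --" \
         "set interface vxlan0 type=vxlan" \
         "options:remote_ip=$remote_ip options:key=$vni options:dst_port=4789"
}

vxlan_port_cmd 192.168.1.10 100
vxlan_port_cmd 192.168.1.11 100
```

This deliberately covers one tunnel between two hosts. Adding a third host as a full mesh on a bridge that uses NORMAL creates an L2 loop, so that case needs more than another entry in peer_of.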

Gotcha: VXLAN uses UDP/4789. Make sure your firewall allows it. On iptables-managed hosts:

iptables -I INPUT -p udp --dport 4789 -j ACCEPT

Also, MTU matters. VXLAN adds 50 bytes of overhead. If your physical NICs are at MTU 1500, your VMs need to run at MTU 1450 or you’ll see mysterious packet drops for large transfers. Either configure VMs to use MTU 1450, or — better — enable jumbo frames (MTU 9000) on the physical underlay.
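
The 50 bytes break down as outer IPv4 header (20) + UDP (8) + VXLAN header (8) + inner Ethernet header (14). The sketch below does that arithmetic for a given underlay MTU. vxlan_inner_mtu is a made-up helper, and the ipv6 case is an addition not covered in the text above (an IPv6 underlay has a 40-byte outer header, so the overhead is 70).

```shell
#!/bin/sh
# Sketch: largest VM MTU that fits inside a VXLAN tunnel on a given underlay MTU.
# vxlan_inner_mtu is an illustrative helper name.
vxlan_inner_mtu() {
    underlay_mtu="$1"
    family="${2:-ipv4}"
    case "$family" in
        ipv4) outer_ip=20 ;;
        ipv6) outer_ip=40 ;;
        *) echo "unknown address family: $family" >&2; return 1 ;;
    esac
    # outer IP header + UDP (8) + VXLAN (8) + inner Ethernet header (14)
    echo $((underlay_mtu - outer_ip - 8 - 8 - 14))
}

vxlan_inner_mtu 1500          # prints 1450
vxlan_inner_mtu 1500 ipv6     # prints 1430
vxlan_inner_mtu 9000          # prints 8950
```

With jumbo frames on the underlay, VMs can stay at the default 1500 with plenty of headroom.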


Monitoring and Debugging

Flow Statistics

# Watch packet/byte counters live
watch -n1 'ovs-ofctl dump-flows br0'

# Show the 20 busiest flows, sorted by packet count (rules that never match sink to the bottom)
ovs-ofctl dump-flows br0 | sed -n 's/.*n_packets=\([0-9]*\).*/\1 &/p' | sort -rn | head -20

Packet Tracing

OVS has a built-in packet tracer — one of its killer features:

# Simulate a packet from vnet0: what would OVS do with it?
ovs-appctl ofproto/trace br0 \
  "in_port=vnet0,dl_src=52:54:00:AA:BB:01,dl_dst=52:54:00:CC:DD:02,\
  ip,nw_src=10.0.0.1,nw_dst=10.0.0.2,tcp,tp_src=12345,tp_dst=80"

The output walks through each table and shows exactly which rule matched and what action was taken. Invaluable when a flow rule isn’t doing what you think.
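
For quick regression checks you often only care about the final verdict, not the full walk. The sketch below pulls it out of the trace output. trace_verdict is a made-up helper, the sample is a heavily trimmed illustration rather than real output, and it assumes the trace ends with a "Datapath actions:" line.

```shell
#!/bin/sh
# Sketch: extract the final verdict from `ovs-appctl ofproto/trace` output.
# trace_verdict is an illustrative helper name.
trace_verdict() {
    sed -n 's/^Datapath actions: //p'
}

# Trimmed, illustrative sample; on a live host use:
#   ovs-appctl ofproto/trace br0 "in_port=vnet0,icmp" | trace_verdict
trace_verdict <<'EOF'
Final flow: unchanged
Datapath actions: drop
EOF
```

Wrapped in a loop over a list of test flows and expected verdicts, this makes a cheap sanity test to run before and after changing rules.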

Interface Statistics

ovs-vsctl list Interface vnet0

This dumps everything OVS knows about the interface — link state, error counters, external IDs, DPDK config if applicable.

ovsdb Direct Queries

For scripting and automation:

# List all bridges
ovs-vsctl list Bridge

# Get a specific field
ovs-vsctl get Bridge br0 fail_mode

# Set fail mode (what happens when controller disconnects)
ovs-vsctl set Bridge br0 fail_mode=standalone

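fail_mode=standalone means OVS takes over and behaves as a normal L2 learning switch whenever no controller is connected. fail_mode=secure means OVS never installs flows on its own: traffic is handled only by flows that you or a controller installed, and anything that matches no flow is dropped. Without a controller, leave it at standalone (the default) unless you manage every flow yourself.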


Production Gotchas (The Real Ones)

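Flow table overflow. The kernel datapath caches flows in a hash table, and ovs-vswitchd caps it at 200,000 flows by default (other_config:flow-limit). On a busy host with many short-lived connections, you can hit that cap. Monitor with ovs-appctl upcall/show, which prints the current flow count next to the limit. Idle datapath flows are evicted after max-idle milliseconds (10000 by default), so lowering it frees entries sooner:

ovs-vsctl set Open_vSwitch . other_config:max-idle=5000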
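
To turn the monitoring advice into something you can alert on, the current count and the limit can be read from ovs-appctl upcall/show. The sketch below is built on an assumption about that command’s output: a "flows" line containing "(current N)" and "(limit M)". flow_usage is a made-up helper and the sample numbers are invented.

```shell
#!/bin/sh
# Sketch: report datapath flow usage as "<current> <limit> <percent>".
# flow_usage is an illustrative helper; the input format is assumed.
flow_usage() {
    sed -n 's/.*(current \([0-9][0-9]*\)).*(limit \([0-9][0-9]*\)).*/\1 \2/p' |
    while read -r cur limit; do
        echo "$cur $limit $((cur * 100 / limit))"
    done
}

# Invented sample; on a live host use:
#   ovs-appctl upcall/show | flow_usage
flow_usage <<'EOF'
system@ovs-system:
  flows         : (current 150000) (avg 120000) (max 180000) (limit 200000)
EOF
```

A sustained percentage in the high nineties is the signal to look at flow churn before reaching for a bigger limit.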

ovs-vswitchd memory creep. Under heavy flow churn, vswitchd can accumulate state. It’s not a leak exactly, but it grows. Budget ~500MB for vswitchd on a busy hypervisor. Watch the Memory line in systemctl status ovs-vswitchd, or ask the daemon directly with ovs-appctl memory/show.

Netfilter and OVS don’t play nicely. Packets that OVS switches between ports never traverse the bridge-netfilter hooks a Linux bridge relies on, so iptables or ebtables rules written against the bridge or the tap interfaces generally don’t see VM traffic at all. Only traffic that terminates on the host itself (for example via the br0 internal port, or the VXLAN underlay) goes through iptables. Keep firewall policy for VM traffic in OVS flow tables rather than splitting it between OVS and iptables; mixing them creates debugging hell.

Persisting flow rules. Flows added with ovs-ofctl add-flow are not persistent. They vanish on vswitchd restart. For persistent config, either use ovs-vsctl (which writes to ovsdb), use a controller, or write a startup script that re-applies your flows. In production, a controller is the right answer. For a single box, a simple systemd ExecStartPost that runs your ovs-ofctl commands works fine.
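
For the single-box case, the startup script can be as small as the sketch below: one flow file per bridge, re-applied with ovs-ofctl replace-flows. The directory /etc/openvswitch/flows.d, the FLOW_DIR and DRY_RUN variables, and the helper names are all assumptions made for this example. It prints the commands by default.

```shell
#!/bin/sh
# Sketch: re-apply saved flow files after ovs-vswitchd (re)starts.
# The flows.d layout, FLOW_DIR, DRY_RUN and the helpers are illustrative.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"      # print the command instead of executing it
    else
        "$@"
    fi
}

# For every <dir>/<bridge>.flows file, make that bridge's flow table match the file.
restore_flows() {
    dir="$1"
    for f in "$dir"/*.flows; do
        [ -e "$f" ] || continue     # no files: the glob stays literal, skip it
        bridge="$(basename "$f" .flows)"
        run ovs-ofctl replace-flows "$bridge" "$f"
    done
}

restore_flows "${FLOW_DIR:-/etc/openvswitch/flows.d}"
```

Note that replace-flows removes any flow that is not in the file, including the default NORMAL rule, so the file has to describe the complete table rather than just your additions.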

KVM + OVS + SELinux. On RHEL-family systems, SELinux can block QEMU from accessing the tap device. If VMs fail to start with cryptic permission errors, check ausearch -m avc -ts recent before disabling SELinux entirely. Usually the fix is a single setsebool or a targeted policy.


Wrapping Up

OVS is not a drop-in replacement for a kernel bridge — it has operational overhead and a learning curve. But once you’ve built your first multi-table pipeline or watched ofproto/trace walk a packet through your rules in real time, going back to brctl feels like debugging with printf.

The architecture scales from a single KVM host to a thousand-node cluster, with the same tooling and the same flow rule semantics. That uniformity is worth something when you’re on call at 2am trying to figure out why VM traffic is hitting the wrong VLAN.

Start simple: one bridge, the default NORMAL action, VMs connecting through libvirt. Get comfortable with ovs-vsctl show and ovs-ofctl dump-flows. Then start adding flow rules incrementally, using ofproto/trace to verify before you break anything.

The jump from "OVS as a smarter bridge" to "OVS as a programmable network fabric" is mostly a matter of reading the OpenFlow spec and experimenting. The tooling is already there.
