WireGuard Mesh Networking for Homelabs: Topology, Routing, and Persistent Keepalive Explained

If you have more than two machines you want to connect privately — a VPS in a cloud somewhere, a NAS at home, a Pi at a friend’s place, a dev box at the office — you’ve probably already outgrown the classic "one server, one client" WireGuard setup. You hit the wall fast: all traffic has to bounce through the central server, latency stacks up, and if that server goes down, every peer loses connectivity to every other peer.

A full mesh fixes all of that. Every node talks directly to every other node. No single point of failure. No unnecessary hop. This article walks through building one properly — topology, routing logic, AllowedIPs mechanics, and the PersistentKeepalive setting that most guides mention and few actually explain correctly.

WireGuard’s official GitHub: https://github.com/WireGuard/wireguard-linux. The userspace tools (wg, wg-quick): https://github.com/WireGuard/wireguard-tools.


Hub-and-Spoke Is a Trap

The standard homelab WireGuard tutorial gives you a server config with one [Peer] block and a client config that sets AllowedIPs = 0.0.0.0/0. That works fine for routing your laptop’s traffic through a VPN exit node. It’s completely wrong for multi-node private networking.

In hub-and-spoke, when your NAS wants to reach your Pi, it sends the packet to the VPS, the VPS decrypts it, re-encrypts it to the Pi, and sends it out. Every single packet makes that round trip. If the VPS is in Frankfurt and your NAS and Pi are both in the same apartment, you’re routing packets to Germany and back to talk between two machines 2 meters apart.

In a mesh, the NAS and Pi have a direct peer relationship. Their packets go directly between them. The VPS is still a peer, but it’s only involved in traffic actually meant for it.


The Topology

For this guide, let’s work with three nodes. Adapting to five or ten follows the exact same pattern.

Node  Role                                           Public IP    WireGuard IP
vps   Cloud server, has a stable public IP           203.0.113.1  10.10.0.1/24
nas   Home NAS, behind NAT                           none         10.10.0.2/24
pi    Raspberry Pi at a remote location, behind NAT  none         10.10.0.3/24

In a full mesh with 3 nodes, every node has 2 peers. With N nodes, each has N-1 peers. The config files get longer but the logic doesn’t get more complex.
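To get a feel for how quickly the config files grow, here is a small sketch of the arithmetic. The function names are made up for illustration; the only facts used are that each node lists N-1 peers and that a full mesh has N*(N-1)/2 distinct tunnels.

```shell
#!/bin/bash
# Peers each node must list in a full mesh of n nodes.
peers_per_node() { echo $(( $1 - 1 )); }

# Distinct tunnels (node pairs) in a full mesh of n nodes: n*(n-1)/2.
mesh_tunnels() { echo $(( $1 * ($1 - 1) / 2 )); }

for n in 3 5 10; do
    echo "$n nodes: $(peers_per_node "$n") peers each, $(mesh_tunnels "$n") tunnels"
done
```

Ten nodes already means 45 pairs to keep consistent, which is why a config generator becomes attractive well before that.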


Key Generation

On each machine:

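# Generate a key pair. The subshell's umask 077 means the private key is
# never world-readable, not even for a moment.
(umask 077; wg genkey | tee /etc/wireguard/privatekey | wg pubkey > /etc/wireguard/publickey)

# Make the permissions explicit: private key locked down, public key readable
chmod 600 /etc/wireguard/privatekey
chmod 644 /etc/wireguard/publickey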

You need the public key of each peer before you can write the configs. Generate on all three nodes first, collect the public keys, then write configs. Do this in one session — trying to do it incrementally while nodes are running is how you end up with mismatched keys and hours of debugging.

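Optionally generate a preshared key for each peer pair. The PSK is mixed into the handshake as an extra symmetric secret, which is meant to keep recorded traffic safe even if the Curve25519 key exchange is someday broken by a quantum computer: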

# Run once per pair, share the output with both nodes in that pair
wg genpsk
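
Since there is one PSK per pair rather than per node, it helps to enumerate the pairs mechanically. The sketch below is one way to do that; mesh_pairs and gen_psks are hypothetical helper names, and the psk_<a>_<b> file naming simply mirrors the placeholders used in the configs in this article.

```shell
#!/bin/bash
# Print every unordered pair of the given node names, one pair per line.
mesh_pairs() {
    local nodes=("$@") i j
    for (( i = 0; i < ${#nodes[@]}; i++ )); do
        for (( j = i + 1; j < ${#nodes[@]}; j++ )); do
            echo "${nodes[i]} ${nodes[j]}"
        done
    done
}

# Write one PSK file per pair into directory $1 (requires the wg tool).
gen_psks() {
    local outdir=$1 a b
    shift
    command -v wg >/dev/null || { echo "wg not found" >&2; return 1; }
    ( umask 077
      mesh_pairs "$@" | while read -r a b; do
          wg genpsk > "$outdir/psk_${a}_${b}"
      done )
}

# Only lists the pairs; call gen_psks <dir> vps nas pi to create the files.
mesh_pairs vps nas pi
```

For the three nodes here this prints vps nas, vps pi, and nas pi, matching the three PSK placeholders in the configs.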

Writing the Configs

vps — 10.10.0.1

# /etc/wireguard/wg0.conf on vps

[Interface]
Address = 10.10.0.1/24
ListenPort = 51820
PrivateKey = <vps_private_key>

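# Forward traffic between peers (required for routing). The MASQUERADE rule is
# only needed if peers also reach the internet through the vps; eth0 is assumed
# to be its public-facing interface.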
PostUp = iptables -A FORWARD -i %i -j ACCEPT; iptables -A FORWARD -o %i -j ACCEPT; iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
PostDown = iptables -D FORWARD -i %i -j ACCEPT; iptables -D FORWARD -o %i -j ACCEPT; iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE

[Peer]
# nas
PublicKey = <nas_public_key>
PresharedKey = <psk_vps_nas>    # optional but recommended
AllowedIPs = 10.10.0.2/32
PersistentKeepalive = 25

[Peer]
# pi
PublicKey = <pi_public_key>
PresharedKey = <psk_vps_pi>
AllowedIPs = 10.10.0.3/32
PersistentKeepalive = 25

nas — 10.10.0.2

# /etc/wireguard/wg0.conf on nas

[Interface]
Address = 10.10.0.2/24
ListenPort = 51820
PrivateKey = <nas_private_key>

[Peer]
# vps - has a public IP, so we can point directly at it
PublicKey = <vps_public_key>
PresharedKey = <psk_vps_nas>
Endpoint = 203.0.113.1:51820
AllowedIPs = 10.10.0.1/32
PersistentKeepalive = 25

[Peer]
# pi - also behind NAT, no Endpoint here initially
PublicKey = <pi_public_key>
PresharedKey = <psk_nas_pi>
AllowedIPs = 10.10.0.3/32
PersistentKeepalive = 25

pi — 10.10.0.3

# /etc/wireguard/wg0.conf on pi

[Interface]
Address = 10.10.0.3/24
ListenPort = 51820
PrivateKey = <pi_private_key>

[Peer]
# vps
PublicKey = <vps_public_key>
PresharedKey = <psk_vps_pi>
Endpoint = 203.0.113.1:51820
AllowedIPs = 10.10.0.1/32
PersistentKeepalive = 25

[Peer]
# nas - also behind NAT
PublicKey = <nas_public_key>
PresharedKey = <psk_nas_pi>
AllowedIPs = 10.10.0.2/32
PersistentKeepalive = 25

AllowedIPs: The Most Misunderstood Setting

AllowedIPs does two things simultaneously, and conflating them is the source of most WireGuard routing confusion.

Outbound: It’s a routing table. When a packet enters wg0, WireGuard picks the peer whose AllowedIPs contains the destination address, preferring the most specific match. With AllowedIPs = 10.10.0.2/32, packets destined for 10.10.0.2 are encrypted for this peer. wg-quick additionally installs a matching kernel route so that such packets reach wg0 in the first place.

Inbound: It’s a source filter. After decrypting a packet, WireGuard accepts it only if the inner source IP is listed in the sending peer’s AllowedIPs. A packet arriving through the tunnel from nas claiming to be from 10.10.0.5 will be silently dropped because 10.10.0.5 isn’t in nas’s AllowedIPs. WireGuard’s documentation calls the combination of both roles cryptokey routing.

This dual role is elegant but bites you in specific scenarios.

The classic mistake: You want nas to also route traffic for your home LAN subnet 192.168.1.0/24 through the mesh. You add AllowedIPs = 10.10.0.2/32, 192.168.1.0/24 on the other peers. That works for routing — but now WireGuard will also accept packets from nas that claim to originate from any 192.168.1.x address. That’s actually fine and intended when nas is acting as a gateway for that subnet. Just know it’s happening.

The /24 on the Interface and /32 on peers: Notice the Interface has Address = 10.10.0.1/24 but each peer’s AllowedIPs is a /32. The /24 on the interface tells the kernel which subnet is reachable through wg0, so it installs a connected route for 10.10.0.0/24 on that interface. (WireGuard is a layer 3 tunnel, so no ARP is involved.) The /32 on each peer is the explicit mapping saying "this specific IP goes through this specific peer." If you put /24 in AllowedIPs for a peer, you’re saying all traffic to the entire 10.10.0.0/24 subnet should go to that one peer, which is wrong in a mesh because you have multiple peers on that subnet.
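
The two roles of AllowedIPs can be modelled in a few lines of bash. This is a toy simulation, not WireGuard’s actual implementation: the TABLE contents are a hypothetical view from vps in which nas also announces 192.168.1.0/24, and all function names are invented for the example.

```shell
#!/bin/bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Succeed if address $1 falls inside CIDR $2 (e.g. 192.168.1.0/24).
in_cidr() {
    local ip net len mask
    ip=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    len=${2#*/}
    mask=$(( len == 0 ? 0 : (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    (( (ip & mask) == (net & mask) ))
}

# Hypothetical AllowedIPs as seen from vps: "peer cidr" per line.
TABLE="nas 10.10.0.2/32
nas 192.168.1.0/24
pi 10.10.0.3/32"

# Outbound role: which peer gets a packet for address $1? Longest prefix wins.
peer_for_dst() {
    local best="" best_len=-1 peer cidr
    while read -r peer cidr; do
        if in_cidr "$1" "$cidr" && (( ${cidr#*/} > best_len )); then
            best=$peer
            best_len=${cidr#*/}
        fi
    done <<< "$TABLE"
    [ -n "$best" ] && echo "$best"
}

# Inbound role: accept a decrypted packet from peer $1 with inner source $2
# only if that source address maps back to the same peer.
accept_src() {
    [ "$(peer_for_dst "$2")" = "$1" ]
}

peer_for_dst 10.10.0.3
peer_for_dst 192.168.1.50
accept_src nas 10.10.0.5 || echo "dropped: 10.10.0.5 is not in nas's AllowedIPs"
```

The same table answers both questions, which is the whole point: a range you add for routing is automatically a range you accept as a source.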


PersistentKeepalive: What It Actually Does

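WireGuard is connectionless from the administrator’s point of view. There is a handshake (a single round trip that derives fresh session keys, repeated every couple of minutes while traffic flows), but it happens on demand, and there is no connection to open, keep open, or tear down. An idle tunnel sends nothing at all; it exists only as keys and AllowedIPs in the interface’s configuration. This is a feature: there is nothing to time out and nothing to reconnect.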

The downside: when both peers are behind NAT (like nas and pi), neither knows the other’s current external IP:port. WireGuard learns a peer’s real endpoint dynamically: whenever a valid packet arrives from a peer, WireGuard records its source IP:port as that peer’s current endpoint. But if neither side ever sends a packet, neither side knows where to send one.

PersistentKeepalive = 25 tells WireGuard: "if I haven’t sent anything to this peer in the last 25 seconds, send an empty keepalive packet." This does two things:

  1. It keeps the NAT mapping alive on your router. Most home routers time out UDP NAT entries after 30-300 seconds of silence. A keepalive every 25 seconds beats that.
  2. It means the remote peer always has a fresh endpoint for you — your keepalive packet tells them your current public IP:port.

When to use it: On every peer that’s behind NAT. Always. The overhead is negligible: an empty WireGuard packet is 32 bytes, or about 60 bytes on the wire once the UDP and IPv4 headers are added.
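
To put a number on "negligible", the sketch below estimates the daily keepalive traffic toward one otherwise idle peer. The function name is made up, and the figures assume 32 bytes of WireGuard data plus an 8-byte UDP header and a 20- or 40-byte IP header, ignoring link-layer framing.

```shell
#!/bin/bash
# Bytes per day spent on keepalives toward one idle peer.
# $1 = PersistentKeepalive interval in seconds
# $2 = outer IP header size: 20 for IPv4 (default), 40 for IPv6
keepalive_bytes_per_day() {
    local interval=$1 ip_hdr=${2:-20}
    # 8 = UDP header, 32 = empty WireGuard data message
    echo $(( 86400 / interval * (ip_hdr + 8 + 32) ))
}

keepalive_bytes_per_day 25       # IPv4 underlay
keepalive_bytes_per_day 25 40    # IPv6 underlay
```

At a 25 second interval that comes to roughly 200 KB per peer per day over IPv4, and less whenever real traffic is flowing, since keepalives are only sent on an idle link.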

When not to bother: A server with a static public IP and ListenPort set doesn’t need to initiate contact first. Its peers will connect to it by Endpoint. The server benefits from having keepalive from the NAT’d peers, not by sending its own. That said, setting it on both ends of a NAT-to-NAT pair is correct and safe.

The NAT-to-NAT problem: nas and pi are both behind NAT with no Endpoint set for each other initially. How do they establish a direct tunnel?

This is where WireGuard’s endpoint learning comes in. Both nas and pi connect to vps on startup (they have an Endpoint for it). When nas sends a keepalive to vps, vps learns nas’s current public IP:port. When pi sends a keepalive to vps, vps learns pi’s current public IP:port. But vps doesn’t automatically relay this information to nas or pi.

If nas sends a packet to 10.10.0.3 (pi’s WireGuard IP), that packet goes… nowhere. The route is installed, but WireGuard has no endpoint for pi on nas.

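For NAT-to-NAT connectivity, you have three options:

Option A — Use a STUN-like approach or dynamic endpoint discovery. Scripts that read the endpoints the VPS has learned and push them to the NAT’d peers can help (the wireguard-tools repository carries a NAT hole-punching example under contrib), but this is complex and only works with NATs that reuse the same external port for every destination. Note that wg-dynamic, which is sometimes suggested here, targets address assignment inside the tunnel rather than endpoint discovery.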
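
As a rough illustration of the moving parts in Option A, the sketch below turns the output of wg show wg0 endpoints (public key, tab, endpoint) into the wg set commands a NAT’d peer would need to run. The function name and the sample keys are invented, and how the commands travel from the VPS to the peers is left open. It is not a complete hole-punching solution: the endpoint the VPS sees is only usable by other peers when the NAT keeps the same external port for every destination.

```shell
#!/bin/bash
# Read `wg show <iface> endpoints` output on stdin and print a `wg set`
# command for each peer whose endpoint is currently known.
endpoints_to_wg_set() {
    local iface=$1 key endpoint
    while read -r key endpoint; do
        [ -z "$key" ] && continue
        [ "$endpoint" = "(none)" ] && continue
        echo "wg set $iface peer $key endpoint $endpoint"
    done
}

# On the vps one might run:  wg show wg0 endpoints | endpoints_to_wg_set wg0
# Demo with made-up sample data instead of a live interface:
printf '%s\t%s\n' 'NAS_PUBKEY_PLACEHOLDER=' '198.51.100.7:41820' \
                  'PI_PUBKEY_PLACEHOLDER=' '(none)' | endpoints_to_wg_set wg0
```

With the sample data this prints a single command, for the peer whose endpoint is known, and skips the one still showing (none).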

Option B — Route all inter-peer traffic through the VPS. This is simpler and often the right call for a homelab. Change AllowedIPs on nas and pi so that traffic to other NAT’d peers goes through the VPS:

# On nas: remove the [Peer] block for pi entirely, then let the vps peer
# carry pi's address as well. The vps config needs no change, since it
# already has a peer entry for each node. Mirror this on pi with 10.10.0.2/32.
[Peer]
# vps, which now also relays traffic to pi
PublicKey = <vps_public_key>
PresharedKey = <psk_vps_nas>
Endpoint = 203.0.113.1:51820
AllowedIPs = 10.10.0.1/32, 10.10.0.3/32
PersistentKeepalive = 25

This isn’t a full mesh anymore — it’s more of a hub-and-spoke for NAT’d peers with a direct spoke to the VPS. But it’s reliable. True NAT traversal between two NAT’d WireGuard peers without a relay requires the equivalent of hole-punching, which WireGuard doesn’t do natively.

Option C — Put one NAT’d node in a DMZ or with port forwarding. If you can forward UDP port 51820 on your home router to nas, it gets a de-facto public endpoint. Then pi can use Endpoint = <your-home-public-ip>:51820 and connect directly. Use a dynamic DNS service (ddclient, inadyn) if your home IP changes.


Enabling and Starting

# Enable IP forwarding (required on the vps for routing between peers).
# Put these two lines in /etc/sysctl.d/99-wireguard.conf:
net.ipv4.ip_forward = 1
net.ipv6.conf.all.forwarding = 1

# Apply immediately (a bare `sysctl -p` only reads /etc/sysctl.conf)
sysctl --system

# Start WireGuard on each node
systemctl enable --now wg-quick@wg0

# Check status
wg show

wg show is your best diagnostic tool. Check that latest handshake is recent and that transfer shows bytes both sent and received. With keepalives running, a healthy peer re-handshakes roughly every two minutes, so anything older than about three minutes deserves a look. A peer with no latest handshake at all and nothing received means the tunnel hasn’t established: check firewall rules, verify the public keys match, and make sure the Endpoint IP:port is reachable.
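
For scripting, wg show wg0 dump gives the same information in tab-separated form. The sketch below summarises it; summarise_dump is a made-up name and the sample lines are fabricated stand-ins for real output. Be aware that the first line of a real dump contains the interface’s private key, so don’t paste it into a ticket.

```shell
#!/bin/bash
# Summarise peers from `wg show <iface> dump` read on stdin.
# Peer lines have 8 tab-separated fields: public-key, preshared-key, endpoint,
# allowed-ips, latest-handshake, transfer-rx, transfer-tx, persistent-keepalive.
# $1 optionally overrides "now" (Unix time), which makes the output testable.
summarise_dump() {
    local now=${1:-$(date +%s)} header key psk endpoint ips hs rx tx ka
    read -r header || return 0      # first line describes the interface itself
    while IFS=$'\t' read -r key psk endpoint ips hs rx tx ka; do
        if [ "$hs" -eq 0 ]; then
            echo "${key:0:8}: no handshake yet, endpoint $endpoint, $tx B sent"
        else
            echo "${key:0:8}: handshake $(( now - hs ))s ago, $rx B received, $tx B sent"
        fi
    done
}

# Real use:  wg show wg0 dump | summarise_dump
# Demo with fabricated sample data:
sample=$'PRIVATE\tPUBLIC\t51820\toff\n'
sample+=$'AAAAAAAAkey=\t(none)\t203.0.113.1:51820\t10.10.0.1/32\t1700000000\t1024\t2048\t25\n'
sample+=$'BBBBBBBBkey=\t(none)\t(none)\t10.10.0.3/32\t0\t0\t296\t25\n'
printf '%s' "$sample" | summarise_dump 1700000090
```

The second sample peer shows the typical NAT-to-NAT failure signature: bytes sent, no endpoint, and no handshake.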


Firewall Rules

WireGuard packets are UDP. On the VPS, allow incoming on the listen port:

# ufw
ufw allow 51820/udp

# iptables directly
iptables -A INPUT -p udp --dport 51820 -j ACCEPT

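On the VPS, the PostUp/PostDown rules in the config handle forwarding between peers. If you’re using nftables instead of iptables, an equivalent ruleset follows. Keep in mind that an accept here does not override a drop in another table hooked into forward, so an existing firewall may still need its own rule for wg0: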

table inet wg-mesh {
    chain forward {
        type filter hook forward priority 0;
        iifname "wg0" accept
        oifname "wg0" accept
    }
}

Gotchas

MTU mismatch kills throughput silently. WireGuard adds overhead to each packet: 60 bytes when the tunnel runs over IPv4 (20 IP + 8 UDP + 32 WireGuard) and 80 bytes over IPv6. If your underlying network has an MTU of 1500, the WireGuard interface can use at most 1440 over IPv4 or 1420 over IPv6; 1420 is the common choice because it is safe for both. Links with a smaller MTU (PPPoE, some mobile networks) need a correspondingly lower value. A misconfigured MTU causes large packets to be silently dropped while small packets (like ICMP pings) work fine, a classic and maddening symptom. Set it explicitly:

[Interface]
MTU = 1420
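
The value is simple arithmetic on the underlay MTU. A small sketch, with wg_mtu as a made-up helper name and the header sizes assumed to be 20 bytes (IPv4) or 40 bytes (IPv6) plus 8 bytes UDP and 32 bytes WireGuard:

```shell
#!/bin/bash
# Largest tunnel MTU that still fits into one underlay packet.
# $1 = underlay (link) MTU, $2 = outer IP version (4 or 6).
wg_mtu() {
    local link_mtu=$1 ip_hdr
    case $2 in
        4) ip_hdr=20 ;;
        6) ip_hdr=40 ;;
        *) echo "usage: wg_mtu <link-mtu> <4|6>" >&2; return 1 ;;
    esac
    # 8 = UDP header, 32 = WireGuard data header (16) + auth tag (16)
    echo $(( link_mtu - ip_hdr - 8 - 32 ))
}

wg_mtu 1500 4    # 1440
wg_mtu 1500 6    # 1420
wg_mtu 1492 6    # 1412, e.g. behind a PPPoE uplink
```

If peers reach each other over a mix of IPv4 and IPv6, use the IPv6 figure everywhere.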

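AllowedIPs overlap fails silently. If the same prefix appears in the AllowedIPs of two peers, WireGuard does not refuse to start: the prefix is assigned to whichever peer was configured last and quietly removed from the other, which then stops receiving that traffic. Overlapping prefixes of different lengths are allowed, and the most specific one wins. This is common in copy-paste errors when adding a new peer. wg show wg0 allowed-ips shows which peer actually owns each range.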

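Clock jumps break handshakes. Each handshake initiation carries a timestamp, and a peer rejects any initiation whose timestamp isn’t newer than the last one it accepted from that key. Peers don’t need synchronized clocks, but a node whose own clock jumps backward (typically a Pi without a battery-backed RTC that boots with a stale time) will be ignored, with no useful error message, until its clock passes the previously seen value. Run chrony or systemd-timesyncd on everything and make sure it’s actually syncing before WireGuard starts. Check with chronyc tracking or timedatectl.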

Config reloads don’t fully apply AllowedIPs changes. wg syncconf wg0 <(wg-quick strip wg0) hot-reloads peers, keys, and AllowedIPs inside WireGuard, but the kernel routes that wg-quick derives from AllowedIPs aren’t managed by WireGuard itself; they’re added by wg-quick’s scripts at bring-up only. To pick up a new AllowedIPs range, either add the route by hand (ip route add <range> dev wg0) or run wg-quick down wg0 && wg-quick up wg0. Plan your maintenance windows accordingly.

CGNAT is a special hell. If your ISP uses carrier-grade NAT (your router’s WAN interface has an address in 100.64.0.0/10, that is 100.64.x.x through 100.127.x.x, or some other non-routable address), port forwarding won’t work because you don’t control the NAT that faces the internet. Your realistic option is to relay through a node that does accept inbound connections, such as the VPS, which plays roughly the role a TURN server plays for WebRTC. Some ISPs offer native IPv6 even when using CGNAT; WireGuard works fine over IPv6, and global IPv6 addresses don’t have NAT issues, though the router’s firewall still has to allow the inbound UDP port.
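
A quick way to check the WAN address your router reports against the shared CGNAT range is sketched below. is_cgnat is a hypothetical helper and the addresses in the loop are examples only; it checks 100.64.0.0/10 and nothing else, so an RFC 1918 address on the WAN side would still need a separate look.

```shell
#!/bin/bash
# Succeed if the IPv4 address $1 lies in the shared CGNAT range 100.64.0.0/10.
is_cgnat() {
    local a b c d
    IFS=. read -r a b c d <<< "$1"
    [ "$a" -eq 100 ] && [ "$b" -ge 64 ] && [ "$b" -le 127 ]
}

for addr in 100.72.14.3 100.128.0.1 203.0.113.1; do
    if is_cgnat "$addr"; then
        echo "$addr: CGNAT range, port forwarding will not help"
    else
        echo "$addr: not in the CGNAT range"
    fi
done
```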

wg-quick’s DNS setting interferes with systemd-resolved. If you add DNS = 10.10.0.1 to the Interface block, wg-quick hands it to the resolvconf command. On systems using systemd-resolved, that only works if resolved’s resolvconf compatibility shim is installed, and the result can still be overridden. For homelab use, it’s often cleaner to manage DNS separately (via /etc/hosts, a local resolver like unbound, or systemd-resolved with per-link DNS configuration) rather than relying on wg-quick’s DNS injection.


Production-Ready Additions

Automate config generation. Once you have more than 4 nodes, writing configs by hand is error-prone. Write a simple script that reads a node list, generates all key pairs, and outputs all configs. Tools like wg-meshconf (Python, on PyPI) do exactly this and are worth the 10 minutes of setup.
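
If you’d rather see the idea before adopting a tool, here is a minimal sketch of the generation step. The NODES inventory format and the function name are assumptions made for this example, keys are left as placeholders, and PSKs and the [Interface] section are omitted to keep it short.

```shell
#!/bin/bash
# Hypothetical inventory: name, WireGuard IP, public endpoint ("-" if behind NAT).
NODES="vps 10.10.0.1 203.0.113.1:51820
nas 10.10.0.2 -
pi 10.10.0.3 -"

# Print the [Peer] blocks that node $1 needs, one per other node.
# Public keys are emitted as <name_public_key> placeholders to fill in.
peer_blocks_for() {
    local self=$1 name ip endpoint
    while read -r name ip endpoint; do
        [ "$name" = "$self" ] && continue
        echo "[Peer]"
        echo "# $name"
        echo "PublicKey = <${name}_public_key>"
        if [ "$endpoint" != "-" ]; then
            echo "Endpoint = $endpoint"
        fi
        echo "AllowedIPs = $ip/32"
        echo "PersistentKeepalive = 25"
        echo
    done <<< "$NODES"
}

peer_blocks_for nas
```

Adding a fourth node then means adding one line to the inventory and regenerating every config, instead of hand-editing four files.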

Monitoring. Set up a cron or systemd timer to run wg show and alert if any peer hasn’t had a handshake in the last 5 minutes. A dead keepalive is usually a sign of network changes (new ISP IP, router reboot) and failing silently is annoying to debug later.

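#!/bin/bash
# /usr/local/bin/wg-health-check
threshold=300  # 5 minutes in seconds
now=$(date +%s)
status=0

# Each line is: interface, peer public key, last handshake (Unix time, 0 = never)
while read -r iface pubkey timestamp; do
    if [ "$timestamp" -eq 0 ]; then
        echo "WARN: peer $pubkey on $iface has never completed a handshake"
        status=1
    elif [ $(( now - timestamp )) -gt "$threshold" ]; then
        echo "WARN: peer $pubkey on $iface last handshake $(( now - timestamp ))s ago"
        status=1
    fi
done < <(wg show all latest-handshakes)

exit "$status"  # non-zero lets cron or a systemd unit trigger an alert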

Split DNS per-peer. If your NAS exposes services on .local or a custom domain, configure your mesh-internal DNS to resolve those names only inside the tunnel. Running a lightweight resolver like dnsmasq on one well-connected node (the VPS) and pointing all peers at it via the WireGuard IP gives you clean internal naming without leaking to external DNS.

Key rotation policy. WireGuard rotates its ephemeral session keys automatically every few minutes, which is what provides forward secrecy, but the static key pairs in your configs never change on their own. For most threat models that is fine; for high-security setups you should still plan a manual rotation cadence. Document it. The process is: generate a new key pair on the node, distribute the new public key to all peers, update configs, restart WireGuard. Preshared keys are easier to rotate since they’re symmetric: one new PSK per pair, update both sides, done.

Scripted bringup order. In a mesh where some nodes are behind NAT and rely on the VPS to learn remote endpoints, bring the VPS up first. Then bring up the NAT’d peers. This isn’t strictly required for eventual convergence (keepalives will sort it out within 25 seconds of everyone starting), but if you’re scripting automated bringup — say, after a VPS reboot — sequencing it reduces the "why isn’t anything talking" confusion window.


Putting It Together

A WireGuard mesh is genuinely one of the more elegant pieces of networking infrastructure you can run yourself. The configuration format is minimal, the kernel module is fast and audited, and once it’s running it just… keeps running. No daemons to crash, no certificates to renew, no state machines to confuse.

The only part that requires real thought is routing — specifically, understanding that AllowedIPs is simultaneously a routing policy and an ingress filter, and that NAT traversal between two NAT’d peers isn’t automatic. Know your topology before you write your first config file, and the rest falls into place.

Start with three nodes. Get them talking. Then add the fourth. By the time you’re managing eight nodes, you’ll want a config generator — but the underlying mechanics won’t have changed at all.
