VLAN for virtual machines

From Ingo's Wiki
Revision as of 27 September 2017, 15:13 by Ingo (describe bridge with hook scripts)

Introduction

I wanted to update the VLAN connections for my virtual machines to newer technologies and asked a question on unix.stackexchange, but I did not get any answer. It seems there is very little knowledge about this out there, so I decided to work on it myself and document it here.

In general I will look at four methods:

  1. oldstyle linux bridge as hub
  2. linux bridge as hub
  3. linux bridge with libvirt hook scripts
  4. Open vSwitch

Preparation

I have Debian GNU/Linux 9.1 (stretch) on the host and on the virtual machines for testing. The setup is described here: Setup KVM with console. I'm sitting at harley, the host and my everyday workstation. Now I start the virtual machine, log in and show its interface settings:

harley$ virsh start --console deb9-test

login

deb9-test$ cat /etc/systemd/network/08-vlan10.netdev
[NetDev]
Name=vlan10
Kind=vlan
[VLAN]
Id=10
deb9-test$ cat /etc/systemd/network/12-vlan10_attach-to-if.network
[Match]
Name=ens2
[Network]
VLAN=vlan10
deb9-test$ cat /etc/systemd/network/16-vlan10_up.network
[Match]
Name=vlan10
[Network]
DHCP=ipv4
IPv6AcceptRA=no
LinkLocalAddressing=no
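
The three files follow a fixed pattern, so they could be generated for any VLAN ID. A minimal sketch, assuming a hypothetical helper named write_vlan_units and the same file naming as above; the target directory is an argument so that the function can first be tried on a scratch directory:

```shell
# write_vlan_units DIR IFACE VLANID
# Hypothetical helper: writes the three systemd-networkd files shown above
# for the parent interface IFACE and the VLAN ID VLANID into DIR.
write_vlan_units() {
    local dir=$1 iface=$2 id=$3
    mkdir -p "$dir" || return 1
    # 1. create the VLAN netdev
    printf '[NetDev]\nName=vlan%s\nKind=vlan\n[VLAN]\nId=%s\n' "$id" "$id" \
        > "$dir/08-vlan${id}.netdev"
    # 2. attach it to the parent interface
    printf '[Match]\nName=%s\n[Network]\nVLAN=vlan%s\n' "$iface" "$id" \
        > "$dir/12-vlan${id}_attach-to-if.network"
    # 3. bring the VLAN interface up with DHCPv4 only
    printf '[Match]\nName=vlan%s\n[Network]\nDHCP=ipv4\nIPv6AcceptRA=no\nLinkLocalAddressing=no\n' "$id" \
        > "$dir/16-vlan${id}_up.network"
}
```

With write_vlan_units /tmp/net ens2 10 the directory /tmp/net should contain the same three files as listed above.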

To test whether the virtual machine has a connection I use:

deb9-test$ journalctl -b --no-hostname -u systemd-networkd.service
-- Logs begin at Fri 2017-09-15 17:09:51 CEST, end at Sat 2017-09-23 20:34:20 CEST. --
Sep 23 20:34:05 systemd-networkd[204]: Enumeration completed
Sep 23 20:34:05 systemd[1]: Started Network Service.
Sep 23 20:34:05 systemd-networkd[204]: vlan10: netdev ready
Sep 23 20:34:05 systemd-networkd[204]: ens2: IPv6 enabled for interface: Success
Sep 23 20:34:05 systemd-networkd[204]: ens2: Gained carrier
Sep 23 20:34:05 systemd-networkd[204]: vlan10: Gained carrier
Sep 23 20:34:06 systemd-networkd[204]: ens2: Gained IPv6LL
Sep 23 20:34:06 systemd-networkd[204]: vlan10: Gained IPv6LL
Sep 23 20:34:09 systemd-networkd[204]: vlan10: DHCPv4 address 192.168.10.89/24 via 192.168.10.1
Sep 23 20:34:09 systemd-networkd[204]: vlan10: Configured
Sep 23 20:34:19 systemd-networkd[204]: ens2: Configured
deb9-test$

4 sec after Started Network Service the guest gets an IP address, and 14 sec after the start interface ens2 is Configured. If ens2 is Configured and the guest hasn't got an IP address, the connection failed. It looks like this:

deb9-test$ journalctl -b --no-hostname -u systemd-networkd.service
-- Logs begin at Fri 2017-09-15 17:09:51 CEST, end at Sat 2017-09-23 20:45:13 CEST. --
Sep 23 20:44:59 systemd-networkd[197]: Enumeration completed
Sep 23 20:44:59 systemd[1]: Started Network Service.
Sep 23 20:44:59 systemd-networkd[197]: vlan10: netdev ready
Sep 23 20:44:59 systemd-networkd[197]: ens2: IPv6 enabled for interface: Success
Sep 23 20:44:59 systemd-networkd[197]: ens2: Gained carrier
Sep 23 20:44:59 systemd-networkd[197]: vlan10: Gained carrier
Sep 23 20:45:00 systemd-networkd[197]: ens2: Gained IPv6LL
Sep 23 20:45:00 systemd-networkd[197]: vlan10: Gained IPv6LL
Sep 23 20:45:13 systemd-networkd[197]: ens2: Configured
deb9-test$
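
Because the guest is restarted many times, the check for the DHCPv4 line can be scripted instead of read by eye. A minimal sketch, assuming a hypothetical function name dhcp_lease and the log format shown in the two listings above:

```shell
# dhcp_lease IFACE
# Reads systemd-networkd log lines on stdin and prints the last DHCPv4
# address logged for IFACE. Returns 1 if no such line was found.
dhcp_lease() {
    awk -v pat="$1: DHCPv4 address " '
        {
            i = index($0, pat)              # plain substring search, no regex
            if (i > 0) {
                split(substr($0, i + length(pat)), f, " ")
                addr = f[1]                 # first word after the pattern
            }
        }
        END {
            if (addr == "") exit 1
            print addr
        }
    '
}
```

On the guest it could be used as journalctl -b --no-hostname -u systemd-networkd.service | dhcp_lease vlan10 || echo "no lease".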

Because I have to start the test virtual machine many times I set up autologin. That's no problem because there is nothing on the guest.

deb9-test$ grep ^ExecStart= /lib/systemd/system/serial-getty@.service
ExecStart=-/sbin/agetty --keep-baud 115200,38400,9600 %I $TERM

modify to

ExecStart=-/sbin/agetty --autologin yourloginname --keep-baud 115200,38400,9600 %I $TERM
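
Editing the unit file under /lib directly works, but a package upgrade may overwrite it. A systemd drop-in is an alternative that survives upgrades. A minimal sketch, assuming a hypothetical helper write_autologin_dropin and the file name autologin.conf (both are my choice, not part of the original setup):

```shell
# write_autologin_dropin DIR USER
# Hypothetical helper: writes DIR/autologin.conf, a drop-in that replaces
# the ExecStart line of serial-getty@.service with the autologin variant.
write_autologin_dropin() {
    local dir=$1 user=$2
    mkdir -p "$dir" || return 1
    {
        printf '[Service]\n'
        # An empty ExecStart= clears the command inherited from the unit file.
        printf 'ExecStart=\n'
        # %%I prints a literal %I; $TERM stays literal inside single quotes.
        printf 'ExecStart=-/sbin/agetty --autologin %s --keep-baud 115200,38400,9600 %%I $TERM\n' "$user"
    } > "$dir/autologin.conf"
}
```

For real use DIR would be /etc/systemd/system/serial-getty@.service.d, written as root and followed by sudo systemctl daemon-reload.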

To list all settings of the bridge you can use:

harley$ find /sys/class/net/br0/bridge/ -type f -readable -printf '%f = ' -exec cat {} \; | sort

oldstyle linux bridge as hub

This always works with the old Linux bridge, which does not know anything about VLANs. The trick is to make it completely transparent for all connected interfaces, like a hub. You can do this by setting the ageing time to 0. But be aware that the bridge will then forward every packet to all interfaces simultaneously.

Disable systemd-networkd and start networking with ifupdown:

harley$ sudo systemctl stop systemd-networkd
Warning: Stopping systemd-networkd.service, but it can still be activated by:
  systemd-networkd.socket
harley$ sudo systemctl disable systemd-networkd
Removed /etc/systemd/system/multi-user.target.wants/systemd-networkd.service.
Removed /etc/systemd/system/sockets.target.wants/systemd-networkd.socket.
harley$ sudo ip link set dev br0 down && sudo ip link del dev br0
harley$ sudo systemctl enable networking.service
Synchronizing state of networking.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable networking
harley$ sudo systemctl start networking.service
harley$

Setup the bridge and start it:

harley$ cat /etc/network/interfaces
auto br0
iface br0 inet manual
    bridge_ports enp1s0
    bridge_ageing 0
    bridge_stp off
harley$ sudo ifup br0
Waiting for br0 to get ready (MAXWAIT is 32 seconds).
harley$

It's all in place now:

harley$ cat /sys/class/net/br0/bridge/ageing_time 
0
harley$ cat /sys/class/net/br0/bridge/stp_state 
0
harley$ cat /sys/class/net/br0/bridge/vlan_filtering 
0

Yes, there is no VLAN filtering. This means VLAN handling on the bridge is disabled, but the guest still sees the VLAN-tagged packets.


linux bridge as hub

Now I try to set up the same bridge as in "oldstyle linux bridge as hub", but just with systemd-networkd.

Disable networking with ifupdown and start systemd-networkd:

harley$ sudo systemctl stop networking.service
harley$ sudo systemctl disable networking.service
Synchronizing state of networking.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install disable networking
harley$ sudo ip link set dev br0 down && sudo ip link del dev br0
harley$ sudo systemctl enable systemd-networkd
Created symlink /etc/systemd/system/multi-user.target.wants/systemd-networkd.service → /lib/systemd/system/systemd-networkd.service.
Created symlink /etc/systemd/system/sockets.target.wants/systemd-networkd.socket → /lib/systemd/system/systemd-networkd.socket.
harley$ sudo systemctl start systemd-networkd
harley$

Setup the bridge and start it:

harley$ cat /etc/systemd/network/08-br0.netdev
[NetDev]
Name=br0
Kind=bridge
[Bridge]
AgeingTimeSec=0
STP=false
harley$ cat /etc/systemd/network/12-br0_add-enp1s0.network
[Match]
Name=enp1s0
[Network]
Bridge=br0
harley$ cat /etc/systemd/network/16-br0_up.network 
[Match]
Name=br0
harley$ sudo ip link set dev br0 down && sudo ip link del dev br0
harley$ sudo systemctl restart systemd-networkd
harley$

AgeingTimeSec=0 is not accepted, but it should be:

harley$ cat /sys/class/net/br0/bridge/ageing_time 
30000   (means 300 sec)
harley$

But I've found a workaround. Using a number between .01 and .000001 (note the leading dots) will set ageing_time to 0. So set AgeingTimeSec=.000001 in /etc/systemd/network/08-br0.netdev. I suppose it's a bug. Then we get:

harley$ cat /sys/class/net/br0/bridge/ageing_time 
0
harley$ cat /sys/class/net/br0/bridge/stp_state 
0
harley$ cat /sys/class/net/br0/bridge/vlan_filtering 
0
harley$

The guest now gets an IP address on boot and is connected to VLAN 10.

Discussion

This works because of three conditions.

  1. ageing time is 0: the ageing time specifies the number of seconds a MAC address is kept in the forwarding database (FDB) after a packet has been received from this MAC address. Setting it to 0 means no MAC address is ever kept in the FDB.
  2. unicast flood on interfaces is on: this controls whether the bridge floods traffic through this port when the destination is unknown because an FDB entry is missing. Defaults to on.
  3. spanning tree protocol (stp) is disabled: we don't have a forward_delay at startup for the learning phase of spanning tree.
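
The three conditions can be checked in one go from sysfs. A minimal sketch, assuming a hypothetical function name bridge_is_hub and assuming that the per-port flood flag is exposed as brif/PORT/unicast_flood below the bridge directory; the sysfs root is a parameter only so that the function can be tried against a copy of the tree:

```shell
# bridge_is_hub BRIDGE [SYSROOT]
# Prints one line per condition and returns 0 only if all of them hold.
# SYSROOT defaults to /sys/class/net.
bridge_is_hub() {
    local br=$1 root=${2:-/sys/class/net} rc=0 port
    # condition 1: ageing time is 0
    if [ "$(cat "$root/$br/bridge/ageing_time")" = 0 ]; then
        echo "ageing_time: 0 (ok)"
    else
        echo "ageing_time: not 0"; rc=1
    fi
    # condition 3: spanning tree is disabled
    if [ "$(cat "$root/$br/bridge/stp_state")" = 0 ]; then
        echo "stp_state: 0 (ok)"
    else
        echo "stp_state: not 0"; rc=1
    fi
    # condition 2: unicast flood is on for every port of the bridge
    for port in "$root/$br"/brif/*; do
        [ -e "$port/unicast_flood" ] || continue
        if [ "$(cat "$port/unicast_flood")" = 1 ]; then
            echo "${port##*/}: flood on (ok)"
        else
            echo "${port##*/}: flood off"; rc=1
        fi
    done
    return "$rc"
}
```

On the host a plain bridge_is_hub br0 should be enough.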

I have a running and connected virtual machine:

harley$ sudo bridge vlan show
port    vlan ids
enp1s0   1 PVID Egress Untagged
br0      1 PVID Egress Untagged
vnet0    1 PVID Egress Untagged
harley$ cat /sys/class/net/br0/bridge/ageing_time
0
harley$ cat /sys/class/net/br0/bridge/forward_delay
1500
harley$ cat /sys/class/net/br0/bridge/stp_state
0

Indeed we have forward_delay 1500 (means 15 sec) but it doesn't matter: stp_state is 0 (disabled), so there is no spanning tree. Flood (means unicast flood) is on, as I can see:

harley$ sudo bridge -d link show
3: enp1s0 state UP : <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br0 state forwarding priority 32 cost 4
   hairpin off guard off root_block off fastleave off learning on flood on mcast_flood on 
95: vnet0 state UNKNOWN : <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br0 state forwarding priority 32 cost 100
   hairpin off guard off root_block off fastleave off learning on flood on mcast_flood on
harley$


Let's have a look at flooding on the interfaces. I disable it on the physical interface enp1s0 of the bridge and reboot the guest:

harley$ sudo bridge link set dev enp1s0 flood off
harley$

The guest gets an IP address from the DHCP server but then can't ping its gateway. The DHCP request is a broadcast and goes out through enp1s0. The DHCP answer comes back through enp1s0 and is flooded to every other interface that has flood on (here only vnet0). The ping is unicast and is not forwarded out on enp1s0, because flooding is off there. If I instead set flood on for enp1s0 and flood off for vnet0 and run deb9-test$ sudo systemctl restart systemd-networkd, I get no IP address from the DHCP server and can't ping the interface. The incoming DHCP answer is not a broadcast, so the bridge does not flood it to vnet0 and it never reaches the guest.
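
When switching flood on and off like this it is easy to lose track of the current state. A minimal sketch, assuming a hypothetical function name flood_state and the output format of bridge -d link show as listed above; it matches only the field named exactly flood, so mcast_flood is ignored:

```shell
# flood_state
# Reads `bridge -d link show` output on stdin and prints "PORT on" or
# "PORT off" for the unicast flood flag of every port.
flood_state() {
    awk '
        # header line of a port, e.g. "3: enp1s0 state UP : <...>"
        /^[0-9]+: / { port = $2; sub(/[@:].*$/, "", port) }
        {
            # look for the field "flood" and print the value that follows it
            for (i = 1; i < NF; i++)
                if ($i == "flood") print port, $(i + 1)
        }
    '
}
```

Usage on the host: sudo bridge -d link show | flood_state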

By the way, this method performs badly, as we can see with bridge monitor. MAC addresses are inserted into the FDB only to be deleted immediately, all for nothing.

harley$ sudo bridge monitor fdb
52:54:00:01:76:20 dev enp1s0 master br0 
52:54:00:b0:ca:63 dev vnet0 master br0 
f4:f2:6d:2c:87:f7 dev enp1s0 master br0 
00:80:3f:2a:31:1a dev enp1s0 master br0 
Deleted 52:54:00:01:76:20 dev enp1s0 master br0 stale
Deleted 52:54:00:b0:ca:63 dev vnet0 master br0 stale
Deleted 00:80:3f:2a:31:1a dev enp1s0 master br0 stale
Deleted f4:f2:6d:2c:87:f7 dev enp1s0 master br0 stale
...


linux bridge with libvirt hook scripts

We set up a bridge with VLAN filtering enabled:

harley$ cat 08-br0.netdev 
[NetDev]
Name=br0
Kind=bridge
[Bridge]
DefaultPVID=none
VLANFiltering=true
STP=false
harley$ cat 12-br0_add-enp1s0.network 
[Match]
Name=enp1s0
[Network]
Bridge=br0
[BridgeVLAN]
VLAN=10
[BridgeVLAN]
VLAN=20
[BridgeVLAN]
VLAN=30
harley$ cat 16-br0_up.network 
[Match]
Name=br0

With this I get:

harley$ sudo bridge vlan show
port    vlan ids
enp1s0   1 PVID Egress Untagged
         10
         20
         30  
br0      1 PVID Egress Untagged
harley$

But what is this? We have the default VLAN 1 PVID Egress Untagged. I don't want this. It seems that setting DefaultPVID=none in 08-br0.netdev doesn't work, so I've made a Workaround for setting DefaultPVID=none. Looking at this behavior I found that default_pvid can only be set in the kernel while vlan_filtering = 0. By hand I have to do:

harley$ sudo bash -c 'echo 0 >/sys/class/net/br0/bridge/vlan_filtering'
harley$ sudo bash -c 'echo 0 >/sys/class/net/br0/bridge/default_pvid'
harley$ sudo bash -c 'echo 1 >/sys/class/net/br0/bridge/vlan_filtering'
harley$
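
The three writes only make sense together and in this order, so they can be wrapped in one function. A minimal sketch, assuming a hypothetical function name clear_default_pvid; it only repeats the manual steps above, and the sysfs root is a parameter so that the function can be tried on a scratch directory:

```shell
# clear_default_pvid BRIDGE [SYSROOT]
# Repeats the three manual steps above; stops at the first failing write.
# SYSROOT defaults to /sys/class/net.
clear_default_pvid() {
    local dir=${2:-/sys/class/net}/$1/bridge
    # default_pvid can only be changed while VLAN filtering is off
    echo 0 > "$dir/vlan_filtering" &&
    # 0 means: no default PVID
    echo 0 > "$dir/default_pvid" &&
    # switch VLAN filtering back on
    echo 1 > "$dir/vlan_filtering"
}
```

It has to run as root, for example with sudo bash -c after defining the function in that shell.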

If I now start a guest I get:

harley$ virsh start deb9-test
harley$ sudo bridge vlan show
port    vlan ids
enp1s0   10
         20
         30
br0     None
vnet0   None
harley$
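
To check the VLAN IDs of a single port, for example from a test script, the table can be filtered. A minimal sketch, assuming a hypothetical function name port_vlans and the table layout of bridge vlan show as printed above:

```shell
# port_vlans PORT
# Reads `bridge vlan show` output on stdin and prints the VLAN IDs
# configured on PORT, one per line. Prints nothing for "None".
port_vlans() {
    awk -v want="$1" '
        NR == 1   { next }                 # header line "port vlan ids"
        /^[^ \t]/ { cur = $1; vid = $2 }   # line that starts a new port
        /^[ \t]/  { vid = $1 }             # continuation line of that port
        NF && cur == want && vid ~ /^[0-9]+$/ { print vid }
    '
}
```

Usage on the host: sudo bridge vlan show | port_vlans vnet0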

The virtual network interface vnet0 for deb9-test has no VLAN ID. Libvirt does not know anything about this, so we have to tell it. Libvirt provides hook scripts that we can use for this. We have to:

  1. define VLAN-ID the virtual machine belongs to
  2. get information on startup from the runtime XML-config of the domain
  3. set VLAN-ID to the dynamic virtual network interface vnet*

For debugging the hook-scripts I've made a small script:

harley$ cat debug.sh
#!/bin/bash -e
# https://www.libvirt.org/hooks.html
# If you make a new hook script then 'sudo systemctl restart libvirtd'.
# For debug set symlink to hook-script daemon, qemu, lxc, libxl and/or network,
# e.g. 'sudo ln -s debug.sh qemu' and restart libvirtd.

logfile='/var/log/libvirt/hooks.log'

echo "$0" >>$logfile
date -Iseconds >>$logfile
echo "\$1=$1, \$2=$2, \$3=$3, \$4=$4" >>$logfile
cat - >>$logfile
echo -e "\n---------------------------------------------" >>$logfile
harley$

define VLAN-ID the virtual machine belongs to

For this we have an extra element <metadata> in the Domain XML format for custom metadata. We can simply add the information to the static configuration with harley$ virsh edit deb9-test like this (look only at the <metadata> element):

harley$ virsh dumpxml deb9-test | head -n9
<domain type='kvm' id='1'>
  <name>deb9-test</name>
  <uuid>70d56a28-795d-4010-9403-513a4bd6b66a</uuid>
  <metadata>
    <my:home xmlns:my="http://hoeft-online.de/my/">
      <my:vlan>10</my:vlan>
    </my:home>
  </metadata>
  <memory unit='KiB'>1048576</memory>

get information on startup from the runtime XML-config of the domain

It seems a little bit difficult to get the needed information out of the big XML-config, but it's no problem with XSLT. I've made an XSL stylesheet for this and use xmlstarlet. For development I took a snapshot of the runtime XML-config using debug.sh and turned it by hand into a well-formed XML document for hook parameter $2=start. This is the result:

harley$ cat qemu.xsl 
<?xml version="1.0" encoding="UTF-8"?>
<!-- This stylesheet extracts the VLAN-Id and the target device of the
     bridge from the domain-xml given to the libvirt hook-script "qemu".
     Example output: <meta><vlan>10</vlan><dev>vnet0</dev></meta>
-->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
     xmlns:my="http://hoeft-online.de/my/" exclude-result-prefixes="my">
  <xsl:output omit-xml-declaration="yes" indent="no"
       encoding="utf-8" media-type="text/xml"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="text()|@*"/>

  <xsl:template match="/domain">
    <meta>
      <xsl:apply-templates/>
    </meta>
  </xsl:template>

  <xsl:template match="metadata/my:home/my:vlan">
    <vlan>
      <xsl:value-of select="."/>
    </vlan>
  </xsl:template>

  <xsl:template match='interface[@type="bridge"]/target'>
    <dev>
      <xsl:value-of select="@dev"/>
    </dev>
  </xsl:template>

<!-- vim: set sts=2 sw=2: -->
</xsl:stylesheet>
harley$
harley$ xmlstarlet tr qemu.xsl /var/log/libvirt/hooks.xml
<meta><vlan>10</vlan><dev>vnet0</dev></meta>harley$

set VLAN-ID to the dynamic virtual network interface vnet*

Putting it all together here is the hook-script:

harley$ cat /etc/libvirt/hooks/qemu
#!/bin/bash -e
#/etc/libvirt/hooks/qemu
# Docs: https://www.libvirt.org/hooks.html
# If you make a new hook script then 'sudo systemctl restart libvirtd'.

# On startup of the domain (guest) this script:
# 1. Gets the metadata VLAN-Id of the guest and the target device of the
#    bridge from the domain-xml available on standard input. It is the
#    runtime version from 'virsh dumpxml domainname'. For extracting the
#    information we use an XSL-stylesheet. Example input into $META:
#    <meta><vlan>10</vlan><dev>vnet0</dev></meta>
# 2. Selects $DEV  from $META
# 3. Selects $VLAN from $META
# 4. Sets $VLAN on $DEV on the bridge

case "$2" in
  prepare)
    ;;
  start)
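    META=$(/usr/bin/xmlstarlet tr /etc/libvirt/hooks/qemu.xsl -)
    # 'xmlstarlet sel' may exit non-zero when nothing matches. '|| true' keeps
    # 'bash -e' from aborting the hook for guests without VLAN metadata.
    DEV=$(echo "$META" | /usr/bin/xmlstarlet sel -t -v '/meta/dev' || true)
    VLAN=$(echo "$META" | /usr/bin/xmlstarlet sel -t -v '/meta/vlan' || true)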
    if [[ -n $DEV && -n $VLAN ]]; then
      /sbin/bridge vlan add vid "$VLAN" dev "$DEV"
    fi
    ;;
  started)
    ;;
  stopped)
    ;;
  release)
    ;;
  migrate)
    ;;
  restore)
    ;;
  reconnect)
    ;;
  attach)
    ;;
  *)
    echo "qemu hook called with unexpected options $*" >&2
    exit 1
    ;;
esac
harley$
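
The hook runs as root and passes free text from the domain metadata to bridge, so it may be worth validating both values first. A minimal sketch, assuming a hypothetical function name valid_vlan_dev and assuming that the tap devices are always named vnet followed by digits (a custom target name in the domain XML would be rejected):

```shell
# valid_vlan_dev VLAN DEV
# Returns 0 only if VLAN is an integer from 1 to 4094 and DEV looks like
# a libvirt tap device name (vnet followed by digits). Assumption: no
# custom target device names are used in the domain XML.
valid_vlan_dev() {
    # VLAN: digits only, at most 4 of them, in the range 1..4094
    case $1 in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "${#1}" -le 4 ] || return 1
    [ "$1" -ge 1 ] && [ "$1" -le 4094 ] || return 1
    # DEV: "vnet" plus at least one digit and nothing else
    case $2 in
        vnet|vnet*[!0-9]*) return 1 ;;
        vnet*) return 0 ;;
        *) return 1 ;;
    esac
}
```

In the hook script the test could then read: if valid_vlan_dev "$VLAN" "$DEV"; then /sbin/bridge vlan add vid "$VLAN" dev "$DEV"; fi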

Workaround for setting DefaultPVID=none
