Fixing "bridge_ports none" NO-CARRIER VirtualBox networking on Debian 11

I recently upgraded from Debian 10 (Buster) to Debian 11 (Bullseye). I have a slightly unusual networking setup that lets VirtualBox VMs hang off of bridge interfaces that have no physical ports attached (bridge_ports none), so that NAT and firewalling are handled by my host’s iptables/nftables. Upgrading to Debian 11 caused this setup to mysteriously break. Hunting down the solution was super difficult, so this is a short post that’ll hopefully make it near the top of Google results for things like “debian 11 virtualbox bridge no carrier” so the next person doesn’t have to suffer quite as many pages of purple links as I did 🤞

tl;dr

Comment out or remove the line MACAddressPolicy=persistent in /lib/systemd/network/99-default.link to fix VirtualBox bridge_ports none bridge-based networking on Debian 11. This may have other implications. It might also get trampled by updates, so I need to look into how I can effect the same change somewhere that is meant for local configuration (probably in /etc/). That said, I haven’t noticed any harm done on my machine (yet), and it fixes a networking problem that’s important to me.

The setup

I have:

  • Debian as the host OS running nftables for NAT and packet filtering, and a handful of bridge interfaces for VirtualBox VMs
    • Docker running Docker containers
      • Getting Docker containers to co-exist nicely with my custom nftables rules was, and continues to be, a challenge, but that’s another story
    • VirtualBox running VMs connected to the host’s bridge interfaces

What do these bridge interfaces look like?

My host sees bridge interfaces such as:

% cat /etc/network/interfaces.d/vb0
auto vb0
iface vb0 inet static
  bridge_ports none
  address [SNIP]
  broadcast [SNIP].255
  netmask 255.255.255.0
  bridge_stp off
  bridge_waitport 0
  bridge_fd 0

% ip a sh dev vb0
5: vb0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether [SNIP] brd ff:ff:ff:ff:ff:ff
    inet [SNIP]/24 brd [SNIP].255 scope global vb0
       valid_lft forever preferred_lft forever
    inet6 fe80::[SNIP]/64 scope link
       valid_lft forever preferred_lft forever

Then, in the settings of a VirtualBox VM, I’m able to do a “bridged” (in the VirtualBox sense) connection to this “bridge” (in the Linux sense) interface. At that point my nftables NAT/masquerade rules take over and forward packets from the VirtualBox VM out to my LAN, to the WAN, to hosts attached to other bridge interfaces, and to containers hanging off of the docker0 interface. All of this is subject to the rules I specify in the nftables FORWARD chain.
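
For a rough idea of what such rules can look like, here is a minimal sketch. It is not my actual ruleset: the interface name eth0, the subnet 192.0.2.0/24 (standing in for the snipped vb0 subnet), the table names vbnat/vbfilter and the helper name write_vb_ruleset are all made up for illustration. It only writes the rules to a file so they can be inspected or syntax-checked first.

```shell
#!/bin/sh
# Hypothetical names throughout: eth0 (LAN-facing interface), 192.0.2.0/24
# (placeholder for the vb0 subnet), table names vbnat/vbfilter.
write_vb_ruleset() {
    out_file=$1
    cat > "$out_file" <<'EOF'
table ip vbnat {
    chain postrouting {
        type nat hook postrouting priority 100; policy accept;
        # Rewrite the source address of VM traffic leaving via the LAN interface
        ip saddr 192.0.2.0/24 oifname "eth0" masquerade
    }
}

table inet vbfilter {
    chain forward {
        type filter hook forward priority 0; policy drop;
        # Let replies to already-permitted connections back in
        ct state established,related accept
        # Allow VMs on vb0 to open connections out to the LAN/WAN
        iifname "vb0" oifname "eth0" accept
    }
}
EOF
}

# Usage (check syntax without loading anything):
#   write_vb_ruleset /tmp/vb.nft && sudo nft -c -f /tmp/vb.nft
```

The host also needs IP forwarding enabled (net.ipv4.ip_forward=1) for any of this to route. Be careful with the drop policy in the forward chain: loaded as-is on a host that also runs Docker, it would likely interfere with container traffic, which is exactly the kind of co-existence problem mentioned earlier.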

I can’t remember where I first read about this type of setup, but for a long time it has served me well. For more details see:

There are undoubtedly other discussions of this setup on the web, but that’s not the point of this post. If you’re reading this post you might be having the same problem I was, in which case…

The problem

After doing a clean install of Debian 11 over the top of my working Debian 10 install, things went wrong. I had restored /etc/network/interfaces.d/* to configure my trusty old bridge interfaces, but VirtualBox VMs weren’t getting DHCP leases. If I configured them with a static IP address and the expected default gateway, tcpdump on my host showed packets arriving on the bridge interface, but they weren’t being answered or forwarded anywhere.

I noticed the interface looked a bit wonky:

% ip a sh dev vb0
125: vb0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default qlen 1000
    link/ether [SNIP] brd ff:ff:ff:ff:ff:ff
    inet [SNIP]/24 brd [SNIP].255 scope global vb0
       valid_lft forever preferred_lft forever

It has:

  • A state of DOWN instead of UNKNOWN
  • A NO-CARRIER flag where LOWER_UP used to be
  • No link-local IPv6 address
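
If you want to check for the same symptom without squinting at the flags, something like the following sketch works on the output shown above. The function name carrier_status is made up; it just reads `ip a sh dev <iface>` output on stdin and looks at the flags between the angle brackets on the first line.

```shell
#!/bin/sh
# Hypothetical helper: classify an interface from `ip a sh dev <iface>` output.
carrier_status() {
    # Pull the comma-separated flags out of the <...> on the first line
    flags=$(sed -n '1s/.*<\(.*\)>.*/\1/p')
    case ",$flags," in
        *,NO-CARRIER,*) echo "no-carrier" ;;   # the broken case from this post
        *,LOWER_UP,*)   echo "carrier" ;;      # the healthy case
        *)              echo "unknown" ;;
    esac
}

# Usage:
#   ip a sh dev vb0 | carrier_status
```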

At this point, there was much gnashing of teeth and Googling and reading and experimenting and backing things out. Nothing was working. I was heartened to read https://blog.germancoding.com/2021/08/28/dedicated-ip-addresses-and-virtual-machines/ which perfectly described the problem I was having (so it wasn’t just me or my machine) but the solution it proposes, to use tuntap interfaces instead of bridge_ports none bridge interfaces, didn’t fix the problem for me.

The solution

I stumbled upon https://github.com/systemd/systemd/issues/9252 “systemd-networkd creates bridges with no-carrier #9252” far too deep in the Google results to be pleasing, and a particular reply said:

I was able to track this down to a single udev file, 99-default.link:

root@host:/lib/systemd/network# cat 99-default.link.bak 
#  SPDX-License-Identifier: LGPL-2.1+
#
#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it
#  under the terms of the GNU Lesser General Public License as published by
#  the Free Software Foundation; either version 2.1 of the License, or
#  (at your option) any later version.

[Match]
OriginalName=*

[Link]
NamePolicy=keep kernel database onboard slot path
MACAddressPolicy=persistent

Linux 5.10, systemd version 244, commit: 3ceaa81c61b654ebf562464d142675bd4d57d7b6, Yocto Dunfell, custom distro
Patches applied are listed here: http://cgit.openembedded.org/openembedded-core/tree/meta/recipes-core/systemd/systemd_244.5.bb?h=dunfell#n17
Their content can be found here: http://cgit.openembedded.org/openembedded-core/tree/meta/recipes-core/systemd/systemd?h=dunfell

After further debugging, it’s specifically the MACAddressPolicy=persistent line that causes the issue for me.

Similarly, adding Type=!bridge in [Match] made it work. The only issue is, I do not have a persistent MAC address for my device.

Lo and behold, commenting out the line MACAddressPolicy=persistent in /lib/systemd/network/99-default.link followed by doing a sudo systemctl restart networking.service fixed the problem for me 🎉
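
If you’d rather script the edit than open an editor, a sketch along these lines should do it. The function name comment_out_mac_policy is made up, and it keeps a .bak copy of the original file next to it, much like the 99-default.link.bak in the quoted reply.

```shell
#!/bin/sh
# Hypothetical helper: comment out MACAddressPolicy=persistent in a .link file.
comment_out_mac_policy() {
    link_file=$1
    # -i.bak edits in place and saves the original as "$link_file.bak";
    # "&" re-inserts the matched line after the added "#"
    sed -i.bak 's/^MACAddressPolicy=persistent$/#&/' "$link_file"
}

# Usage (as root), followed by the restart described above:
#   comment_out_mac_policy /lib/systemd/network/99-default.link
#   systemctl restart networking.service
```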

As I said in the tl;dr, doing this may have other implications, and it might get trampled by updates. It’s on my list to look into the “right” way to negate this line in the default systemd config. I’m not as up on systemd as I’d like to be, but hacking on files in /lib/ feels wrong, and I’d like to think there’s something I could do in /etc/ to a file that isn’t managed by apt to do the same thing.
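
For what it’s worth, here is an untested sketch of what that might look like. As I understand the systemd.link documentation, .link files in /etc/systemd/network take precedence over /lib/systemd/network, and only the first matching .link file (in lexical order) is applied to a device. So a file that sorts before 99-default.link and matches only bridges could switch off the MAC address policy for them, in the same spirit as the Type=!bridge trick from the quoted reply. The file name 90-bridge-mac.link and the function name are made up, and I haven’t verified this on my own machine.

```shell
#!/bin/sh
# Hypothetical helper: write a local .link override for bridge devices.
# The directory defaults to /etc/systemd/network; pass another one to try it out.
write_bridge_link_override() {
    dir=${1:-/etc/systemd/network}
    mkdir -p "$dir" &&
    cat > "$dir/90-bridge-mac.link" <<'EOF'
# Assumption: sorts before 99-default.link, so it wins for bridges
[Match]
Type=bridge

[Link]
MACAddressPolicy=none
EOF
}

# Usage (as root), then recreate the bridges:
#   write_bridge_link_override
#   systemctl restart networking.service
```

One side effect to be aware of: because only the first matching file applies, bridges would no longer pick up the NamePolicy from 99-default.link. That should be harmless for bridges that are named explicitly in /etc/network/interfaces.d/, but it is an assumption worth checking.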

It’s been a week or so, and I haven’t noticed any harm done yet. If I’ve made some kind of horrible mistake, hit me up on Twitter or Mastodon and kindly let me know. Cheers!