Recently I had a routing issue. To provide a little context, I host my own server in a datacenter at my local associative ISP. It works really well, and I'm happy with it. Even more rad, the whole infrastructure is based on IPv6 and my ISP gives me a /48 block for free, great! To give a little more context, the server is a hypervisor with, obviously, the host and VMs. VMs don't access the Internet directly but have to pass through a special VM which acts as a router. The following image pictures it well (I hope it does):

Everything was working nicely: everybody could reach the Internet and things went smoothly. Until that day. You know, the day when you realize that everything was just a masquerade and the world is just a smoke screen. Anyway, from a VM, I tried to reach the host and… all packets were dropped. A quick inspection to troubleshoot and understand what's going on:

  • Host is on DC's /48;
  • VM is on my /48 (shared and routed by ISP);
  • VM router has one IP in each /48 (DC & mine).

So the troubleshooting starts, with a netcat port probe to mimic the behavior I want:

duponin@host $ nc -vzw5 vm.locahlo.st 80
nc: connect to vm.locahlo.st port 80 (tcp) failed: Connection timed out

duponin@vm $ nc -vzw5 host.locahlo.st 80
nc: connect to host.locahlo.st port 80 (tcp) failed: Connection timed out
Netcat port probe, timing out after 5 seconds.

A basic TCP connection can't be established between the two machines, so obviously my web browser can't connect either (I used a SOCKS5 proxy). Let's go deeper!

duponin@host $ ping -6c1 vm.locahlo.st
PING vm.locahlo.st(...) 56 data bytes
64 bytes from ... (...): icmp_seq=1 ttl=63 time=0.752 ms

--- vm.locahlo.st ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.752/0.752/0.752/0.000 ms



duponin@vm $ ping -6c1 host.locahlo.st
PING host.locahlo.st(...) 56 data bytes
64 bytes from ... (...): Destination Host Unreachable

--- host.locahlo.st ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
Ping

Really curious: the ping only works in one direction, from the host to the VM. There is definitely a problem, and a low-level one. You know what? Let's go even deeper! Since traffic must go through the VM router, we are going to inspect which packets are transiting:

# Router's interface connected to the switch
[duponin@router:~]$ sudo tcpdump -i ens3 host vm.locahlo.st and host host.locahlo.st and icmp6
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens3, link-type EN10MB (Ethernet), capture size 262144 bytes
# ping host → vm
17:43:35.476356 IP6 host.locahlo.st > vm.locahlo.st: ICMP6, echo request, seq 1, length 64
17:43:35.476624 IP6 vm.locahlo.st > host.locahlo.st: ICMP6, echo reply, seq 1, length 64
# It works!

# ping vm → host
17:43:53.575408 IP6 vm.locahlo.st > host.locahlo.st: ICMP6, echo request, seq 1, length 64
# It doesn't work


# Router's interface connected to the VM
[duponin@router:~]$ sudo tcpdump -i ens4 host vm.locahlo.st and host host.locahlo.st and icmp6
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens4, link-type EN10MB (Ethernet), capture size 262144 bytes
# ping host → vm
17:43:35.476395 IP6 host.locahlo.st > vm.locahlo.st: ICMP6, echo request, seq 1, length 64
17:43:35.476616 IP6 vm.locahlo.st > host.locahlo.st: ICMP6, echo reply, seq 1, length 64
# It works!

# ping vm → host
17:42:33.404911 IP6 vm.locahlo.st > host.locahlo.st: ICMP6, echo request, seq 1, length 64
# It doesn't work
TCPDump analysis on the VM router

Interesting, isn't it? You don't understand? Well ok, I admit it's not really obvious if you aren't familiar with ICMP. What these two tcpdumps show (you can think of tcpdump as a CLI Wireshark) is that, yes, it's working for Host → VM, but we already knew that. The most interesting part is that the VM's echo request is correctly forwarded from the VMs' network to the network where the Host is. Now we know that the issue is not in the VMs' network.

Just to be sure that the Host is working correctly, let's inspect there:

duponin@host $ sudo tcpdump -i enp38s0 host vm.locahlo.st and host host.locahlo.st and icmp6
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp38s0, link-type EN10MB (Ethernet), capture size 262144 bytes
18:00:34.387744 IP6 vm.locahlo.st > coruscant.locahlo.st: ICMP6, echo request, seq 1, length 64
18:00:34.387787 IP6 coruscant.locahlo.st > vm.locahlo.st: ICMP6, echo reply, seq 1, length 64
TCPDump on the Host

Echo request, echo reply, and that's all. The Host is working correctly and sends the answer to the ping.

Because I'm lucky, or just an admin member of my associative ISP, I have access to the DC's router, cool, isn't it? The router is pretty basic: it runs OpenBSD, does some routing between VLANs and a little firewalling to protect equipment from public networks. Anyway, let's tcpdump there:

-bash-5.0$ doas tcpdump -i vlan100 host host.locahlo.st and host vm.locahlo.st and icmp6
tcpdump: listening on vlan100, link-type EN10MB
# vm → host
18:06:24.459343 host.locahlo.st > vm.locahlo.st: icmp6: echo reply [flowlabel 0x70672]

# host → vm
18:06:33.105555 host.locahlo.st > vm.locahlo.st: icmp6: echo request [flowlabel 0x90248]
18:06:33.105646 host.locahlo.st > vm.locahlo.st: icmp6: echo request [flowlabel 0x90248]
TCPDump on the DC's router

Huuuuuu JUICY! Ahem, excuse me. Do you notice it? (·_· ')

When the Host pings the VM it works, and we can see in the tcpdump that the echo request bounces off the DC's router (running OpenBSD), where it is captured twice. But when the VM pings the Host, on the router we only see an echo reply; no echo request, nada.

Do you want the answer now, or do you want to wait a little more? OK, there were two issues:

  1. Asymmetric routing;
  2. Stateful firewall.

In itself, asymmetric routing is not a problem, unless a stateful firewall is involved. The working case:

  1. On the Host, there was no route to my /48, so the echo request was simply sent to the default gateway, which is the DC's router;
  2. The DC's router knows the route to my /48 and forwards the packet to my VM router;
  3. The VM router knows about my /48 and sends the packet directly to the VM;
  4. The VM receives the echo request and sends back an echo reply;
  5. The VM router knows about the Host's IP and sends the reply straight to it;
  6. The Host is happy to get the echo reply.

And now the non-working case:

  1. The VM sends the echo request to its default gateway to reach the Host;
  2. The VM router knows how to reach the Host and sends it the packet;
  3. The Host receives the echo request and sends back an echo reply but, not knowing about my /48, sends it to its default gateway, the DC's router;
  4. The DC's router receives an echo reply without being aware of any echo request and drops the packet.
-bash-5.0$ doas tcpdump -e -ttt -i pflog0
doas (duponin@router.myisp.tld) password: 
tcpdump: WARNING: snaplen raised from 116 to 160
tcpdump: listening on pflog0, link-type PFLOG
Jan 20 00:34:05.116959 rule 8/(match) block in on vlan100: host.locahlo.st > vm.locahlo.st: icmp6: echo reply [flowlabel 0x70672]
TCPDump on the DC's router

A tcpdump on the DC router's firewall log interface (where blocked traffic is logged) proves that the packet is refused!

Now you may have two questions:

  1. Why is the packet dropped?
  2. Why, in VM → Host, doesn't the Host send its reply directly back to the VM router?

OpenBSD has a stateful firewall. Unlike a stateless packet filter, it cares about states. Its rules care not only about whether a packet matches, but also about the previous packets that went through the flow. That's why an ICMP echo reply without an echo request is dropped: it means nothing (and could be an attack).
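To make the idea of "state" concrete, here is a toy model in Python. It is only a sketch of the concept, not how OpenBSD's firewall is implemented; the `StatefulFilter` class and the addresses (taken from the 2001:db8::/32 documentation range) are made up for illustration.

```python
# Toy model of a stateful ICMPv6 filter. It only illustrates the idea of
# tracking state; it is not how a real firewall is implemented.
HOST = "2001:db8:1::10"   # hypothetical Host address (DC's /48)
VM = "2001:db8:2::10"     # hypothetical VM address (my /48)


class StatefulFilter:
    def __init__(self):
        # One (requester, responder) entry per echo request already seen.
        self.states = set()

    def handle(self, src, dst, kind):
        """Return 'pass' or 'block' for one packet crossing the router."""
        if kind == "echo request":
            self.states.add((src, dst))
            return "pass"
        if kind == "echo reply" and (dst, src) in self.states:
            return "pass"
        return "block"


# Host -> VM: the DC's router sees the echo request and creates a state.
fw = StatefulFilter()
host_to_vm = fw.handle(HOST, VM, "echo request")

# VM -> Host: the request never crossed the DC's router, so the Host's
# reply arrives there without any matching state.
fw = StatefulFilter()
vm_to_host = fw.handle(HOST, VM, "echo reply")

print("Host -> VM, request at DC's router:", host_to_vm)
print("VM -> Host, reply at DC's router:  ", vm_to_host)
```

The second case is the one from the firewall log: an echo reply from the Host to the VM with no request on record.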

If you think that the Host would send its reply directly back to the VM router, it's probably because you are used to NAT (Network Address Translation), where the router rewrites each outgoing packet with its own IP address (replacing the original source IP). If there were NAT then yes, the Host would have just sent the packet back to the VM router and not to the DC's router.
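The difference can be sketched with a few lines of Python. This is a simplified model with made-up addresses from the 2001:db8::/32 documentation range; `source_seen_by_host` and `reply_path` are hypothetical helpers, and the Host is assumed to have only an on-link route for the DC's /48 plus a default route.

```python
import ipaddress

# Hypothetical addresses from the 2001:db8::/32 documentation range.
DC_NET = ipaddress.ip_network("2001:db8:1::/48")     # DC's /48, the Host's network
ROUTER_EXT = ipaddress.ip_address("2001:db8:1::2")   # VM router's address in DC's /48
VM = ipaddress.ip_address("2001:db8:2::10")          # VM address in my /48


def source_seen_by_host(vm_addr, nat):
    """Source address of the echo request once it has left the VM router."""
    # With NAT the router substitutes its own address; without it, the
    # VM's address is left untouched.
    return ROUTER_EXT if nat else vm_addr


def reply_path(reply_dst):
    """Where the Host sends its reply with only on-link and default routes."""
    if reply_dst in DC_NET:
        return "direct (on-link)"
    return "default gateway (DC's router)"


routed = reply_path(source_seen_by_host(VM, nat=False))
natted = reply_path(source_seen_by_host(VM, nat=True))

print("without NAT:", routed)
print("with NAT:   ", natted)
```

Without NAT the reply is addressed to the VM itself, which sits outside the Host's network, so it leaves through the default gateway.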

How did I fix this issue? I added a static route to my block on the Host, sending traffic via the VM router's 'external' IP (the one in the DC's /48).
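Here is a rough Python sketch of what that static route changes in the Host's route lookup. The prefixes are placeholders from the 2001:db8::/32 documentation range (not my real blocks), and `next_hop` is a simplified longest-prefix match written for this example, not the kernel's actual algorithm.

```python
import ipaddress


def next_hop(routes, destination):
    """Pick the matching route with the longest prefix."""
    dst = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), hop)
        for prefix, hop in routes.items()
        if dst in ipaddress.ip_network(prefix)
    ]
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Hypothetical routing table of the Host before the fix.
host_routes = {
    "::/0": "DC's router",          # default gateway
    "2001:db8:1::/48": "on-link",   # DC's /48
}
VM = "2001:db8:2::10"               # made-up VM address in my /48

before = next_hop(host_routes, VM)

# The fix: a static route to my /48 via the VM router's address in DC's /48.
host_routes["2001:db8:2::/48"] = "VM router (2001:db8:1::2)"
after = next_hop(host_routes, VM)

print("before:", before)
print("after: ", after)
```

With the extra route, the Host's replies take the same path back as the VM's requests, and the DC's router is no longer asked to pass a reply it has no state for.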

What is the lesson? Keep routing symmetric when a stateful router+firewall is in the loop.

Happy Networking!

Edit: Linux can have a stateful firewall too, with iptables (--ctstate RELATED,ESTABLISHED) and nftables (ct state established,related).