Recently I had a routing issue. To give a little context, I host my own server in a datacenter run by my local associative ISP. It works really well, and I'm happy with it. Even more rad, the whole infrastructure is based on IPv6 and my ISP gives me a /48 block for free, great! To give a little more context, the server is a hypervisor with, obviously, the host and VMs. The VMs don't access the Internet directly but have to pass through a special VM which acts as a router. The following image pictures it well (I hope it does):
Everything was working nicely: everybody could reach the Internet and things went smoothly. Until that day. You know, the day when you realize that everything was just a masquerade and the world is just a smoke screen. Anyway, from a VM, I tried to reach the Host and… all packets were dropped. A quick inspection to understand what's going on:
- Host is on the DC's /48;
- VM is on my /48 (shared and routed by the ISP);
- VM router has one IP in each /48 (DC & mine).
So, the troubleshooting starts, with some port knocking to mimic the behavior I want:
A TCP connection can't be established between the two machines, so obviously my web browser can't connect either (I was using a SOCKS5 proxy). Let's go deeper!
Really curious: the ping only works in a single direction, from the Host to the VM. There is definitely a problem, and a low-level one. You know what? Let's go even deeper! Since traffic must go through the VM router, we are going to inspect which packets are transiting:
Interesting, isn't it? You don't understand? Well OK, I admit it's not really obvious if you aren't familiar with ICMP. What these two tcpdumps (you can think of tcpdump as the CLI's Wireshark) show is that, yes, it's working for Host → VM, but we already knew that. The most interesting part is that traffic is properly forwarded from the VMs' network to the network where the Host is. Now we know that the issue is not in the VMs' network.
Just to be sure that the Host is working correctly, let's inspect traffic on it:
echo reply and that's all. The Host is working correctly and sends its answer to the ping.
Because I'm lucky, or just because I'm an admin member of my associative ISP, I have access to the DC's router, cool, isn't it? The router is pretty basic: it runs OpenBSD, does a bit of routing between VLANs and a little firewalling to protect equipment from public networks. Anyway, let's
Huuuuuu JUICY! Ahem, excuse me. Do you notice it?
When the Host pings the VM it works, and we can see in the tcpdump that both packets bounce on the DC's router (running OpenBSD). But when the VM pings the Host, on the router we only see an echo reply; no echo request, nada.
Do you want the answer now, or do you want to wait a little more? OK, there were two issues:
- Asymmetric routing;
- Stateful firewall.
In itself, asymmetric routing is not a problem, unless a stateful firewall is involved. The working case:
- On the Host, there was no route to my /48, so packets were simply sent to the default gateway, which is the DC's router;
- DC's router knows the route to my /48 and forwards them to my VM router;
- VM router knows about my /48 and sends them directly to the VM;
- VM receives the echo request and sends back an echo reply;
- VM router knows about the Host IP and sends the reply directly to it;
- The Host is happy to get the echo reply.
And now the non-working case:
- The VM sends the echo request to its default gateway to reach the Host;
- VM router knows how to reach the Host and sends it the packet;
- The Host receives the echo request and sends back an echo reply but, not knowing about my /48, sends it to its default gateway, the DC's router;
- DC's router receives an echo reply without being aware of any echo request and drops the packet.
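The two cases can be replayed with a tiny Python sketch. It is only a toy model under stated assumptions: the node names (`vm`, `vm-router`, `host`, `dc-router`) are hypothetical labels for the machines above, each routing table is reduced to "destination → next hop", and the DC's router is the only node that filters.

```python
# Toy model: each node maps a destination to its next hop.
# Node names are hypothetical labels for the machines described above.
ROUTES = {
    "vm":        {"host": "vm-router"},                # default gateway
    "vm-router": {"host": "host", "vm": "vm"},         # one leg in each /48
    "host":      {"vm": "dc-router"},                  # no route to my /48: default gateway
    "dc-router": {"vm": "vm-router", "host": "host"},  # knows the route to my /48
}


def send(routes, states, src, dst, kind):
    """Forward one packet hop by hop. Returns (delivered, path)."""
    node, path = src, [src]
    while node != dst:
        node = routes[node][dst]
        path.append(node)
        if node == "dc-router":  # the only stateful firewall in this model
            if kind == "echo-request":
                states.add((src, dst))      # remember the flow
            elif (dst, src) not in states:
                return False, path          # reply without a known request: dropped
    return True, path


def ping(routes, src, dst):
    """Send an echo request, then the echo reply.

    Returns (ok, request_path, reply_path).
    """
    states = set()
    ok, request_path = send(routes, states, src, dst, "echo-request")
    if not ok:
        return False, request_path, []
    ok, reply_path = send(routes, states, dst, src, "echo-reply")
    return ok, request_path, reply_path


print(ping(ROUTES, "host", "vm"))
print(ping(ROUTES, "vm", "host"))
```

In the first call the request crosses `dc-router` and the reply comes back through `vm-router` alone, so the ping succeeds. In the second call the request never touches `dc-router`, and the reply path stops there: `['host', 'dc-router']`.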
A tcpdump on the DC router's firewall interface (where dropped/blocked traffic is logged) proves that the packet is refused!
Now you may have two questions:
- Why is the packet dropped?
- Why, in the VM → Host case, doesn't the Host send its reply directly back to the VM router?
OpenBSD has a stateful firewall. Unlike a Linux iptables setup without connection-tracking rules, it keeps state by default. Its rules care not only about whether a packet matches, but also about the previous packets that went through the flow. That's why an echo reply without a matching echo request is dropped: on its own it means nothing (and could be an attack).
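As a rough illustration of what "caring about state" means, here is a minimal Python sketch. It is not how OpenBSD's firewall is implemented: the `StatefulFilter` class, its `check` method and the `(src, dst, ident)` state key are hypothetical simplifications, and every echo request is assumed to be allowed by the ruleset.

```python
class StatefulFilter:
    """Toy stateful filter for ICMP echo traffic (simplified assumption)."""

    def __init__(self):
        # One entry per flow the filter has seen being opened.
        self.states = set()

    def check(self, src, dst, kind, ident):
        """Return "pass" or "block" for a single packet."""
        if kind == "echo-request":
            # The request opens the flow: remember who asked whom.
            self.states.add((src, dst, ident))
            return "pass"
        if kind == "echo-reply" and (dst, src, ident) in self.states:
            # The reply matches a request seen earlier, in the other direction.
            return "pass"
        # Anything else has no state behind it.
        return "block"


# The filter sees the request first, then the reply: both pass.
fw = StatefulFilter()
seen_both = [
    fw.check("host", "vm", "echo-request", 1),
    fw.check("vm", "host", "echo-reply", 1),
]

# A fresh filter only ever sees the reply: it is blocked.
reply_only = StatefulFilter().check("host", "vm", "echo-reply", 2)

print(seen_both, reply_only)
```

The second situation is the one the DC's router was in: the request took another path, so from its point of view the reply came out of nowhere.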
If you think the Host would send its reply directly back to the VM router, it's probably because you are used to NAT (Network Address Translation), where the router rewrites an outgoing packet with its own IP address (replacing the original source IP). If there were NAT then yes, the Host would have sent the packet back to the VM router and not to the DC's router.
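The difference between routing and NAT here can be sketched in a few lines of Python. All addresses are hypothetical and taken from the 2001:db8::/32 documentation range, and `masquerade`, `reply_to` and `on_link` are illustrative helpers, not any real tool's API.

```python
import ipaddress

# Hypothetical addressing plan (documentation prefix, not the real one).
DC_NET = ipaddress.ip_network("2001:db8:a::/48")  # the DC's /48, where the Host lives
HOST = "2001:db8:a::10"
VM_ROUTER_EXT = "2001:db8:a::2"                   # VM router's leg in the DC's /48
VM = "2001:db8:b::10"                             # an address in my /48


def masquerade(packet, router_ip):
    """What NAT would do: replace the source with the router's own address."""
    return {**packet, "src": router_ip}


def reply_to(packet):
    """Build the echo reply the Host would send for a received request."""
    return {"src": packet["dst"], "dst": packet["src"], "kind": "echo-reply"}


def on_link(addr):
    """True if the Host can reach addr directly, without its default gateway."""
    return ipaddress.ip_address(addr) in DC_NET


request = {"src": VM, "dst": HOST, "kind": "echo-request"}

routed_reply = reply_to(request)                             # plain routing, no NAT
natted_reply = reply_to(masquerade(request, VM_ROUTER_EXT))  # if NAT were in place

print(routed_reply["dst"], on_link(routed_reply["dst"]))
print(natted_reply["dst"], on_link(natted_reply["dst"]))
```

Without NAT the reply is addressed to the VM itself, which is not on the Host's network, so it leaves through the default gateway. With NAT it would be addressed to the VM router's own address, which the Host can reach directly.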
How did I fix this issue? I added a static route to my block on the Host, to send that traffic via the VM router's 'external' IP (the one available in the DC's /48).
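Why one extra route is enough comes down to longest-prefix matching: a /48 route is more specific than the default route, so it wins. Here is a small Python sketch of that lookup. The addresses are hypothetical (documentation range), and `lookup` is an illustrative helper, not the kernel's actual algorithm.

```python
import ipaddress

# Hypothetical addresses from the 2001:db8::/32 documentation range.
DC_ROUTER = "2001:db8:a::1"      # the Host's default gateway
VM_ROUTER_EXT = "2001:db8:a::2"  # VM router's 'external' IP in the DC's /48
MY_BLOCK = ipaddress.ip_network("2001:db8:b::/48")
VM = "2001:db8:b::10"


def lookup(table, dst):
    """Return the next hop of the most specific route matching dst."""
    dst = ipaddress.ip_address(dst)
    matches = [(net, hop) for net, hop in table if dst in net]
    net, hop = max(matches, key=lambda match: match[0].prefixlen)
    return hop


# Before the fix: only a default route on the Host.
host_table = [(ipaddress.ip_network("::/0"), DC_ROUTER)]
before = lookup(host_table, VM)

# The fix: a static route to my block via the VM router.
host_table.append((MY_BLOCK, VM_ROUTER_EXT))
after = lookup(host_table, VM)

print(before)  # 2001:db8:a::1 (DC's router)
print(after)   # 2001:db8:a::2 (VM router)
```

With the static route in place, replies from the Host follow the same path as the requests they answer, and the DC's router is no longer asked to pass half a conversation.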
What is the lesson? Keep routing symmetric when a stateful router+firewall is involved in the loop.
Edit: Linux can have a stateful firewall too, with iptables and nftables,