Weave doesn't work (consistently) on RHEL8
To borrow a phrase I sometimes over-use with my team: sometimes, you have to go round a problem rather than through it.
We use Kubernetes, quite a lot. And one of its disappointments to me is the way it has yet to stabilise. To take a concrete example … do we really need multiple different ways to do container networking?
Maybe some people do, but it makes me sad that we don’t have a “right answer” which is shipped by default for those of us who just want things to work out of the box.
For many years, we have mostly used Weave for our container networking needs. And I’ll admit right up front that we may not be doing things the latest and greatest way. However, it seemed to work and that’s all we needed.
Then RHEL8 really spoilt the party, and gave me many more grey hairs while I worked this all out.
In summary:
- Some customer RHEL8 boxes never had working pod networking while we used Weave.
- We had a reference environment of our own, as near-identical as we could make it, which did work.
We still don’t know what the difference was, although we suspect it’s something related to this. Certainly, the way RHEL8 doesn’t include iptables-legacy feels like it might be related - Weave adds legacy rules, other things on the box add proper nftables rules, the two meet and sadness ensues. It’s possible our reference environment “got away with it” because it had nothing else adding any kind of rules, where the customer box did.
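If you want to see which flavour of iptables a given box is using, versions 1.8 and later report the backend in their version string, for example "iptables v1.8.4 (nf_tables)". Here is a minimal sketch of that check. The function name iptables_backend is made up for illustration, and it only tells you about the host's own iptables binary, not whatever Weave carries inside its container, so treat it as a first clue rather than a diagnosis.

```shell
#!/bin/sh
# Classify an `iptables --version` string by backend.
# iptables 1.8+ appends "(nf_tables)" or "(legacy)"; older versions append nothing.
iptables_backend() {
  case "$1" in
    *nf_tables*) echo "nft" ;;
    *legacy*)    echo "legacy" ;;
    *)           echo "unknown" ;;
  esac
}

# Only query the host if iptables is actually installed.
if command -v iptables >/dev/null 2>&1; then
  iptables_backend "$(iptables --version)"
else
  echo "iptables not found"
fi
```

Going by the description above, a RHEL8 host should report "nft". If something else on the same box is writing legacy rules, that is at least consistent with the clash we suspect.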
How did we solve it? Well, we switched over to Calico, installed like this:
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.25.0/manifests/calico.yaml
The fact that I really struggled to find the above on Google again shows how many layers of complexity there are around this stuff.
Anyway, Calico, unlike Weave, appears to be reliably working on RHEL8 and also doesn’t need configuring with the pod network CIDR (I assume it interrogates the cluster for it). So it will be our choice for as long as it stays out of our way.