"Cloud-like" Infrastructure at Home - Part 1: LoadBalancers on the Metal
By Calum MacRae
August 10, 2020
This is the introductory post to a series outlining how I achieve "Cloud-like" application deployments on my personal servers. If you're interested in Platform Engineering/SRE/DevOps, read on!
What does "Cloud-like" mean?
I'm using this phrase to express a set of capabilities that engineers have come to expect from cloud providers in their offerings for hosting infrastructure. Namely: load balancers, dynamic DNS, certificate leasing, and perhaps a few others.
Who's this post for?
I'm guessing by now you have at least some interest in the subject, so I'll state some assumptions:
- You're somewhat familiar with Kubernetes & its configuration with YAML
- You're somewhat familiar with basic networking concepts, like DHCP, DNS, TCP/IP, ARP
- You're not looking for a guide to set-up self-hosted Kubernetes (I'll write about how I do this in a future post) - we're going to be working with an already running cluster
All set then? Let's dive in!
What're we solving today?
LAN routable traffic to services hosted in Kubernetes
Say we have a web service we want to deploy - let's call it "coffee"
We deploy it into k8s (Kubernetes) and can access it by using
$ kubectl port-forward coffee-5b8f7c69bd-9x6sk 8080:80
Now we can visit localhost:8080/ to reach our service - great!
But we don't want to rely on kubectl to handle the proxying for us. And what about other devices on the network that need to reach the coffee service?
Well, by the end of this post we'll have a LAN-accessible web service, reachable via a segment of our LAN address space - like any other physical device on the network - just by deploying k8s manifests.
No need to pick out an IP address and set up static routes, no need for any iptables magic.
Accessing Kubernetes services via IP
When deploying a pod of containers providing a network application, we can expose its port so it's accessible using a Service. Here's what a Service accompanying a Deployment for coffee might look like:
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: coffee
spec:
  replicas: 1
  selector:
    matchLabels:
      app: coffee
  template:
    metadata:
      labels:
        app: coffee
    spec:
      containers:
        - name: coffee
          image: example/coffee
          ports:
            - name: api
              containerPort: 80
---
kind: Service
apiVersion: v1
metadata:
  name: coffee
spec:
  selector:
    app: coffee
  ports:
    - name: api
      port: 8080
      targetPort: api
After evaluating this manifest, our k8s cluster would have a coffee service that's available within the cluster network. Other pods within the same namespace could simply call out to http://coffee:8080, or pods deployed in other namespaces could reach it via http://coffee.example:8080 (example being the namespace we deployed the coffee resources to).
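A quick way to sanity check this from inside the cluster is a throwaway pod - a minimal sketch on my part, assuming any image with curl in it (curlimages/curl here):

# Spin up a temporary pod and curl the coffee Service by its cluster DNS name
$ kubectl run curl-test --rm -it --restart=Never \
    --image=curlimages/curl --command -- \
    curl -s http://coffee.example:8080/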
By default, we'll get a ClusterIP type Service. I won't go into much detail here about k8s Service objects, but it's useful to outline the available ServiceTypes - in particular, the "publishing" types.
Publishing ServiceTypes
Let's take a look at each type, leaning on excerpts from the official documentation.
ClusterIP: Exposes the Service on a cluster-internal IP. Choosing this value makes the Service only reachable from within the cluster. This is the default ServiceType.
We can quite plainly see this is a no-go: we want our service to be reachable from outside the cluster.
NodePort: Exposes the Service on each Node's IP at a static port (the NodePort). A ClusterIP Service, to which the NodePort Service routes, is automatically created. You'll be able to contact the NodePort Service, from outside the cluster, by requesting <NodeIP>:<NodePort>.
This looks like it could meet our needs for access from outside the cluster (via the nodes' LAN addresses). But on second thought, it raises the question: how do we know which node, and thus which NodeIP, we'll be using to reach the service? We don't want to have to maintain node taints/tolerations to schedule pods on specific nodes just to achieve this.
No dice.
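For illustration - a sketch of mine, not part of the original setup - a NodePort version of the coffee Service would look like this; note that consumers still need to know a specific node's IP and the allocated high port:

---
kind: Service
apiVersion: v1
metadata:
  name: coffee
spec:
  type: NodePort
  selector:
    app: coffee
  ports:
    - name: api
      port: 8080
      targetPort: api
      # k8s picks a nodePort in the 30000-32767 range unless one is set;
      # clients then have to know <some-node-ip>:<that-port> to reach it.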
ExternalName: Maps the Service to the contents of the externalName field (e.g. foo.bar.example.com), by returning a CNAME record with its value. No proxying of any kind is set up.
Hmm, this doesn't seem like a good fit either. There's no traffic routing involved at all - it simply maps an in-cluster Service name to an external DNS name via a CNAME record, for the benefit of other cluster residents.
Next!
LoadBalancer: Exposes the Service externally using a cloud provider's load balancer. NodePort and ClusterIP Services, to which the external load balancer routes, are automatically created.
This sounds perfect! Except… we're not deploying to some cloud environment. We don't have a cloud provider's solution to provision load balancers.
…Or do we?
LoadBalancers on the metal
It turns out a Service of type LoadBalancer is actually achievable without the cloud provider!
MetalLB is a project that aims to bring the LoadBalancer ServiceType to k8s clusters provisioned on bare metal - and does so very well.
Deployment & configuration
MetalLB has two modes of operation: BGP & Layer2. Both are self-explanatory if you're familiar with the respective network protocols - I won't be expanding on either, as it's out of scope for this write-up.
For our implementation, we're going to keep things simple and go with Layer2.
MetalLB is deployed entirely with native k8s manifests, and how you choose to apply the YAML is up to you - there are a few options. I personally use ArgoCD to deploy the Helm chart, but details on that are for another post.
The official installation documentation is straightforward and walks through the options.
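For reference, here's a minimal install sketch using the raw manifests - the version pin and the memberlist secret step are assumptions based on the MetalLB 0.9.x documentation of the time, so check the current docs before copying:

# Create the metallb-system namespace, then the MetalLB components themselves
$ kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/namespace.yaml
$ kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.9.3/manifests/metallb.yaml

# First install only: the secret the speakers use to encrypt their gossip traffic
$ kubectl create secret generic -n metallb-system memberlist \
    --from-literal=secretkey="$(openssl rand -base64 128)"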
Let's focus on the configuration. MetalLB evaluates its configuration through a ConfigMap.
For the Layer2 configuration, it's as simple as picking out an IP range you want your services to be allocated addresses from.
My LAN is 10.0.0.0/16, and I opted to slice out 10.0.42.0/24. So my configuration looks like:
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 10.0.42.0/24
If we configure MetalLB with this, we can then deploy Service objects with spec.type: LoadBalancer in their manifest and expect to get an IP leased from the 10.0.42.0/24 pool.
Take it for a spin
Let's update our coffee service manifest to set spec.type to LoadBalancer
---
kind: Service
apiVersion: v1
metadata:
  name: coffee
spec:
  selector:
    app: coffee
  ports:
    - name: api
      port: 8080
      targetPort: api   # matches the named container port in the Deployment
  type: LoadBalancer    # <- here
Applying this will yield something like
$ kubectl describe svc coffee
Name:                     coffee
Namespace:                default
Labels:                   <none>
Annotations:
Selector:                 app=coffee
Type:                     LoadBalancer
IP:                       172.16.57.10
LoadBalancer Ingress:     10.0.42.0
Port:                     api  8080/TCP
TargetPort:               api/TCP
NodePort:                 api  30786/TCP
Endpoints:                192.168.2.7:8080
Session Affinity:         None
External Traffic Policy:  Cluster
Events:
  Type    Reason       Age  From                Message
  ----    ------       ---- ----                -------
  Normal  IPAllocated  9s   metallb-controller  Assigned IP "10.0.42.0"
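The leased address also shows up at a glance under EXTERNAL-IP when listing the Service - the output below is illustrative rather than copied from my cluster:

$ kubectl get svc coffee
NAME     TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
coffee   LoadBalancer   172.16.57.10   10.0.42.0     8080:30786/TCP   9s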
For those of us who aren't networking wizards, something may look a bit strange here: 10.0.42.0.
Rest assured, that's a routable address - since the LAN is a /16, 10.0.42.0 is just another host address within it. Most day-to-day DHCP configurations start their allocation range at X.X.X.1, but there's no need for that here.
So, if our service is alive, we should be able to establish a TCP session with it. Let's try with good old netcat
$ nc -vz 10.0.42.0 8080
Connection to 10.0.42.0 port 8080 [tcp/http] succeeded!
Woo! If we look a little closer with a simple ping, we'll see something interesting:
$ ping 10.0.42.0
PING 10.0.42.0 (10.0.42.0): 56 data bytes
Request timeout for icmp_seq 0
92 bytes from 10.0.10.2: Redirect Host(New addr: 10.0.42.0)
Vr HL TOS Len ID Flg off TTL Pro cks Src Dst
4 5 00 0054 903f 0 0000 3f 01 ac67 10.0.1.3 10.0.42.0
The interesting part isn't that it didn't respond to the ICMP request - it's that we got
92 bytes from 10.0.10.2: Redirect Host(New addr: 10.0.42.0)
That address, 10.0.10.2, is the IP for compute2, an active compute node in my cluster.
Let's take a look at where coffee's pod was scheduled (with kubectl get pods -o wide)
NAME READY STATUS RESTARTS AGE IP NODE
coffee-65b9b69679-96kl8 1/1 Running 0 30m 192.168.2.7 compute2.cmacr.ae
There it is, on compute2. Since this is layer 2 mode, let's see what those two addresses look like in the ARP table
$ arp -a
? (10.0.10.2) at b8:ae:ed:7d:19:6 on en0 ifscope [ethernet]
? (10.0.42.0) at b8:ae:ed:7d:19:6 on en0 ifscope [ethernet]
And there you have it: the same MAC address. In layer 2 mode, MetalLB elects one node to answer ARP requests for the service IP with that node's own MAC address, so traffic for 10.0.42.0 lands on compute2 and is then forwarded on to the coffee pod. Hopefully, by stepping through the flow of traffic, it's become a little clearer how layer 2 mode in MetalLB works.
Our cluster is now set up to receive external traffic from the rest of our LAN. Perfect! Though the IPs that MetalLB is leasing are dynamic - we don't want to have to keep track of which service has which IP by asking k8s…
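(As an aside - a sketch of mine, not the approach taken here - MetalLB will honour a specific address requested from its pool via spec.loadBalancerIP, if a service really does need a fixed IP:)

---
kind: Service
apiVersion: v1
metadata:
  name: coffee
spec:
  type: LoadBalancer
  # Ask MetalLB for a specific address from its pool instead of a random lease
  loadBalancerIP: 10.0.42.10
  selector:
    app: coffee
  ports:
    - name: api
      port: 8080
      targetPort: api

But hard-coding addresses in manifests scales poorly, which is exactly what the next post tackles with DNS instead.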
Watch out for 'Part 2: Hosting your own dynamic DNS solution'
Next time, I'll detail how I simplify reaching these services with human-friendly DNS records.
Thanks for reading!