Disclaimer
This post is aimed at those just starting out with kubernetes; it's not a production-worthy solution, though you can build upon it.
Installing kubernetes
I have used and tailored this post from IT Wonder Lab to create my home lab setup, and it may fit the bill for you too. I like it as it uses Vagrant as the infrastructure build tool along with Ansible to do the k8s installation and configuration.
Getting started
One of the first things you want to do with k8s when you try out a multi-node cluster, rather than the easy all-in-one single-node build, is to install a load of stuff you have run before on a single node and scale it. Scaling is what it's all about, right? But you hit a brick wall when you realise you don't have any shared storage available. Shared storage makes life much easier when you are clustering.
There is local storage you can use to at least get some sort of storage available: HostPath Volumes and Local Persistent Volumes. But these tie your pods (containers) to a specific node, which you probably didn't notice when you used a single-node k8s build.
And of course, with a multi-node cluster you find you can't easily share files between your pods (containers) across those nodes without some messing around, copying files between the nodes outside of k8s.
Readily usable, simple and cheap shared storage
NFS, of course, is a really simple solution to this, and it is simple to install and run. Not everyone has NetApp ONTAP available to them!
There are a lot of tutorials on setting up an NFS share, so I won't go into that here.
For example, Tecmint do a good tutorial here.
What you will need to be aware of, and do, is make sure that your NFS exported share allows access to your k8s nodes.
This is because the NFS provisioner we will use allocates the volume share from the node itself, not the actual pod, so it's a little like local storage. The pod talks to the local k8s node it is running on, and that node talks to the remote NFS server.
Here we can see that the NFS server (homeserver) allows access to two subnets, one of which is the k8s node subnet.
$ showmount -e homeserver
Export list for homeserver:
/data/nfs_share 192.168.50.0/24,172.30.5.0/24
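For reference, an export along these lines on the NFS server would produce that list. This is just a sketch: substitute your own path and subnets, and review the options (rw, sync, no_subtree_check are common defaults, not taken from the original setup).

```
# /etc/exports (sketch): allow both the LAN and the k8s node subnet
/data/nfs_share 192.168.50.0/24(rw,sync,no_subtree_check) 172.30.5.0/24(rw,sync,no_subtree_check)
```

After editing /etc/exports, re-export the shares with `sudo exportfs -ra` and confirm with `showmount -e`.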
Installing a k8s package manager to do the heavy lifting – helm
Next we need to install helm – helm is brilliant, it allows reuse of code via a powerful package manager. Basically, someone else has done all the hard work of writing an installer for the NFS provisioner we will be using.
Install helm onto your control plane instance (manager) from the helm website
$ curl https://baltocdn.com/helm/signing.asc | sudo apt-key add -
$ sudo apt-get install apt-transport-https --yes
$ echo "deb https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
$ sudo apt-get update
$ sudo apt-get install helm
Installing the NFS Provisioner
Once helm is installed we need to install the NFS provisioner.
Here is the code repository:
https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner
But don’t panic, helm takes care of this for us.
This is all it takes to tell your control plane manager to make the provisioner's chart repository available.
$ helm repo add nfs-subdir-external-provisioner https://kubernetes-sigs.github.io/nfs-subdir-external-provisioner/
Once the repository is added, we need to actually install and configure the provisioner. There are a lot of options that can be set, but the ones below will give you a simple enough configuration to be able to play.
$ helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
--set nfs.server=172.30.5.67 \
--set nfs.path=/data/nfs_share \
--set persistence.storageClass=nfs-client \
--set persistence.size=10Gi
We have told the provisioner where our shared storage is (IP and share name), given it a k8s Storage Class name, and said how big we want it to be. Really simple.
Now, let’s check to see if we can see that Storage Class
$ kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
nfs-client cluster.local/nfs-subdir-external-provisioner Delete Immediate true 34s
A tip here is to make that Storage Class the default for your cluster.
$ kubectl patch storageclass nfs-client -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
storageclass.storage.k8s.io/nfs-client patched
Let's check again and make sure it is the default.
$ kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
nfs-client (default) cluster.local/nfs-subdir-external-provisioner Delete Immediate true 3m3s
Now you can see that the flag (default) is set.
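Having a default is handy because PVCs can then omit storageClassName entirely and still be provisioned from NFS. A hypothetical claim like this (the name is mine, for illustration) would be bound by nfs-client automatically:

```yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: default-sc-claim   # hypothetical name, not from this walkthrough
spec:
  # no storageClassName: the cluster default (nfs-client) is used
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
```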
Uh-oh, I broke something
Uh oh, something is not right: the pod that manages the provisioner is not starting. You can see below it has been stuck in ContainerCreating for over six minutes now. That is not right.
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nfs-subdir-external-provisioner-76b4bc6f7d-sxvdr 0/1 ContainerCreating 0 6m21s
Looks like we have an issue somewhere; let's check the logs (events).
$ kubectl get events
52s Warning ProvisioningFailed persistentvolumeclaim/test-claim-client failed to provision volume with StorageClass "nfs-client": unable to create directory to provision new pv: mkdir /persistentvolumes/default-test-claim-client-pvc-0335846f-3e6f-4cc8-aef3-2699365165be: permission denied
Okay, that looks bad. We need to dig further through the events.
Warning FailedMount 52s kubelet MountVolume.MountDevice failed for volume "pvc-0335846f-3e6f-4cc8-aef3-2699365165be" : NFSDisk - mountDevice:FormatAndMount failed with mount failed: exit status 32
After a bit of Googling for that exit status 32, it seems the nodes do not have the underlying NFS client packages installed, so let's fix that now on each node. I did the control plane manager too.
$ sudo apt-get install -y nfs-common
This is what happens when you take someone else's Infrastructure as Code and do not check it properly. The VirtualBox image I used (ubuntu/focal64) did not have the NFS client installed. Of course, we can update that Ansible code from IT Wonder Lab to do this for us next time around.
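As a sketch, that fix could be folded into the Ansible playbook as a task like the one below. The task name and where it slots into the play are my own assumptions, not taken from the IT Wonder Lab code.

```yaml
# Hypothetical Ansible task: install the NFS client on every node
- name: Install NFS client packages
  apt:
    name: nfs-common
    state: present
    update_cache: yes
  become: yes
```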
OK, let's uninstall that helm chart and install it again.
$ helm uninstall nfs-subdir-external-provisioner
release "nfs-subdir-external-provisioner" uninstalled
$ helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner --set nfs.server=172.30.5.67 --set nfs.path=/data/nfs_share --set persistence.storageClass=nfs-client --set persistence.size=10Gi
NAME: nfs-subdir-external-provisioner
LAST DEPLOYED: Sat Mar 13 12:23:53 2021
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
Let's check that pod again.
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nfs-subdir-external-provisioner-76b4bc6f7d-5bjgg 1/1 Running 0 17s
Woo hoo! Simples!
Now create a PVC and pod to use this shared storage
The next step is to allocate this storage. This is done via a PVC, or Persistent Volume Claim.
A PersistentVolumeClaim (PVC) is a request for storage by a user. … Claims can request specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or ReadWriteMany, see AccessModes).
We do this by creating some YAML to define the state we want.
Create a file like the one below. You can see that we allocated 10Gi on the NFS server, but we only want 1Mi of that for our test claim.
$ cat test-pvc.yml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
name: test-claim-client
annotations:
nfs.io/storage-path: "test-path" # not required, depending on whether this annotation was shown in the storage class description
spec:
storageClassName: nfs-client
accessModes:
- ReadWriteMany
resources:
requests:
storage: 1Mi
In the above we have done a few interesting things. In the spec we set the Storage Class we created above; the access mode, in this case read/write, with other pods allowed to access the same storage (Many); and its size, 1Mi.
Now we apply it to the cluster.
$ kubectl apply -f test-pvc.yml
persistentvolumeclaim/test-claim-client created
Let’s check
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
test-claim-client Bound pvc-1213d39e-b623-44db-a1f1-f3835197a212 1Mi RWX nfs-client 14s
If we look on the actual NFS server we can see it, and the UUIDs match.
/data/nfs_share/ $ ls -la
total 12
drwxrwxrwx 3 root root 4096 Mar 13 13:12 .
drwxr-xr-x 7 root root 4096 Feb 26 16:08 ..
drwxrwxrwx 2 nobody nogroup 4096 Mar 13 13:11 default-test-claim-client-pvc-1213d39e-b623-44db-a1f1-f3835197a212
Let’s check inside, as we expect there is nothing there.
$ ls -l default-test-claim-client-pvc-1213d39e-b623-44db-a1f1-f3835197a212/
total 0
Access the storage from a pod
Let's create a pod to use that claim and put something into the storage.
Again, we create some YAML to describe a pod and its attributes.
$ cat test-pod.yml
kind: Pod
apiVersion: v1
metadata:
name: test-pod
spec:
containers:
- name: test-pod
image: gcr.io/google_containers/busybox:1.24
command:
- "/bin/sh"
args:
- "-c"
- "touch /mnt/SUCCESS && exit 0 || exit 1"
volumeMounts:
- name: nfs-pvc
mountPath: "/mnt"
restartPolicy: "Never"
volumes:
- name: nfs-pvc
persistentVolumeClaim:
claimName: test-claim-client
I won't go into too much detail on how this pod is defined; that's a whole blog post on its own. We use a small image (busybox) and get it to write a file to the pod's filesystem, which has our new PVC mounted within it at /mnt.
The pod will do this and exit (stop); thus the file is actually written to NFS, and we should see it there.
$ kubectl apply -f test-pod.yml
pod/test-pod created
Let’s check it
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nfs-subdir-external-provisioner-76b4bc6f7d-5bjgg 1/1 Running 0 72m
test-pod 0/1 Completed 0 36s
Yes, the pod has run and completed. Let’s check the NFS share folder on the NFS server, and we can see the file was created.
$ ls -l default-test-claim-client-pvc-1213d39e-b623-44db-a1f1-f3835197a212/
total 0
-rw-r--r-- 1 nobody nogroup 0 Mar 13 13:35 SUCCESS
Let’s clear up that pod.
$ kubectl delete -f test-pod.yml
pod "test-pod" deleted
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nfs-subdir-external-provisioner-76b4bc6f7d-5bjgg 1/1 Running 0 76m
Now check the NFS share folder on the NFS server again: the file is still there, even though the pod is gone. So depending on how you define your PVC, you may need to do some housekeeping.
$ ls -l default-test-claim-client-pvc-1213d39e-b623-44db-a1f1-f3835197a212/
total 0
-rw-r--r-- 1 nobody nogroup 0 Mar 13 13:35 SUCCESS
Let’s remove the PVC and see what happens then
$ kubectl delete -f test-pvc.yml
persistentvolumeclaim "test-claim-client" deleted
$ kubectl get pvc
No resources found in default namespace.
What has happened to the folder on the NFS server?
$ ls -l default-test-claim-client-pvc-1213d39e-b623-44db-a1f1-f3835197a212/
ls: cannot access 'default-test-claim-client-pvc-1213d39e-b623-44db-a1f1-f3835197a212/': No such file or directory
Oh no! What has happened?
$ ls -l
total 4
drwxrwxrwx 2 nobody nogroup 4096 Mar 13 13:35 archived-pvc-1213d39e-b623-44db-a1f1-f3835197a212
As you can see it has been archived. Take a peek inside…
$ ls -l archived-pvc-1213d39e-b623-44db-a1f1-f3835197a212/
total 0
-rw-r--r-- 1 nobody nogroup 0 Mar 13 13:35 SUCCESS
Ah, phew! Nothing lost.
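That archive-on-delete behaviour is configurable in the chart; by default deleted claims are archived like this rather than removed. If you would rather have them removed outright, a values file along these lines could be passed at install time with `-f values.yaml`. This is a sketch: verify the exact key against the chart's documented values before relying on it.

```yaml
# values.yaml (sketch) for nfs-subdir-external-provisioner
storageClass:
  archiveOnDelete: "false"   # remove PV directories on delete instead of archiving them
```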
What about scaling?
Let's use a Deployment definition and scale that up.
We use the definition below as it uses some environment-variable magic to write to a file named after the pod's IP; as the pods scale up in number, each one writes its own file rather than overwriting the other pods' files.
$ cat scaled-test-pod.yml
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: test-nfs
name: test-nfs
spec:
replicas: 1
selector:
matchLabels:
app: test-nfs
strategy: {}
template:
metadata:
creationTimestamp: null
labels:
app: test-nfs
spec:
containers:
- name: test-pod
image: gcr.io/google_containers/busybox:1.24
command:
- "/bin/sh"
args:
- "-c"
- "touch /mnt/SUCCESS-$MY_POD_IP && sleep 3600 || exit 1"
# from https://stackoverflow.com/a/58800597/7396553
env:
- name: MY_NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: MY_POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: MY_POD_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: MY_POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: MY_POD_SERVICE_ACCOUNT
valueFrom:
fieldRef:
fieldPath: spec.serviceAccountName
volumeMounts:
- name: nfs-pvc
mountPath: "/mnt"
volumes:
- name: nfs-pvc
persistentVolumeClaim:
claimName: test-claim-client
Let's apply it with just one replica first and see the output.
$ kubectl apply -f scaled-test-pod.yml
deployment.apps/test-nfs created
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nfs-subdir-external-provisioner-76b4bc6f7d-5bjgg 1/1 Running 0 108m
test-nfs-7d6d956c4-h9t9z 1/1 Running 0 19s
And check the underlying NFS data store
/data/nfs_share/default-test-claim-client-pvc-451af214-10b8-48c2-ac6a-704bbf9f6339 $ ls -la
total 8
drwxrwxrwx 2 nobody nogroup 4096 Mar 13 14:11 .
drwxrwxrwx 4 root root 4096 Mar 13 13:46 ..
-rw-r--r-- 1 nobody nogroup 0 Mar 13 13:52 SUCCESS
-rw-r--r-- 1 nobody nogroup 0 Mar 13 14:11 SUCCESS-192.168.122.139
So we have a file with the pod IP appended.
Let’s scale that up
$ kubectl get deployments
NAME READY UP-TO-DATE AVAILABLE AGE
nfs-subdir-external-provisioner 1/1 1 1 110m
test-nfs 1/1 1 1 2m16s
$ kubectl scale --replicas=4 deployment test-nfs
deployment.apps/test-nfs scaled
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nfs-subdir-external-provisioner-76b4bc6f7d-5bjgg 1/1 Running 0 110m
test-nfs-7d6d956c4-h9t9z 1/1 Running 0 3m5s
test-nfs-7d6d956c4-h9tzf 1/1 Running 0 20s
test-nfs-7d6d956c4-qvvff 1/1 Running 0 20s
test-nfs-7d6d956c4-zpg87 1/1 Running 0 20s
So we have 4 pods running, let’s check the NFS storage…
total 8
drwxrwxrwx 2 nobody nogroup 4096 Mar 13 14:14 .
drwxrwxrwx 4 root root 4096 Mar 13 13:46 ..
-rw-r--r-- 1 nobody nogroup 0 Mar 13 13:52 SUCCESS
-rw-r--r-- 1 nobody nogroup 0 Mar 13 14:11 SUCCESS-192.168.122.139
-rw-r--r-- 1 nobody nogroup 0 Mar 13 14:14 SUCCESS-192.168.122.140
-rw-r--r-- 1 nobody nogroup 0 Mar 13 14:14 SUCCESS-192.168.122.141
-rw-r--r-- 1 nobody nogroup 0 Mar 13 14:14 SUCCESS-192.168.122.4
Excellent, all 4 pods are writing to the shared NFS PVC we created.
Let’s clean up…
$ kubectl delete -f scaled-test-pod.yml --force
deployment.apps "test-nfs" force deleted
$ kubectl delete -f test-pvc.yml
persistentvolumeclaim "test-claim-client" deleted
Additional NFS checks on the worker nodes
nfsiostat is useful to see the details of the NFS shares in use.
$ nfsiostat
172.30.5.67:/data/nfs_share mounted on /var/lib/kubelet/pods/c9096597-9a3d-43a0-a280-241061ea89b9/volumes/kubernetes.io~nfs/nfs-subdir-external-provisioner-root:
ops/s rpc bklog
0.033 0.000
read: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms)
0.000 0.000 0.000 0 (0.0%) 0.000 0.000
write: ops/s kB/s kB/op retrans avg RTT (ms) avg exe (ms)
0.000 0.000 0.000 0 (0.0%) 0.000 0.000
NFS checks on the NFS server
On the NFS server, checking the nfsd proc space (/proc/fs/nfsd) on later kernels is very useful.
$ sudo ls -la /proc/fs/nfsd/clients/
[sudo] password for nick:
total 0
drw------- 3 root root 0 Mar 1 21:14 .
drwxr-xr-x 3 root root 0 Feb 26 16:05 ..
drw------- 2 root root 0 Mar 1 21:14 5
In the above we can see the client IDs, and we can query these for information about those clients, as seen below.
$ sudo cat /proc/fs/nfsd/clients/5/info
clientid: 0x942ec40460391c55
address: "172.30.5.32:918"
name: "Linux NFSv4.2 k8s-n-1"
minor version: 2
Implementation domain: "kernel.org"
Implementation name: "Linux 5.4.0-66-generic #74-Ubuntu SMP Wed Jan 27 22:54:38 UTC 2021 x86_64"
Implementation time: [0, 0]
Troubleshooting
This page is very useful: https://learnk8s.io/troubleshooting-deployments
Fin!
So, hopefully this rather long post helps new users of k8s to enable a simple shared file store using NFS.
And now that it's stopped raining, I think we will take the kids and dogs out for a walk!