Admin Admin podcast – A comedy of errors.

I’m a regular listener to the Admin Admin podcast and lurk on their Telegram channel, so I was quite chuffed when I was asked to appear with them and talk a little about one of the roles I have at work, where I build Linux “golden images” that are used globally for our customers…

Listen here:

Show notes:

Click listen from here

The guys were brilliant. We had a fair few technical issues whilst doing it, but they took it on the chin and were very chilled out about it. One thing I can say is you can really tell the quality of my mic is poor compared to the ones the guys are using, so apologies about that; I really wasn’t sitting at the back of the room.

I really had a great time and loads of fun, so thanks to Al, Stuart, Jerry and of course Jon (who asked me to appear).

Posted in Automation, Security | 1 Comment

Elasticsearch daily index size view

I could never find a decent way to forecast index sizes in Elasticsearch. In the Kibana GUI, under Stack Management, you can see the total index size, which you need to divide by the number of nodes that the index data is stored on. That gives you an idea, but you can’t visualise it.

So, to get some sort of rough average, I wrote some Python code to do what I needed. It does the following:

  • Pick an index
  • Work out the average size of a document in that index
  • Count the number of documents in the previous day
  • daily index size = (number of docs that day) x (average doc size)

It’s not 100% accurate, but it lets you see the index sizes and forecast some trends.
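As a rough sketch of the sum described above (this is not the code shipped in the docker image), the estimate boils down to a few lines of Python. The function name and the sample figures are made up for illustration, and it assumes you have already fetched the index's total size in bytes, its total document count and the previous day's document count from Elastic.

```python
def estimate_daily_index_size(total_size_bytes: int, total_docs: int, docs_yesterday: int) -> float:
    """Estimate the bytes ingested in a day as (docs that day) x (average doc size)."""
    if total_docs <= 0:
        # An empty index has no meaningful average document size.
        return 0.0
    avg_doc_size = total_size_bytes / total_docs
    return docs_yesterday * avg_doc_size


if __name__ == "__main__":
    # Hypothetical figures: a 50 GiB index holding 100 million documents,
    # of which 2 million arrived yesterday.
    size = estimate_daily_index_size(50 * 1024**3, 100_000_000, 2_000_000)
    print(f"estimated daily size: {size / 1024**3:.2f} GiB")
```

Because the average is taken over the whole index, a change in document shape over time will skew the estimate, which is one reason the figure is only approximate.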

A docker image

My Elastic stack is running in docker, so an image is included with a docker-compose file, but the Python code can be run wherever you want.

So you can run the code as often as you want and create a visualisation in Kibana to see the results.


So after a few days running, I can see we are ingesting 9GB a day, and also which index is ingesting the most.

So the green index is the one that is ingesting more data each day.

Posted in Automation, Containers, DevOps | Leave a comment

Simple view of Elasticsearch templates

I wanted a simple way to view the Elastic templates we use, to sense check what we had in place.

The following script is what I came up with. Basically it uses the awesome ‘jq’ to flatten the dict/hash returned from the Elastic API.

# ES is the host:port of the Elastic node, INDEX is the index to inspect
curl -s -XGET "http://$ES/$INDEX/_mapping" |\
jq -r --arg INDEX "$INDEX" '.[$INDEX] |
[leaf_paths as $path |
{"key": $path | join("."), "value": getpath($path)}] |
from_entries' |\
grep -v fields |\
sed -e 's/.type": "/: "/'

"@timestamp: "date",
"avg_doc_size: "float",
"daily_size: "float",
"doc_count: "long",
"index: "text",
"name: "text",
"query_count: "long",
"size_in_bytes: "long",
"tags: "text",

As you can see it’s much easier to view.
I think, being clever, a way to express the ‘keyword’ field of ‘text’ might be good.
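For anyone who prefers Python to jq, the flattening step can be sketched as below. This is only an illustration of the idea, not a drop-in replacement for the script: the function name and the cut-down sample mapping are invented, and lists are treated as plain values rather than walked the way jq's leaf_paths would.

```python
def flatten(node: dict, prefix: str = "") -> dict:
    """Return {dotted.path: leaf_value} for every leaf in a nested dict."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into nested objects, extending the dotted path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


# Hypothetical, cut-down mapping in the general shape a mapping API returns.
sample = {
    "mappings": {
        "properties": {
            "doc_count": {"type": "long"},
            "name": {"type": "text", "fields": {"keyword": {"type": "keyword"}}},
        }
    }
}

for path, value in flatten(sample).items():
    print(f"{path}: {value}")
```

Keeping the ‘fields’ entries in the flattened output, instead of filtering them out, is one possible way to show the ‘keyword’ sub-field of a ‘text’ field.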


Posted in Automation, DevOps | Leave a comment

kubernetes and shared storage


This post is aimed at those just starting to use Kubernetes. It’s not a production-worthy solution, though you can build upon it.

Installing kubernetes

I have used and tailored this post from IT Wonder Lab to create my home lab setup. It may fit the bill for you too. I like it as it uses Vagrant as the infrastructure build tool, along with Ansible to do the k8s installation and configuration.

Getting started

One of the first things that you want to do with k8s when you try out a multi-node cluster, rather than the easy single all-in-one node build, is to install a load of stuff that you have done before on a single node and scale it. As scaling is what it’s all about, right? 🙂 But you hit a brick wall when you realise you don’t have any shared storage available. Shared storage makes life much easier when you are clustering.

There is local storage you can use to at least get some sort of storage available: HostPath Volumes and Local Persistent Volumes. But these tie your pods (containers) to a specific node, which you probably didn’t notice when you used a single node k8s build.

But of course when you have a multi-node cluster, you find you can’t share files between your pods (containers) across those nodes very easily without some messing around, copying files between the nodes outside of k8s.

Readily usable, simple and cheap shared storage

NFS of course is a really simple solution to this and it is simple to install and run. Not everyone has NetApp ONTAP available to them 🙂

There are a lot of tutorials on setting up an NFS share, so I won’t go into that here.
For example, Tecmint do a good tutorial here

What you will need to be aware of, and do, is make sure that your exported NFS share allows access from your nodes.

This is because the NFS provisioner we will use allocates the volume share from the node itself, not the actual pod. So it’s a little like Local Storage. The pod talks to the local k8s node it is running on, and that node talks to the remote NFS server.

Here we can see that the NFS server (homeserver) allows access to two subnets one of which is the k8s node subnet.

$ showmount -e homeserver
Export list for homeserver:
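If you want to double check that a node's address really does fall inside one of the exported subnets, Python's standard ipaddress module can do it. The sketch below uses documentation-only addresses as stand-ins; swap in your own node IPs and the subnets from your export list. The function name is made up for this example.

```python
import ipaddress


def node_allowed(node_ip: str, exported_subnets: list) -> bool:
    """Return True if node_ip falls inside any of the exported subnets."""
    addr = ipaddress.ip_address(node_ip)
    return any(
        addr in ipaddress.ip_network(subnet, strict=False)
        for subnet in exported_subnets
    )


# Hypothetical values: two exported subnets and two candidate node addresses.
subnets = ["192.0.2.0/24", "198.51.100.0/24"]
for ip in ("198.51.100.11", "203.0.113.5"):
    print(ip, "allowed" if node_allowed(ip, subnets) else "NOT allowed")
```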

Installing a k8s package manager to do the heavy lifting – helm

Next we need to install helm – helm is brilliant, it allows reuse of code via a powerful package manager. Basically, someone else has done all the hard work of writing an installer for the NFS provisioner we will be using.

Install helm onto your control plane instance (manager) from the helm website

$ curl | sudo apt-key add -
$ sudo apt-get install apt-transport-https --yes
$ echo "deb all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
$ sudo apt-get update
$ sudo apt-get install helm

Installing the NFS Provisioner

Once helm is installed we need to install the NFS provisioner.

Here is the code repository:

But don’t panic, helm takes care of this for us.

This is all it takes to tell your control plane manager to make the storage provisioner available.

$ helm repo add nfs-subdir-external-provisioner

Once the provisioner is enabled, we need to actually install and configure it. There are a lot of options that can be selected but the ones below will give you a simple enough configuration to be able to play.

$ helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
--set nfs.server= \
--set nfs.path=/data/nfs_share \
--set persistence.storageClass=nfs-client \
--set persistence.size=10Gi

We have told the provisioner where our shared storage is (IP and share name), given it a k8s Storage Class name and said how big we want it to be. Really simple.

Now, let’s check to see if we can see that Storage Class.

$ kubectl get sc
nfs-client cluster.local/nfs-subdir-external-provisioner Delete Immediate true 34s

A tip here is to make that Storage Class the default for your cluster.

$ kubectl patch storageclass nfs-client -p '{"metadata": {"annotations":{"":"true"}}}' patched

Let’s check again and make sure it is the default.

$ kubectl get sc
nfs-client (default) cluster.local/nfs-subdir-external-provisioner Delete Immediate true 3m3s

Now you can see that the flag (default) is set.

Uh-oh, I broke something

Uh-oh, something is not right: the pod that manages the provisioner is not starting. You can see below it has been stuck in ContainerCreating for 6 minutes now. That is not right.

$ kubectl get pods
nfs-subdir-external-provisioner-76b4bc6f7d-sxvdr 0/1 ContainerCreating 0 6m21s

Looks like we have an issue somewhere, so let’s check the logs (events).

$ kubectl get events
52s Warning ProvisioningFailed persistentvolumeclaim/test-claim-client failed to provision volume with StorageClass "nfs-client": unable to create directory to provision new pv: mkdir /persistentvolumes/default-test-claim-client-pvc-0335846f-3e6f-4cc8-aef3-2699365165be: permission denied

Okay, that looks bad. We need to dig further through the events.

Warning FailedMount 52s kubelet MountVolume.MountDevice failed for volume "pvc-0335846f-3e6f-4cc8-aef3-2699365165be" : NFSDisk - mountDevice:FormatAndMount failed with mount failed: exit status 32

After a bit of Googling for that exit status 32, it seems the nodes do not have the underlying NFS storage drivers installed LOL, so let’s do that now on each node. I did the control plane manager too.

$ sudo apt-get install -y nfs-common

This is what happens when you take someone else’s Infrastructure as Code and do not check it properly. The VirtualBox image I used (ubuntu/focal64) did not have NFS installed. Of course we can update that Ansible code from IT Wonder Lab to do this for us next time around. 🙂

OK, let’s uninstall that helm chart and install it again.

$ helm uninstall nfs-subdir-external-provisioner
release "nfs-subdir-external-provisioner" uninstalled

$ helm install nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner --set nfs.server= --set nfs.path=/data/nfs_share --set persistence.storageClass=nfs-client --set persistence.size=10Gi
NAME: nfs-subdir-external-provisioner
LAST DEPLOYED: Sat Mar 13 12:23:53 2021
NAMESPACE: default
STATUS: deployed

Let’s check that pod again.

$ kubectl get pods
nfs-subdir-external-provisioner-76b4bc6f7d-5bjgg 1/1 Running 0 17s

Woo Hoo! simples!

Now create a PVC and pod to use this shared storage

The next step is to allocate this storage. This is done via a PVC, or Persistent Volume Claim.

PersistentVolumeClaim (PVC) is a request for storage by a user. … Claims can request specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or ReadWriteMany, see AccessModes).

We do this by creating some YAML to define the state we want.

Create a file like the one below. You can see that we allocated 10Gi on the NFS server, but we only want 1Mi of that for our test claim.

$ cat test-pvc.yml 
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-claim-client
  annotations: "test-path" # not required, depending on whether this annotation was shown in the storage class description
spec:
  storageClassName: nfs-client
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi

In the above we have done a few interesting things.

In the spec we named the Storage Class we created above.
We set the access mode, in this case read/write, allowing other pods to access the same storage (Many).
We also set its size, 1Mi.

Now we apply it to the cluster.

$ kubectl apply -f test-pvc.yml 
persistentvolumeclaim/test-claim-client created

Let’s check

$ kubectl get pvc
test-claim-client Bound pvc-1213d39e-b623-44db-a1f1-f3835197a212 1Mi RWX nfs-client 14s

If we look on the actual NFS server we can see it, and the UUIDs match

/data/nfs_share/ $ ls -la
total 12
drwxrwxrwx 3 root   root    4096 Mar 13 13:12 .
drwxr-xr-x 7 root   root    4096 Feb 26 16:08 ..
drwxrwxrwx 2 nobody nogroup 4096 Mar 13 13:11 default-test-claim-client-pvc-1213d39e-b623-44db-a1f1-f3835197a212
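Judging by the listing above, the folder name looks like the namespace, the claim name and the volume name joined with dashes. A tiny sketch under that assumption is below; the function name is made up for this example, and the provisioner may name folders differently depending on its version or settings.

```python
def expected_share_dir(namespace: str, claim_name: str, volume_name: str) -> str:
    """Build the folder name observed on the NFS server for a bound claim."""
    return "-".join([namespace, claim_name, volume_name])


# Values taken from the kubectl get pvc output in this post.
print(expected_share_dir(
    "default",
    "test-claim-client",
    "pvc-1213d39e-b623-44db-a1f1-f3835197a212",
))
```

That makes it easy to go from a claim shown by kubectl to the folder you should expect to find on the server.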

Let’s check inside, as we expect there is nothing there.

$ ls -l default-test-claim-client-pvc-1213d39e-b623-44db-a1f1-f3835197a212/
total 0

Access the storage from a pod

Let’s create a pod to use that and put something inside the storage.

Again, we create some YAML to describe a pod and its attributes.

$ cat test-pod.yml 
kind: Pod
apiVersion: v1
metadata:
  name: test-pod
spec:
  containers:
  - name: test-pod
    image: busybox
    command:
      - "/bin/sh"
    args:
      - "-c"
      - "touch /mnt/SUCCESS && exit 0 || exit 1"
    volumeMounts:
      - name: nfs-pvc
        mountPath: "/mnt"
  restartPolicy: "Never"
  volumes:
    - name: nfs-pvc
      persistentVolumeClaim:
        claimName: test-claim-client

I won’t go into too much detail on how this pod is created; that’s a whole blog post on its own. We use a small image (busybox) and get it to write a file to the pod’s filesystem. The pod’s filesystem has our new PVC mounted within it (at /mnt).

It will do this and exit (stop), so the file is actually written to NFS and we should see it there.

$ kubectl apply -f test-pod.yml 
pod/test-pod created

Let’s check it

$ kubectl get pods
NAME                                               READY   STATUS      RESTARTS   AGE
nfs-subdir-external-provisioner-76b4bc6f7d-5bjgg   1/1     Running     0          72m
test-pod                                           0/1     Completed   0          36s

Yes, the pod has run and completed. Let’s check the NFS share folder on the NFS server, and we can see the file was created.

$ ls -l default-test-claim-client-pvc-1213d39e-b623-44db-a1f1-f3835197a212/
total 0
-rw-r--r-- 1 nobody nogroup 0 Mar 13 13:35 SUCCESS

Let’s clear up that pod.

$ kubectl delete -f test-pod.yml 
pod "test-pod" deleted

$ kubectl get pods
NAME                                               READY   STATUS    RESTARTS   AGE
nfs-subdir-external-provisioner-76b4bc6f7d-5bjgg   1/1     Running   0          76m

Now check the NFS share folder on the NFS server again and we can see the file is still there, even though the pod is gone. So depending on how you define your PVC, you may need to do housekeeping.

$ ls -l default-test-claim-client-pvc-1213d39e-b623-44db-a1f1-f3835197a212/
total 0
-rw-r--r-- 1 nobody nogroup 0 Mar 13 13:35 SUCCESS

Let’s remove the PVC and see what happens then

$ kubectl delete -f test-pvc.yml 
persistentvolumeclaim "test-claim-client" deleted

$ kubectl get pvc
No resources found in default namespace.

What has happened to the folder on the NFS server?

$ ls -l default-test-claim-client-pvc-1213d39e-b623-44db-a1f1-f3835197a212/
ls: cannot access 'default-test-claim-client-pvc-1213d39e-b623-44db-a1f1-f3835197a212/': No such file or directory

Oh no! What has happened?

$ ls -l 
total 4
drwxrwxrwx 2 nobody nogroup 4096 Mar 13 13:35 archived-pvc-1213d39e-b623-44db-a1f1-f3835197a212

As you can see it has been archived. Take a peek inside…

$ ls -l archived-pvc-1213d39e-b623-44db-a1f1-f3835197a212/
total 0
-rw-r--r-- 1 nobody nogroup 0 Mar 13 13:35 SUCCESS

Ah phew! nothing lost.

What about scaling?

Let’s use a deployment definition and scale that up.
We use the definition below as it uses some environment variable magic to let each pod write to a file whose name contains the pod’s IP. That way, as the pods scale and increase in number, they won’t overwrite any of the other pods’ files.

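$ cat scaled-test-pod.yml 
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: test-nfs
  name: test-nfs
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test-nfs
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: test-nfs
    spec:
      containers:
      - name: test-pod
        image: busybox
        command:
          - "/bin/sh"
        args:
          - "-c"
          - "touch /mnt/SUCCESS-$MY_POD_IP && sleep 3600 || exit 1"
        # from
        env:
            - name: MY_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: MY_POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: MY_POD_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: MY_POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: MY_POD_SERVICE_ACCOUNT
              valueFrom:
                fieldRef:
                  fieldPath: spec.serviceAccountName
        volumeMounts:
          - name: nfs-pvc
            mountPath: "/mnt"
      volumes:
        - name: nfs-pvc
          persistentVolumeClaim:
            claimName: test-claim-client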

Let’s apply just one pod first and see the output

$  kubectl apply -f scaled-test-pod.yml 
deployment.apps/test-nfs created

$ kubectl get pods
NAME                                               READY   STATUS    RESTARTS   AGE
nfs-subdir-external-provisioner-76b4bc6f7d-5bjgg   1/1     Running   0          108m
test-nfs-7d6d956c4-h9t9z                           1/1     Running   0          19s

And check the underlying NFS data store

/data/nfs_share/default-test-claim-client-pvc-451af214-10b8-48c2-ac6a-704bbf9f6339 $ ls -la
total 8
drwxrwxrwx 2 nobody nogroup 4096 Mar 13 14:11 .
drwxrwxrwx 4 root   root    4096 Mar 13 13:46 ..
-rw-r--r-- 1 nobody nogroup    0 Mar 13 13:52 SUCCESS
-rw-r--r-- 1 nobody nogroup    0 Mar 13 14:11 SUCCESS-

So we have a file with the IP appended. 🙂

Let’s scale that up

$ kubectl get deployments
NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
nfs-subdir-external-provisioner   1/1     1            1           110m
test-nfs                          1/1     1            1           2m16s
$ kubectl scale --replicas=4 deployment test-nfs
deployment.apps/test-nfs scaled
$ kubectl get pods
NAME                                               READY   STATUS    RESTARTS   AGE
nfs-subdir-external-provisioner-76b4bc6f7d-5bjgg   1/1     Running   0          110m
test-nfs-7d6d956c4-h9t9z                           1/1     Running   0          3m5s
test-nfs-7d6d956c4-h9tzf                           1/1     Running   0          20s
test-nfs-7d6d956c4-qvvff                           1/1     Running   0          20s
test-nfs-7d6d956c4-zpg87                           1/1     Running   0          20s

So we have 4 pods running, let’s check the NFS storage…

total 8
drwxrwxrwx 2 nobody nogroup 4096 Mar 13 14:14 .
drwxrwxrwx 4 root   root    4096 Mar 13 13:46 ..
-rw-r--r-- 1 nobody nogroup    0 Mar 13 13:52 SUCCESS
-rw-r--r-- 1 nobody nogroup    0 Mar 13 14:11 SUCCESS-
-rw-r--r-- 1 nobody nogroup    0 Mar 13 14:14 SUCCESS-
-rw-r--r-- 1 nobody nogroup    0 Mar 13 14:14 SUCCESS-
-rw-r--r-- 1 nobody nogroup    0 Mar 13 14:14 SUCCESS-

Excellent, all 4 pods are writing to the shared NFS PVC we created.

Let’s clean up…

$ kubectl delete -f scaled-test-pod.yml --force
deployment.apps "test-nfs" force deleted

$ kubectl delete -f test-pvc.yml 
persistentvolumeclaim "test-claim-client" deleted

Additional NFS checks on the worker nodes

nfsiostat is useful to see the details on the NFS shares in use

$ nfsiostat
mounted on /var/lib/kubelet/pods/c9096597-9a3d-43a0-a280-241061ea89b9/volumes/

           ops/s       rpc bklog
           0.033           0.000

read:              ops/s            kB/s           kB/op         retrans    avg RTT (ms)    avg exe (ms)
                   0.000           0.000           0.000        0 (0.0%)           0.000           0.000
write:             ops/s            kB/s           kB/op         retrans    avg RTT (ms)    avg exe (ms)
                   0.000           0.000           0.000        0 (0.0%)           0.000           0.000

NFS checks on the NFS server

On the NFS Server, checking the proc space in later kernels is very useful.

$ sudo ls -la /proc/fs/nfsd/clients/
[sudo] password for nick:
total 0
drw------- 3 root root 0 Mar 1 21:14 .
drwxr-xr-x 3 root root 0 Feb 26 16:05 ..
drw------- 2 root root 0 Mar 1 21:14 5

In the above we can see the client ids, and we can check these to see information about those clients as seen below.

$ sudo cat /proc/fs/nfsd/clients/5/info
clientid: 0x942ec40460391c55
address: ""
name: "Linux NFSv4.2 k8s-n-1"
minor version: 2
Implementation domain: ""
Implementation name: "Linux 5.4.0-66-generic #74-Ubuntu SMP Wed Jan 27 22:54:38 UTC 2021 x86_64"
Implementation time: [0, 0]


The page is very useful


So, hopefully this rather long post helps new users of k8s to enable a simple shared file store using NFS.

And now, as it’s stopped raining, I think we will take the kids and dogs out for a walk. 🙂

Posted in Automation, Containers, DevOps, Uncategorized | 1 Comment

Blameless Culture – High performing teams

I was pointed at this wonderful article (though quite old in internet time, 2017 😁) about how high performing teams require a blameless culture to enable them to really fly.

So many good things written in that article.

Posted in Uncategorized | Leave a comment

Firewall rules for k8s containers

or what they really are: “Network Policies”…

Kudos to my DevOps colleague tenhishadow for this (I follow him on social media, you should too).

Nice and simple with diagrams.

Posted in Uncategorized | 1 Comment

Terraform pre-commit hooks

Great article here, go read it if you use Terraform…

I would have loved these as a VS Code extension. But pre-commit hooks are fine by me; wrap them into your pipeline (if you haven’t already).

Posted in Uncategorized | Leave a comment

Go 1.16 and embedded assets

Seriously how cool is this…

You can embed files and folder structures into your Go binaries as of v1.16.

Here is a fantastic write up on how seriously cool and easy it is…

More on embed here

Posted in Uncategorized | Leave a comment

PWS Weather Underground upload

Below is copied from the IBM site, mainly as it’s not found via Google and you can’t link directly to the page as it’s hidden away within Salesforce (not exactly user friendly).

Data good as of 07 Feb 2021.

Original link: But this needs to have some cookie available, as it needs to know you are looking at WUnderground and not The Weather Channel (the top left hand icon changes, as do the menus).

Here is my code as an example:

If this is helpful ping me back in the comments.

PWS Upload Protocol

Here is how to create a Personal Weather Station update to

To upload a weather condition, you make a standard HTTP GET request with the ID, PASSWORD and weather conditions as GET parameters.

URL

Here is the URL used in the uploading (if you browse here without parameters you will get a brief usage): 

GET parameters

NOT all fields need to be set, the _required_ elements are:

  • ID
  • dateutc

IMPORTANT all fields must be url escaped



  2001-01-01 10:32:35

  • if the weather station is not capable of producing a timestamp, our system will accept “now”.



  • list of fields:
action [action=updateraw] -- always supply this parameter to indicate you are making a weather observation upload
ID [ID as registered by]
PASSWORD [Station Key registered with this PWS ID, case sensitive]
dateutc - [YYYY-MM-DD HH:MM:SS (mysql format)] In Universal Coordinated Time (UTC) Not local time
winddir - [0-360 instantaneous wind direction]
windspeedmph - [mph instantaneous wind speed]
windgustmph - [mph current wind gust, using software specific time period]
windgustdir - [0-360 using software specific time period]
windspdmph_avg2m  - [mph 2 minute average wind speed mph]
winddir_avg2m - [0-360 2 minute average wind direction]
windgustmph_10m - [mph past 10 minutes wind gust mph ]
windgustdir_10m - [0-360 past 10 minutes wind gust direction]
humidity - [% outdoor humidity 0-100%]
dewptf - [F outdoor dewpoint F]
tempf - [F outdoor temperature]
* for extra outdoor sensors use temp2f, temp3f, and so on
rainin - [rain inches over the past hour] -- the accumulated rainfall in the past 60 min
dailyrainin - [rain inches so far today in local time]
baromin - [barometric pressure inches]
weather - [text] -- metar style (+RA)
clouds - [text] -- SKC, FEW, SCT, BKN, OVC
soiltempf - [F soil temperature]
* for sensors 2,3,4 use soiltemp2f, soiltemp3f, and soiltemp4f
soilmoisture - [%]
* for sensors 2,3,4 use soilmoisture2, soilmoisture3, and soilmoisture4
leafwetness - [%]
+ for sensor 2 use leafwetness2
solarradiation - [W/m^2]
UV - [index]
visibility - [nm visibility]
indoortempf - [F indoor temperature F]
indoorhumidity - [% indoor humidity 0-100]
  • Pollution Fields:
AqNO - [ NO (nitric oxide) ppb ]
AqNO2T - (nitrogen dioxide), true measure ppb
AqNO2 - NO2 computed, NOx-NO ppb
AqNO2Y - NO2 computed, NOy-NO ppb
AqNOX - NOx (nitrogen oxides) - ppb
AqNOY - NOy (total reactive nitrogen) - ppb
AqNO3 - NO3 ion (nitrate, not adjusted for ammonium ion) UG/M3
AqSO4 - SO4 ion (sulfate, not adjusted for ammonium ion) UG/M3
AqSO2 - (sulfur dioxide), conventional ppb
AqSO2T - trace levels ppb
AqCO - CO (carbon monoxide), conventional ppm
AqCOT - CO trace levels ppb
AqEC - EC (elemental carbon) – PM2.5 UG/M3
AqOC - OC (organic carbon, not adjusted for oxygen and hydrogen) – PM2.5 UG/M3
AqBC - BC (black carbon at 880 nm) UG/M3
AqUV-AETH - UV-AETH (second channel of Aethalometer at 370 nm) UG/M3
AqPM2.5 - PM2.5 mass - UG/M3
AqPM10 - PM10 mass - PM10 mass
AqOZONE - Ozone - ppb
softwaretype - [text] ie: WeatherLink, VWS, WeatherDisplay

Example URL

Here is an example of a full URL:

NOTE: not all fields need to be set
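As an illustration of putting such a request together, here is a small Python sketch that builds the query string from the fields listed above and URL-escapes them. It is not the code linked earlier in this post. The base URL is passed in as a parameter because it is not reproduced here, and the address, station ID and key in the example are placeholders, not real values.

```python
from urllib.parse import urlencode


def build_upload_url(base_url, station_id, station_key, dateutc="now", **observations):
    """Build a PWS upload URL with all GET parameters URL-escaped."""
    params = {
        "ID": station_id,
        "PASSWORD": station_key,
        "dateutc": dateutc,
        "action": "updateraw",
    }
    # Optional readings such as tempf, humidity or baromin.
    params.update(observations)
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint and credentials, for illustration only.
url = build_upload_url(
    "https://pws.example.invalid/upload",
    "KXX0000",
    "my-station-key",
    dateutc="2001-01-01 10:32:35",
    tempf=70.1,
    humidity=90,
)
print(url)
```

urlencode takes care of the escaping requirement, so the space and colons in the timestamp come out as + and %3A without any manual work.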

Response Text

The response from an HTTP GET request contains some debugging data.


response “success”

  • the observation was ingested successfully

response “Password and/or id are incorrect”

  • invalid user data entered in the ID and PASSWORD GET parameters

response <b>RapidFire Server</b><br><br>

  • the minimum GET parameters ID, PASSWORD (Station Key registered with this PWS ID, case sensitive), action, and dateutc were not set

RapidFire Updates

RapidFire Updates allow you to update weather station conditions at a frequency of up to one observation every 2.5 seconds. Web site visitors will see these observations change in real-time on the site.

A real-time update should look almost like the standard update.

However, the server to request is:, not

And, please add to the query string:


where rtfreq is the frequency of updates in seconds.

here is an example:
Posted in Uncategorized | Leave a comment

Scanning Docker containers for vulnerabilities

There have been a few posts in the past few months on how people have scanned docker images on the docker registry and found most have a lot of vulnerabilities. Many in fact have bitcoin miners and other such malware embedded. *shudder*

If you have production or internet facing containers, this is an obvious major problem.

Virtual Images

When using virtual machine images on AWS and Azure, I have always picked the known vendor images and then patched before using. You’d pick images provided directly by vendors like Red Hat, Canonical and AWS because you can trust their builds (a little more than you could some scoodle-doodle vendor or a bloke working from his parents’ basement). Then patch to current levels for that vendor image. As these come from major vendors we “expect” them to be free of malware, but I’d never expect them to be free of mis-configurations or out-dated packages.

For my employer I produce virtual machine images used across 100s of customers providing 1000s of servers a year, so you have to cross your t’s and dot your i’s.

This initial creation process is something that you would take for granted with virtual machines. Something your governance expects you to do at work.


Well, this leads us to containers. Again, these should be taken from well-known vendors and patched before use just like virtual machine images.

But are they?

Docker scanning tools

There are various tools you can install to scan docker images: Clair, Cilium, Anchore, et al. In fact, docker desktop edge has a scanner built in: docker scan

I chose Clair. There are docker-compose and Kubernetes deploy files from and I tried to get those working in a semi-production way, but in the end I rolled my own. My deployment is more simplistic than others out there but gives you a quick working version to play with. To get your toe in the water, so to speak.

There is a nice simple which you call like this: ./ myimage or ./ alpine

This will do all the work and place a pretty HTML report, accessible on the node on port 6061.

OMG, look how bad my images are!!!

That was my first thought. Then, after a quick cuppa, I thought: hang on, I didn’t install that much in my container. So how did all this happen?

Let’s look at the original image

First, how old is the image you are using?

$ docker image ls python:slim
python slim 2f14080158f4 8 days ago 149MB

My python:slim is 8 days old. Surely, that is okay?

Next, re-run the scan on the original image file. My bet is that you inherited a lot of outdated packages.

./ python:slim

Ugh, no it’s not okay. But should we blame python:slim ?

Placing the blame

If we go to the docker registry and click on the slim tag we can look at the Dockerfile for that image.


Leads to this

Here you can see how the actual image is made up and what they install on to it to make it actually the python image.

It is actually built from debian:buster-slim.

So let’s scan debian:buster-slim and see what that reports…

$ docker pull debian:buster-slim
buster-slim: Pulling from library/debian
852e50cd189d: Already exists
Digest: sha256:062bbd9a1a58c9c5b8fc9d83a206371127ef268cfcc65f1a01227c6faebdb212
Status: Downloaded newer image for debian:buster-slim

Looks like I pulled a newer version down, this will be interesting…

$ docker image ls debian:buster-slim
debian buster-slim 79fa6b1da13a 2 weeks ago 69.2MB

So, our slimmed down debian image is 2 weeks old, there shouldn’t be that many vulnerabilities in it after 2 weeks?

Oops, looks like the majority of the vulnerabilities are inherited from the debian buster image. That’s a shame. Though no critical ones.

I should point out a few caveats here: we didn’t go through each one to determine the risk that each vulnerability poses, nor match the two vulnerability reports to see common inherited ones and where there were differences.


We didn’t test for malware, just outdated patches. That’s a whole other story/blog post.
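Matching the two reports needn't be hard once you have the vulnerability IDs out of each one. The sketch below shows the idea with plain Python sets. The function name is made up, the IDs are placeholders rather than real findings, and pulling the IDs out of the scanner's report is left out as it depends on the report format.

```python
def compare_reports(base_ids, derived_ids):
    """Split vulnerability IDs into inherited, newly added and no longer present."""
    base, derived = set(base_ids), set(derived_ids)
    return {
        "inherited": sorted(base & derived),         # in both the base and the derived image
        "added_by_derived": sorted(derived - base),  # only in the derived image
        "gone_in_derived": sorted(base - derived),   # only in the base image
    }


# Placeholder IDs standing in for a debian:buster-slim and a python:slim report.
base_report = ["VULN-0001", "VULN-0002", "VULN-0003"]
derived_report = ["VULN-0002", "VULN-0003", "VULN-0004"]

for group, ids in compare_reports(base_report, derived_report).items():
    print(f"{group}: {', '.join(ids) or 'none'}")
```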

What next? It’s easy, just Patch.

It’s that simple. Patch. (with a big ‘P’ because it’s important)

You let your mobile phone pull down updates to the apps on it. You let your Windows desktop pull down and install patches, so why not your containers?

When working in development and you need to pull down an image, add a layer which is the latest patches. Tick that box. You don’t need to do this at a lower level layer (e.g. the debian buster-slim); you can do this before you put your app on, at the python:slim layer in our example above.

It doesn’t take much. Just a few minutes, get a cuppa.

apt update && apt upgrade

or grab a cuppa, but a sword fight is loads more fun.

This way you have reduced your vulnerability attack surface as low as you can plus you’ve patched a bunch of bugs at the same time.

If you happen to need to roll out a bunch of related images, say all based on python, then create your own python base image and use that each time, so all your app images are based on that one. You are adding version control to your images 🙂 and you know exactly what issues, whether they are vulnerabilities or bugs, you have with your images.

This is a great example of what you should be doing in your development pipeline or CI/CD infrastructure. No pulling of raw images off the internet, but only using known “certified” images from your in-house registry.

In my opinion pulling raw images should be banned and only done as part of the DevSecOps pipeline, before developers get hold of them. Sure, in development it’s fine to quickly throw something together, but in pre-production and production never allow use of raw images.


  1. Only use known vendor images, or even better roll your own
  2. Patch
  3. Only allow your dev teams to consume known images from your in-house registry and make sure you are scanning here as well, it keeps those anxious Sec team members happy.
  4. Patch (did I mention that one?) and keep patching.
Ha Ha, not even close…

Have fun and be safe out there.

Posted in Automation, Containers, DevOps, Security | Leave a comment