Dealing with a Lack of Entropy on your OpenShift Cluster

Posted by Mark DeNeve on Wednesday, March 20, 2024

Introduction

The Linux kernel supplies two sources of random numbers: /dev/random and /dev/urandom. These character devices can supply random numbers to any application running on your machine. The random numbers supplied by the kernel on these devices come from the kernel's random-number entropy pool. The entropy pool contains "sufficiently random" numbers, meaning they are suitable for use in things like secure communications. But what happens if the entropy pool runs out of numbers? If you are reading from the /dev/random device, your application will block while waiting for new numbers to be generated. Alternatively, the /dev/urandom device is non-blocking and will create random numbers on the fly, re-using some of the entropy in the pool. This can lead to numbers that are less random than some use cases require.
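You can observe both behaviors directly from the shell. The commands below are a quick illustration (note that on kernels 5.6 and later the random subsystem was reworked, and the entropy estimate typically reports a constant pool size rather than a draining counter):

```shell
# Report the kernel's current entropy estimate, in bits.
cat /proc/sys/kernel/random/entropy_avail

# /dev/urandom never blocks: this read of 16 random bytes returns immediately.
head -c 16 /dev/urandom | od -An -tx1
```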

What do you do if you need high-entropy random numbers, but your applications are using them up faster than they are generated? The rng-tools software can be used to refill the entropy pool, using environmental noise and the hardware generators available in modern CPUs. The rng-tools package contains the rngd daemon to handle this. On traditional hosting platforms, you would run this tool as a systemd service or daemon, but what about in the land of Kubernetes and OpenShift?

Since OpenShift does not allow us to install applications into the Red Hat CoreOS operating system, how do we get rngd running in our cluster? The quickest way to achieve this is a Kubernetes DaemonSet. This blog post will walk through how to containerize the utility and get it running on all worker nodes in the cluster.

Kubernetes Daemonsets

So what are Kubernetes DaemonSets? A DaemonSet ensures that all specified nodes run a copy of a Pod. As nodes are added to the cluster, Pods are added to them. As nodes are removed from the cluster, those Pods are garbage collected. Deleting a DaemonSet removes the running Pods from all nodes. You can think of DaemonSets as a way to distribute a daemon to any node in a cluster that meets a given set of criteria. You can use nodeSelectors and node affinity to define the set of nodes you wish to run a DaemonSet on.
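For example, a nodeSelector in the Pod template restricts scheduling to nodes carrying a given label. The fragment below is illustrative only; the DaemonSet we deploy later in this post omits it and therefore runs on all worker nodes:

```yaml
# Hypothetical DaemonSet Pod template fragment: schedule only on nodes
# carrying the standard OpenShift worker role label.
spec:
  template:
    spec:
      nodeSelector:
        node-role.kubernetes.io/worker: ""
```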

Source Code

The source code for the Dockerfile as well as the DaemonSet that we will deploy in this blog post can be found here: rng-daemonset.

Build Container Image and push

The rng-tools package does not ship with a default container image, so we will create one to use in our cluster. We will use the fedora-minimal image as a base to keep the overall container image small. To build and deploy this container, start by creating a Dockerfile with the following contents:

FROM fedora-minimal
RUN microdnf install -y rng-tools && microdnf clean all
CMD exec /bin/bash -c "trap : TERM INT; sleep infinity & wait"

Use the included Dockerfile to generate a container image with the rng-tools installed, making sure to update <my registry> with a registry that you wish to use, and that your OpenShift cluster trusts for deployment. See Configuring the Registry Operator if you need additional guidance on this.

$ podman build -t <my registry>/rng-tools:latest .
# push the created image to your registry
$ podman push <my registry>/rng-tools:latest

Preparing the rng-daemonset.yaml file

Next, we will create an rng-daemonset.yaml file with the following contents:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: rng-daemon
  labels:
    k8s-app: rng-daemon
spec:
  selector:
    matchLabels:
      name: rng-daemon
  template:
    metadata:
      labels:
        name: rng-daemon
    spec:
      containers:
        - name: rngutils
          image: registry.xphyrlab.net/rng-tools/rng-tools:latest
          command: ["/sbin/rngd"]
          args: ["-xjitter", "-xhwrng", "-xpkcs11", "-xrtlsdr", "-f"]
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: Always
          securityContext:
            runAsUser: 0
            capabilities:
              add: ["SYS_ADMIN"]
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      schedulerName: default-scheduler
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 25%
  revisionHistoryLimit: 10

Be sure to edit the spec.template.spec.containers[0].image value to point to the image and registry that you created in the previous step, Build Container Image and push.

          image: <my registry>/rng-tools/rng-tools:latest

NOTE: The command line options in the rng-daemonset.yaml file are designed for Intel processors that support the RDRAND instruction. You may need to tweak the options to meet your hardware needs. See rng-tools for other options.
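As a quick sanity check, you can verify whether the CPUs on a node advertise the RDRAND instruction by looking at the flags in /proc/cpuinfo. This is a generic Linux check, not specific to OpenShift:

```shell
# Check whether the CPU advertises RDRAND (x86 only; other architectures
# will simply report it as unavailable).
if grep -qw rdrand /proc/cpuinfo; then
    echo "rdrand available"
else
    echo "rdrand not available"
fi
```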

Deploy the DaemonSet

By default, the DaemonSet we created above will deploy to all worker nodes in your cluster. We have not used any selectors to narrow down the nodes or to allow it to run on control plane nodes. We will create a new project for this app to run in and then deploy the DaemonSet:

$ oc new-project rng-tools
# deploy the daemonset
$ oc create -f rng-daemonset.yaml

If you check the status of the pods, you will find that they are not deploying. This is because the rngd utility needs to run as UID 0 and also requires the CAP_SYS_ADMIN capability. We defined this as part of the DaemonSet file, but we have not granted the permission to the default serviceAccount in the rng-tools namespace. To grant it, we will add the default serviceAccount in the rng-tools namespace to the privileged SCC, which will allow the pods to deploy with the permissions they require.

NOTE: If you want to make this more secure, you could create a new SCC that only allows running as UID 0 and grants just CAP_SYS_ADMIN; however, that is a blog post for another day.

Assigning the privileged SCC to our serviceAccount

You will need to grant the default serviceAccount the privileged SCC:

$ oc adm policy add-scc-to-user privileged system:serviceaccount:rng-tools:default

If we now check the status of the DaemonSet, we will see that the pods have been successfully deployed to all worker nodes:

$ oc get daemonset
NAME         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
rng-daemon   3         3         3       3            3           <none>          135m

Conclusion

In this blog post we discussed the /dev/random and /dev/urandom devices and one way to help refill the kernel entropy pool using rngd deployed as a DaemonSet in your OpenShift cluster.

References

RNG-Tools

RNG-Tools DaemonSet