Using cert-manager and Let's Encrypt with the Wildcard route in OCP

Posted by Mark DeNeve on Monday, August 14, 2023

Introduction

So you have successfully set up your very own OpenShift cluster, and now you want to access the UI. You open a web browser and are greeted with a certificate warning.

You can click “Accept the Risk”, but what if there was a better way? Well, depending on your ability to access DNS and make changes to your DNS records, there just might be! This blog post will take you through the process of using the cert-manager Operator for Red Hat OpenShift to configure the wildcard ingress certificate for your cluster. We will use the Let’s Encrypt service to retrieve a valid signed certificate and keep it up to date within your cluster. As an added bonus, we will also update the API certificate so that it is signed by a valid CA as well.

Requirements

In order to configure cert-manager to manage the API and ingress wildcard certificates, we will need a few things:

  • OpenShift 4.13 Cluster or later - while this may work with older versions, it has been tested against OCP 4.13
  • Cluster Administrator access
  • Ownership of a Domain Name - we will use “example.com” in this blog post
  • A DNS provider that has API support - must be supported by cert-manager (see: DNS Providers for supported providers)

NOTE: While you could use an HTTP challenge type with OpenShift and cert-manager, it will NOT supply wildcard certs via this process. (See HTTP-01 challenge for additional details.) Because OpenShift uses wildcard certificates for the default ingress router, you will need the ability to create and update DNS records for this process to work.

Install the cert-manager Operator for Red Hat OpenShift

We will start by installing the cert-manager Operator

  1. Log in to the OpenShift Container Platform web console
  2. Navigate to Operators → OperatorHub
  3. Enter “cert-manager Operator for Red Hat OpenShift” into the filter box
  4. Select the “cert-manager Operator for Red Hat OpenShift” and click Install
  5. On the Install Operator page, select all defaults, and click Install

Wait for the Operator to install before proceeding to the next section.
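If you prefer to drive the install from the CLI, the same steps can be expressed as OLM resources. This is a sketch based on the operator's documented defaults at the time of writing; the channel and package names may differ in your catalog, so verify them with `oc get packagemanifests` before applying:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager-operator
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: cert-manager-operator
  namespace: cert-manager-operator
spec:
  targetNamespaces:
  - cert-manager-operator
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: openshift-cert-manager-operator
  namespace: cert-manager-operator
spec:
  # Channel/package names assumed from the Red Hat catalog; confirm locally.
  channel: stable-v1
  name: openshift-cert-manager-operator
  source: redhat-operators
  sourceNamespace: openshift-marketplace
```

Applying this with `oc apply -f` is equivalent to clicking through OperatorHub with the defaults.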

Using Split DNS with Cert-Manager

If you use a Split-horizon DNS configuration, you will need to configure cert-manager to use an external DNS provider when validating new certificate requests. To do this, we will update the default CertManager configuration to allow for this “Split DNS” setup. If you are not using Split DNS, you can skip this step and move on to Create Cluster Issuer.

Start by editing the cluster certmanager instance:

$ oc edit certmanager cluster

Now add the following section to your configuration, which will force cert-manager to use external DNS servers (in this case 1.1.1.1 and 8.8.8.8) to validate that the DNS challenge has been properly configured:

spec:
  controllerConfig:
    overrideArgs:
      - '--dns01-recursive-nameservers=8.8.8.8:53,1.1.1.1:53'
      - '--dns01-recursive-nameservers-only'

Save the file and then validate that the changes have been made.

$ oc describe certmanager cluster
...
Status:
  Conditions:
    Last Transition Time:  2023-07-25T15:48:12Z
    Status:                True
    Type:                  cert-manager-controller-deploymentAvailable

Keep in mind, this configuration is only required if you are using Split-horizon DNS.

Create Cluster Issuer

Next up we need to create a ClusterIssuer. A ClusterIssuer defines a certificate provider that can supply signed certificates to the cluster. Red Hat supports the following DNS providers:

  • AWS Route53
  • Azure DNS
  • Google DNS

The instructions below will show how to configure either DigitalOcean or CloudFlare. You will notice that neither of these is on the list of supported providers. While this configuration will work, you cannot get support from Red Hat for the DigitalOcean or CloudFlare providers. There are other options for DNS-01 solvers available as part of the upstream project, documented here: Configuring DNS01 Challenge Provider. You can review the docs for other solvers that may meet your needs.

NOTE: You only need to configure one provider; do not configure both a DigitalOcean and a CloudFlare provider.

Configure DNS-01 resolver - Digital Ocean

In this section we will create a DigitalOcean provider. If you are looking to use CloudFlare, skip to the next section. We will start by creating a secret to store our Digital Ocean access token in the “cert-manager” namespace:

provider-secret.yaml

apiVersion: v1
kind: Secret
metadata:
  name: digitalocean-dns
  namespace: cert-manager
type: Opaque
stringData:
  # insert your DO access token here
  access-token: "access token here"

NOTE: You can create a new Digital Ocean API Token by going to Applications & API if you don’t already have one.

Now, we will create ClusterIssuer instances for both LetsEncrypt staging and production. This will allow us to test our configuration before hitting the production endpoint. The production endpoint will rate-limit you if you make too many failed requests, so it’s always a good idea to test first. Be sure to update the spec.acme.email field with your email address or you will get errors.

staging-clusterissuer.yaml

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    email: [email protected]
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-staging-issuer-account-key
    solvers:
    - dns01:
        digitalocean:
          tokenSecretRef:
            name: digitalocean-dns
            key: access-token

prod-clusterissuer.yaml

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: [email protected]
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod-issuer-account-key
    solvers:
    - dns01:
        digitalocean:
          tokenSecretRef:
            name: digitalocean-dns
            key: access-token

Configure DNS-01 resolver - CloudFlare

In this section we will create a CloudFlare provider. We will start by creating a secret to store our CloudFlare API token in the “cert-manager” namespace:

provider-secret.yaml

apiVersion: v1
kind: Secret
metadata:
  name: cloudflare-dns
  namespace: cert-manager
type: Opaque
stringData:
  # insert your cloudflare access token here
  api-token: "access token here"

NOTE: You can create a new CloudFlare API Token by going to User API Tokens if you don’t already have one. Make sure when creating your token that you scope it to “Edit zone DNS” so the token has the proper permissions.

Now, we will create ClusterIssuer instances for both LetsEncrypt staging and production. This will allow us to test our configuration before hitting the production endpoint. The production endpoint will rate-limit you if you make too many failed requests, so it’s always a good idea to test first. Be sure to update the spec.acme.email field with your email address or you will get errors.

staging-clusterissuer.yaml

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    email: [email protected]
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-staging-issuer-account-key
    solvers:
    - dns01:
        cloudflare:
          apiTokenSecretRef:
            name: cloudflare-dns
            key: api-token

prod-clusterissuer.yaml

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: [email protected]
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod-issuer-account-key
    solvers:
    - dns01:
        cloudflare:
          apiTokenSecretRef:
            name: cloudflare-dns
            key: api-token

NOTE: ClusterIssuer resources are not namespaced, and are available across the cluster.
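If you ever need an issuer scoped to a single namespace instead, cert-manager also provides a namespaced Issuer kind with an identical spec. A minimal sketch, reusing the CloudFlare solver from above (the "my-app" namespace is a made-up example, and the referenced secret must live in that same namespace):

```yaml
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: letsencrypt-staging
  namespace: my-app   # hypothetical namespace; only Certificates here can use it
spec:
  acme:
    email: you@example.com
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-staging-issuer-account-key
    solvers:
    - dns01:
        cloudflare:
          apiTokenSecretRef:
            name: cloudflare-dns   # must exist in the my-app namespace
            key: api-token
```

For a cluster-wide ingress certificate like ours, the cluster-scoped ClusterIssuer is the right choice.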

Apply the ClusterIssuers

With your provider secret and ClusterIssuer files created, apply them to the cluster:

$ oc create -f provider-secret.yaml
secret/cloudflare-dns created
$ oc create -f staging-clusterissuer.yaml
clusterissuer.cert-manager.io/letsencrypt-staging created
$ oc create -f prod-clusterissuer.yaml
clusterissuer.cert-manager.io/letsencrypt-prod created

Validating our Cluster Issuer

With the ClusterIssuers created, we will validate that they are ready for use. Ensure that they show READY as “True” before continuing.

$ oc get clusterissuer
NAME                  READY   AGE
letsencrypt-prod      True    61m
letsencrypt-staging   True    18h

Create Certificate Requests

We are going to secure both our API and our wildcard route. To achieve this, we need to create a Certificate resource in both the openshift-ingress and openshift-config namespaces. You will need to update the templates below to match your configuration. I will use “mycluster.example.com” as the cluster name and “example.com” as the domain name in this example:

  • *.${DOMAIN} - this should be *.example.com
  • *.apps.${CLUSTER-NAME} - this would be something like *.apps.mycluster.example.com
  • api.${CLUSTER-NAME} - this would be api.mycluster.example.com

If you are unsure of your cluster name, you can run oc status and get the API host name from the output:

$ oc status
In project openshift-ingress on server https://api.mycluster.example.com:6443
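If you prefer to derive these values by hand, the substitutions are simple string operations. A minimal sketch using only POSIX parameter expansion (the API URL below is a made-up example standing in for your own):

```shell
# Hypothetical API URL; substitute the value shown by "oc status".
API_URL="https://api.mycluster.example.com:6443"

# Strip the scheme plus the "api." prefix, then the port, leaving the cluster name.
CLUSTER_NAME="${API_URL#https://api.}"
CLUSTER_NAME="${CLUSTER_NAME%%:*}"

echo "$CLUSTER_NAME"           # mycluster.example.com
echo "*.apps.${CLUSTER_NAME}"  # *.apps.mycluster.example.com
echo "api.${CLUSTER_NAME}"     # api.mycluster.example.com
```

The last three lines print the values to plug into the Certificate templates below.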

Create the files below making sure to update the spec.commonName and spec.dnsNames as outlined above:

openshift-ingress-wildcard.yaml

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: router-certs-letsencrypt
  namespace: openshift-ingress
  labels:
    app: cert-manager
spec:
  secretName: router-certs-letsencrypt
  secretTemplate:
    labels:
      app: cert-manager
  duration: 2160h # 90d
  renewBefore: 720h # 30d
  subject:
    organizations:
      - Org Name
  commonName: '*.${DOMAIN}'
  privateKey:
    algorithm: RSA
    encoding: PKCS1
    size: 2048
    rotationPolicy: Always
  usages:
    - server auth
    - client auth
  dnsNames:
    - '*.${DOMAIN}'
    - '*.apps.${CLUSTER-NAME}'
  issuerRef:
    name: letsencrypt-staging
    kind: ClusterIssuer

openshift-api.yaml

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: api-certs-letsencrypt
  namespace: openshift-config
  labels:
    app: cert-manager
spec:
  secretName: api-certs-letsencrypt
  secretTemplate:
    labels:
      app: cert-manager
  duration: 2160h # 90d
  renewBefore: 720h # 30d
  subject:
    organizations:
      - Org Name
  commonName: 'api.${CLUSTER-NAME}'
  privateKey:
    algorithm: RSA
    encoding: PKCS1
    size: 2048
    rotationPolicy: Always
  usages:
    - server auth
    - client auth
  dnsNames:
    - 'api.${CLUSTER-NAME}'
  issuerRef:
    name: letsencrypt-staging
    kind: ClusterIssuer

NOTE: we are starting off using the “letsencrypt-staging” issuer; we will update this before making any changes to the cluster. We are also requesting a 90-day certificate and telling cert-manager to automatically renew it when there are 30 days left, in line with Let’s Encrypt best practices.
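To make the renewal timing concrete, the arithmetic behind those two fields looks like this:

```shell
# duration is 2160h (90 days); renewBefore is 720h (30 days).
# cert-manager attempts renewal once the certificate is duration - renewBefore old.
echo $(( 2160 - 720 ))         # hours until first renewal attempt: 1440
echo $(( (2160 - 720) / 24 ))  # i.e. 60 days into the 90-day lifetime
```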

With your Certificate files created, we will apply them to the cluster:

$ oc create -f openshift-ingress-wildcard.yaml
$ oc create -f openshift-api.yaml

Check to ensure that the certificates are properly created:

$ oc describe certificate api-certs-letsencrypt -n openshift-config
$ oc describe certificate router-certs-letsencrypt -n openshift-ingress

We can also check to ensure that the TLS secrets were created:

$ oc get secret api-certs-letsencrypt -n openshift-config
NAME                    TYPE                DATA   AGE
api-certs-letsencrypt   kubernetes.io/tls   2      110m
$ oc get secret router-certs-letsencrypt -n openshift-ingress
NAME                       TYPE                DATA   AGE
router-certs-letsencrypt   kubernetes.io/tls   2      127m

Switch to prod for issuing your certs

Assuming that your certificates were issued properly from the staging instance, we can now switch over to using the prod instance. Edit each of the certificates and change spec.issuerRef.name to letsencrypt-prod:

$ oc edit certificate api-certs-letsencrypt -n openshift-config
# edit the file here and save it
$ oc edit certificate router-certs-letsencrypt -n openshift-ingress
# edit the file here and save it

Check to ensure that the certificates are properly re-created using the prod issuer:

$ oc describe certificate api-certs-letsencrypt -n openshift-config
$ oc describe certificate router-certs-letsencrypt -n openshift-ingress

Also validate and ensure that the TLS secrets were re-created:

$ oc get secret api-certs-letsencrypt -n openshift-config
NAME                    TYPE                DATA   AGE
api-certs-letsencrypt   kubernetes.io/tls   2      110m
$ oc get secret router-certs-letsencrypt -n openshift-ingress
NAME                       TYPE                DATA   AGE
router-certs-letsencrypt   kubernetes.io/tls   2      127m

Configure OCP to use the Certificates

Assuming that your certificates were properly issued from the Prod instance in the previous step, we can now go ahead and update our cluster to use the new certificates.

Configure the Ingress Wildcard Certificate

$ oc patch ingresscontroller default -n openshift-ingress-operator --type=merge --patch='{"spec": { "defaultCertificate": { "name": "router-certs-letsencrypt" }}}' --insecure-skip-tls-verify

OpenShift will update the Ingress routers with the new certificate. This may take a few minutes.
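For reference, the patch above merges the following into the default IngressController resource; editing it directly with `oc edit ingresscontroller default -n openshift-ingress-operator` and adding this to the spec achieves the same result:

```yaml
spec:
  defaultCertificate:
    name: router-certs-letsencrypt   # the TLS secret created in openshift-ingress
```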

Validating the Ingress Wildcard Certificate

We can then check to ensure that the new LetsEncrypt certificate is available by running the following command:

$ curl https://console-openshift-console.apps.mycluster.example.com

You should get back a bunch of HTML, and no errors about the certificate not being valid.

Configure the API Controller

$ export OKD_API=$(oc whoami --show-server --insecure-skip-tls-verify | cut -f 2 -d ':' | cut -f 3 -d '/' | sed 's/-api././')
$ oc patch apiserver cluster --type merge --patch="{\"spec\": {\"servingCerts\": {\"namedCertificates\": [ { \"names\": [  \"$OKD_API\"  ], \"servingCertificate\": {\"name\": \"api-certs-letsencrypt\" }}]}}}" --insecure-skip-tls-verify
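The cut/sed pipeline in the export above just reduces the API server URL to a bare hostname (the trailing sed appears to handle clusters whose API host takes the form "name-api.domain"). Run against a sample URL, no cluster needed, it behaves like this:

```shell
# Sample URL standing in for the output of "oc whoami --show-server".
echo "https://api.mycluster.example.com:6443" \
  | cut -f 2 -d ':' \
  | cut -f 3 -d '/' \
  | sed 's/-api././'
# api.mycluster.example.com
```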

Validating the API Certificate

After applying the changes to the API Controller, we need to check and validate that the kube-apiserver is healthy. Run the command below and wait until AVAILABLE shows True and PROGRESSING shows False.

$ oc get clusteroperators kube-apiserver --watch
NAME             VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
kube-apiserver   4.13.0     True        False         False      145m

Updating your kubeadmin file

After updating your cluster’s API server to use a new certificate, you may find that the kubeadmin file that was created as part of your initial cluster install no longer works:

$ oc status
Unable to connect to the server: x509: certificate signed by unknown authority

This is because the kubeadmin file contains a copy of the cluster’s self-signed certificate, which is no longer what the API server is returning. In order to use your kubeadmin file with the new API cert, you will need to remove the self-signed cert from the file.

# Make a backup of your $KUBECONFIG file before proceeding!
$ sed -i "/certificate-authority-data/d" $KUBECONFIG
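Here is a sketch of what that sed invocation does, demonstrated against a mock kubeconfig file (real kubeconfigs contain many more fields; on macOS the in-place flag needs a suffix argument, e.g. `sed -i ''`):

```shell
# Build a minimal kubeconfig-style file to demonstrate against.
cat > /tmp/demo-kubeconfig <<'EOF'
clusters:
- cluster:
    certificate-authority-data: LS0tRkFLRS1EQVRBLS0=
    server: https://api.mycluster.example.com:6443
  name: mycluster
EOF

# Delete the line embedding the old self-signed CA; everything else is untouched.
sed -i "/certificate-authority-data/d" /tmp/demo-kubeconfig
cat /tmp/demo-kubeconfig
```

After the sed, the server entry survives but the stale CA line is gone, so clients fall back to the system trust store, which already trusts Let’s Encrypt.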

You can now use your kubeadmin file to authenticate against the cluster.

$ oc status
In project openshift-ingress on server https://api.mycluster.example.com:6443

Conclusion

By leveraging cert-manager and Let’s Encrypt, we now have a set of TLS certificates in our OpenShift cluster that are automatically kept up to date, allowing for a seamless connection to the OpenShift UI as well as to any application you choose to host on the cluster.