Creating a multi-host OKD Cluster
By Mark DeNeve
Introduction
In the last two posts, I have shown you how to get an OKD All-in-One cluster up and running. Since it was an “All-in-One” cluster, there was no redundancy in it, and there was no ability to scale out. OKD and Kubernetes work best in a multi-server deployment, creating redundancy and higher availability along with the ability to scale your applications horizontally on demand. This final blog post is going to outline the steps to build a multi-host cluster. It will build on what we have done in the previous posts but extend the process out to make a multi-node cluster and add working SSL certificates.
In order to follow along with this post, you are going to need something I cannot easily provide: a domain name, and an understanding of how to update DNS records for that domain. If this is not something you have, or can get access to, but you still want to stand up your own OpenShift cluster, I suggest taking a look at my previous posts on getting OKD up and running as a single node.
Speaking of those old posts, if you haven’t already read my previous blog posts on building a test cluster in Azure, I suggest reading them. Not only to boost my page visits, but also because this final post will build on the previous posts. The post OpenShift, Azure and Ansible is critical, as we will be using the docker container introduced there to build out this new multi-host cluster. It also walks you through the creation of an Azure credentials file that is required for the automatic creation of your hosts.
Keep in mind that while the cluster we will build here is more fault tolerant, I still would not run a production service/application on this. You certainly could, but running an OpenShift cluster for Production usage requires some care and feeding that you should be ready to do long term. You are still missing a few important things, like persistent storage, a highly available routing infrastructure and a plan to keep your cluster patched long term. Finally, keep in mind that if you opt for running a multi-host cluster your cost is going to go up. Each host you run is an additional charge from Azure. Make sure you are ready to pay the cost of running in this mode before proceeding. Using the default values in the inventory file this cluster will cost you about $150 a month to run.
The end result
When we are all done here, you will have the following:
- an OKD cluster with a master controller, infrastructure node and two compute nodes
- an OKD console with a valid SSL certificate for console.okd.<your domain name here>
- a wild-card router setup to allow you to host apps inside OKD at the URL https://<appname>.apps.okd.<your domain name here>
- a potential Azure bill (this setup will run you about $150/mo)
- the knowledge that you set up your very own OpenShift cluster!
Before We Begin
We are going to start by cloning the gitlab repo with some base files in it that will make this easier. If you have this repo from trying out previous posts, be sure to update it with the latest code by running the following command from within the source directory:
$ git pull
If this is your first time trying this out, run the following command:
$ git clone https://gitlab.com/xphyr/azure_okd.git && cd azure_okd
I will refer to this directory as your “working directory” for the remainder of this post.
Securing with Let's Encrypt
If you tried out some of my previous posts you may have noticed that you get prompted to accept an unsigned certificate before logging in. By default OKD/OpenShift generates a self-signed certificate, which is untrusted by your browser. While this works for simple testing, if you plan to use the test cluster for longer periods of time, it is best to have a trusted certificate in place. The easiest way I have found to do this is to leverage Let's Encrypt, a free SSL certificate authority.
Using a tool called Lego we are going to generate a certificate that handles both the API and the wildcard hosting route. Follow the instructions on the install page to get lego installed for your specific host.
In order to make this work smoothly we are going to create one certificate that covers two different domain/hostnames:
- console.okd.<your domain name here>
- *.apps.okd.<your domain name here>
The first will secure your access to the OKD console, and the second will work for ANY application you host within your cluster via the built-in OpenShift router. Lego supplies many options for validating your ownership of the domain name. The instructions below are going to use the “manual” DNS way to validate domain ownership as they can apply to any DNS hosting provider. You are welcome to try other auth providers but I will not be documenting them here.
Since we want to keep all these files together, from within your working directory create a new directory called “certs” and change into this directory:
$ mkdir certs && cd certs
The following command will start the registration process using the manual validation method (be sure to update your email address and domain name as needed):
$ lego --email="[email protected]" --domains="console.okd.<your domain name here>" --domains="*.apps.okd.<your domain name here>" --dns="manual" --accept-tos run
After running the command you will be given two TXT records to create in your DNS provider. How you do this is dependent on your DNS provider. Follow the prompts, and when complete you will have a set of files in the current directory under “.lego/certificates”.
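For reference, the challenge records look roughly like this in zone-file form (the names follow the DNS-01 convention, but the token values below are made up; use the exact names and values that lego prints):

```
; hypothetical zone-file entries for the Let's Encrypt DNS-01 challenge
_acme-challenge.console.okd.example.com.  300  IN  TXT  "gfj9Xq...Rg85nM"
_acme-challenge.apps.okd.example.com.     300  IN  TXT  "ujmmov...f3Kb29"
```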
NOTE: If you have trouble getting Let’s Encrypt working, it’s not a big deal. This is not a required step; it just makes your cluster work a little more like an enterprise deployment. If you run into problems, you can still build your multi-host cluster by following the remaining steps below.
Updating the Inventory File
Now that we have created an SSL certificate, we can move on to setting up the inventory file. Open the file called “inventory-multi” in your working directory. Review the file, and use a search/replace command to replace all instances of <your domain name here> with your base domain name. If you were able to successfully generate an SSL certificate, be sure to un-comment the following three lines as well:
# openshift_master_overwrite_named_certificates= ...
# openshift_master_named_certificates= ...
# openshift_hosted_router_certificate= ...
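If you prefer to do the replacement from the command line, a one-liner like this works. This is a sketch using example.com as a stand-in for your real domain; the demo file below only exists to make the command runnable as-is, so substitute inventory-multi when you run it for real:

```shell
# Demo: the same search/replace you would run against inventory-multi,
# shown on a small sample file (example.com is a placeholder domain).
printf 'openshift_master_default_subdomain=apps.okd.<your domain name here>\n' > inventory-demo
sed 's/<your domain name here>/example.com/g' inventory-demo > inventory-demo.new
cat inventory-demo.new
```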
Now, take a look at the last four lines in the Ansible inventory file. This is where the magic happens. Previously we had just one line (and thus one host) that was configured as an “all-in-one” deployment. You will see that now we are breaking out the roles required to run OpenShift across multiple hosts. Other than defining the new hosts, there is no additional work to be done; the Ansible install scripts will use this new information to configure your cluster. In my experience, these four hosts are the minimum required to build a stable micro OpenShift cluster.
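For reference, the node section looks roughly like this (a sketch only: the hostnames match the DNS records we create later, and openshift_node_group_name is the standard OKD 3.11 role variable, but check inventory-multi itself for the exact contents):

```
[nodes]
okdmstr.example.com openshift_node_group_name='node-config-master'
okdinf.example.com  openshift_node_group_name='node-config-infra'
okdcp1.example.com  openshift_node_group_name='node-config-compute'
okdcp2.example.com  openshift_node_group_name='node-config-compute'
```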
Leveraging the Certs in the New Build
We are going to continue to use the docker container to do the OKD deployment. We are going to mount in one additional path, the path to the certificates we previously created. If you were able to create the new certificate the command to kick off the docker container looks something like this:
$ docker run -i -t -v $HOME/.ssh/id_rsa:/root/.ssh/id_rsa -v $HOME/.ssh/id_rsa.pub:/root/.ssh/id_rsa.pub -v $HOME/.azure/credentials:/root/.azure/credentials -v $(pwd)/inventory-multi:/tmp/inventory -v $(pwd)/certs/.lego/certificates:/tmp/certificates registry.gitlab.com/xphyr/azure_okd:latest
If you chose to not get a Let’s Encrypt certificate you can use the following command instead to get the container started:
$ docker run -i -t -v $HOME/.ssh/id_rsa:/root/.ssh/id_rsa -v $HOME/.ssh/id_rsa.pub:/root/.ssh/id_rsa.pub -v $HOME/.azure/credentials:/root/.azure/credentials -v $(pwd)/inventory-multi:/tmp/inventory registry.gitlab.com/xphyr/azure_okd:latest
We will use the same command to provision our hosts in Azure, but this time you will see that it creates four hosts:
$ ansible-playbook -i /tmp/inventory setup_scripts/azure_provision_hosts.yml
If this fails, make sure that your Azure credentials file is correct. If you don’t have a credentials file, or don’t know what I am talking about, review my last post OpenShift, Azure and Ansible to get this file in place.
When the Ansible playbook completes creating your hosts in Azure, we need to go and update DNS so we can access them directly. The playbook will output four IP addresses. Within your DNS management tool create four “A” records that match the hostnames in the inventory file we created and use the IP addresses that were output by the playbook. For example:
- okdmstr.<your domain name here>
- okdinf.<your domain name here>
- okdcp1.<your domain name here>
- okdcp2.<your domain name here>
We also need to create two additional records. Create a CNAME record “console.okd.<your domain name here>” that points to the “okdmstr” server. Also, create a wildcard A record for “*.apps.okd.<your domain name here>” and point it to the same IP as your okdinf host. You will need to wait for the DNS names to propagate, and the time this takes will depend on what service you use. I suggest waiting about 30 minutes before running the next commands. You can validate that the DNS name is available by opening a command prompt and running the following command:
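Put together, the records look roughly like this in zone-file form (a sketch: the 203.0.113.x addresses are placeholders for the IPs the playbook printed, and the wildcard record shares the okdinf address):

```
; hypothetical zone entries for example.com
okdmstr       IN  A      203.0.113.10
okdinf        IN  A      203.0.113.11
okdcp1        IN  A      203.0.113.12
okdcp2        IN  A      203.0.113.13
console.okd   IN  CNAME  okdmstr.example.com.
*.apps.okd    IN  A      203.0.113.11
```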
$ ping okdmstr.<your domain name here>
If this is successful you should be able to proceed with the next steps. Run each line one at a time and ensure they complete successfully before moving onto the next command.
$ ansible-playbook -i /tmp/inventory setup_scripts/prep_hosts.yml
$ ansible-playbook -i /tmp/inventory openshift-ansible/playbooks/prerequisites.yml
$ ansible-playbook -i /tmp/inventory openshift-ansible/playbooks/deploy_cluster.yml
# Secure the OKD API/Web Interface
$ ansible-playbook -i /tmp/inventory setup_scripts/secure_okd.yml
At this point, you have fully working SSL certs in your new multi-host OKD cluster! One thing, though: you need to keep them updated. These certs are only good for 90 days, so you should plan to renew them every 60 days. The default username is “admin” and the password is located in your inventory file.
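You can check how much validity a certificate has left with openssl. A minimal sketch follows; it generates a throwaway 90-day self-signed cert purely so the command is runnable as-is, so point -in at the real console.okd certificate under certs/.lego/certificates instead:

```shell
# Generate a disposable 90-day cert purely for demonstration purposes
openssl req -x509 -newkey rsa:2048 -nodes -keyout /tmp/demo.key \
  -out /tmp/demo.crt -days 90 -subj "/CN=console.okd.example.com" 2>/dev/null

# -checkend takes seconds: 60 days = 5184000s. Exit status 0 means the
# cert is still valid that far into the future, i.e. no renewal needed yet.
if openssl x509 -checkend 5184000 -in /tmp/demo.crt >/dev/null; then
  echo "more than 60 days left"
else
  echo "renew now"
fi
```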
Renewing your Certs
Let's Encrypt certificates expire after 90 days, so you will need to renew them regularly. You will get an e-mail as your renewal time comes close. Using the lego client makes this easy. From your working directory run the following command:
$ lego --email="[email protected]" --domains="console.okd.<your domain name here>" --domains="*.apps.okd.<your domain name here>" --dns="manual" --accept-tos renew
You will need to follow a similar process to the one you used when you first created the cert. When the renewal is complete, you will need to update your OKD installation with the new certs. We can use the same docker container to do this work: start it with the certificates mounted as before, and re-run the secure_okd.yml playbook.
Wrapping up
Over the past three posts we have gone over a manual process for installing an OKD All-in-One cluster, then automated the deployment from creation to tear-down using docker and Ansible. Finally, we created a multi-host cluster with signed certificates for a more “production-like” experience. At this point, the sky is the limit. If you are looking for some direction on how to deploy your first app, take a look at the articles in the references below.
References
- https://blog.openshift.com/lets-encrypt-acme-v2-api/
- https://docs.okd.io/3.11/install_config/certificate_customization.html#configuring-custom-certificates-wildcard