A new feature added to TKGS as of 7.0 Update 2 is support for adding private SSL certificates to the “trust” on TKG cluster nodes.
This is very important as it finally provides a supported mechanism to use on-premises Harbor and other image registries.
It’s done by adding the encoded CAs to the “TkgServiceConfiguration”. The template for the TkgServiceConfiguration looks like this:
- name: first-cert-name
  data: base64-encoded string of a PEM encoded public cert 1
- name: second-cert-name
  data: base64-encoded string of a PEM encoded public cert 2
Notice that there are two new sections under spec; one for proxy and one for trust. This article is going to focus on trust for additional CAs.
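Pieced together, the full template looks roughly like this. This is a sketch against the v1alpha1 API; the proxy values (hostnames, ports, CIDRs) are purely illustrative, and the configuration is a singleton whose name should be tkg-service-configuration:

```yaml
apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TkgServiceConfiguration
metadata:
  name: tkg-service-configuration
spec:
  defaultCNI: antrea
  proxy:
    httpProxy: http://proxy.example.com:3128    # illustrative
    httpsProxy: http://proxy.example.com:3128   # illustrative
    noProxy: [10.0.0.0/8, 192.168.0.0/16]       # illustrative
  trust:
    additionalTrustedCAs:
      - name: first-cert-name
        data: base64-encoded string of a PEM encoded public cert 1
      - name: second-cert-name
        data: base64-encoded string of a PEM encoded public cert 2
```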
If your registry uses a self-signed cert, you’ll just encode that cert itself. If you take advantage of an Enterprise CA or similar to sign your certs, you’d encode and import the “signing”, “intermediate” and/or “root” CA.
Let’s add the certificate for a standalone Harbor (not the built-in Harbor instance in TKGS, whose certificate is already trusted).
Run base64 -i <ca file> to return the base64 encoded content:
Provide a simple name and copy and paste the encoded cert into the data value:
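The encode-and-paste step can be sketched end-to-end in the shell. The file name ca.crt and the cert name my-harbor-ca are placeholders for your own CA file and whatever name you choose:

```shell
# Create a stand-in CA file for illustration; normally this is your registry's CA.
cat > ca.crt <<'EOF'
-----BEGIN CERTIFICATE-----
bm90IGEgcmVhbCBjZXJ0aWZpY2F0ZSwganVzdCBhIHBsYWNlaG9sZGVy
-----END CERTIFICATE-----
EOF

# base64-encode the whole PEM file on a single line
# (GNU coreutils: -w 0 disables wrapping; on macOS, `base64 -i ca.crt` already emits one line).
ENCODED=$(base64 -w 0 < ca.crt 2>/dev/null || base64 < ca.crt | tr -d '\n')

# Emit the entry to paste under the trust section of the TkgServiceConfiguration:
printf -- '- name: my-harbor-ca\n  data: %s\n' "$ENCODED"
```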
Apply the TkgServiceConfiguration
After setting up your file, apply it to the Supervisor cluster:
kubectl apply -f ./TanzuServiceConfiguration.yaml
Existing TKG clusters will not automatically inherit the trust for the certificates
Clusters created after the TKGServiceConfiguration is applied will get the certificates
You can scale an existing TKG cluster to trigger a rollout with the certificates
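For the scale-out option, the knob to turn is the worker count in the TanzuKubernetesCluster spec (reachable via kubectl edit tanzukubernetescluster <name> -n <namespace>). A minimal sketch against the v1alpha1 API, with the count value purely illustrative:

```yaml
# Bumping spec.topology.workers.count (e.g. 3 -> 4) rolls out a new worker node
# built with the additional trusted CAs; scale back down afterwards if desired.
spec:
  topology:
    workers:
      count: 4
```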
You can verify the certificates exist by connecting through SSH to the nodes and locating the certs under /etc/ssl/certs:
In TKGS on vSphere 7.0 through (at least) 7.0.1d, a Harbor Image Registry may be enabled for the vSphere Cluster (under Configure|Namespaces|Image Registry). This feature currently (as of 7.0.1d) requires the Pod Service, which in turn requires NSX-T integration.
As of 7.0.1d, the self-signed certificate created for this instance of Harbor is added to the trust for nodes in TKG clusters, making it easier (possible?) to use images from Harbor.
When you log in to Harbor as a user, you’ll notice that the menu is very sparse. Only the ‘admin’ account can access the “Administration” menu.
To get logged in as the ‘admin’ account, we’ll need to retrieve the password from a secret for the harbor controller in the Supervisor cluster.
SSH into the vCenter Server as root, then type ‘shell’ to get to a bash shell.
Type ‘/usr/lib/vmware-wcp/decryptK8Pwd.py‘ to return information about the Supervisor Cluster. The results include the IP for the cluster as well as the node root password
While still in the SSH session on the vCenter Server, ssh into the Supervisor Cluster node by entering ‘ssh root@<IP address from above>’. For the password, enter the PWD value from above.
Now, we have a session as root on a supervisor cluster control plane node.
Enter ‘kubectl get ns‘ to see a list of namespaces in the supervisor cluster. You’ll see a number of hidden, system namespaces in addition to those corresponding to the vSphere namespaces. Notice there is a namespace named “vmware-system-registry” in addition to one named “vmware-system-registry-#######”. The namespace with the number is where Harbor is installed.
Run ‘kubectl get secret -n vmware-system-registry-######‘ to get a list of secrets in the namespace. Locate the secret named “harbor-######-controller-registry”.
Run this to return the decoded admin password: kubectl get secret -n vmware-system-registry-###### harbor-######-controller-registry -o jsonpath='{.data.harborAdminPassword}' | base64 -d | base64 -d
In the cases I’ve seen so far, the password is about 16 characters long; if it’s longer than that, you may not have decoded it entirely. Note that the value must be decoded twice.
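The double decoding trips people up, so here is the pattern in isolation, with a made-up password standing in for the secret’s value:

```shell
# Simulate a value that has been base64-encoded twice, as described above.
DOUBLE_ENCODED=$(printf '%s' 'Harbor12345' | base64 | base64)

# One decode still returns base64, not the password:
ONCE=$(printf '%s' "$DOUBLE_ENCODED" | base64 -d)

# The second decode returns the plaintext:
TWICE=$(printf '%s' "$ONCE" | base64 -d)
echo "$TWICE"   # Harbor12345
```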
Once you’ve saved the password, enter “exit” three times to get out of the ssh sessions.
Don’t manipulate the authentication settings
The process above is not supported; VMware GSS will not help you complete these steps
Some features may remain disabled (vulnerability scanning for example)
As admin, you may configure registries and replication (although it’s probably unsupported with this built-in version of Harbor for now)
So, let’s say you want to deploy an instance of Harbor to your “services” Kubernetes cluster, and the cluster is protected by a scheduled Velero backup. Velero picks up all resources in all namespaces by default, but we need to add an annotation to indicate a persistent volume that should be included in the backup. Without this annotation, Velero will not include the PV in the backup.
First, let’s create a namespace we want to install Harbor to:
kubectl create ns harbor
Then, we’ll make sure helm has the chart for Harbor:
helm repo add harbor https://helm.goharbor.io
helm repo update
Finally, we’ll install Harbor:
helm install harbor harbor/harbor --namespace harbor \
Notice a few of the configurations we’re passing here:
expose.tls.commonName is the value that will be used by the generated TLS certificate
externalURL is the FQDN that we’ll use to reach Harbor (post deploy, you’ll get the loadBalancer IP and add the DNS record for it)
harborAdminPassword is the password assigned by default to the admin account – clearly this should be changed immediately
The next items are for the podAnnotations; the syntax was unexpectedly different. Notice there’s a dot instead of an equals-sign between the key and the value. Also notice that the dots in the value must be escaped.
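If the --set escaping gets unwieldy, the same settings can be collected in a values file, which avoids the dot-escaping entirely. Everything below (hostnames, password, the annotation’s placement and volume name) is illustrative and should be checked against your chart version:

```yaml
# values.yaml for the harbor/harbor chart (a sketch, not a complete configuration)
expose:
  type: loadBalancer
  tls:
    commonName: harbor.example.com      # used by the generated TLS certificate
externalURL: https://harbor.example.com # FQDN clients will use to reach Harbor
harborAdminPassword: ChangeMeNow        # default admin password; change it immediately
database:
  podAnnotations:
    backup.velero.io/backup-volumes: database-data  # tell Velero to back up this PV
```

With a values file, the install becomes helm install harbor harbor/harbor --namespace harbor -f values.yaml.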
Once Harbor is deployed, you can get the loadBalancer’s IP and point your browser at it.
Now, we can wait for the Velero backup job to run or kick off a one-off backup.
I noticed that Harbor did not start properly after restore. This was because postgres in the database pod expects a specific set of permissions – which were apparently different as a result of the restore. The log on the database pod read only: FATAL: data directory “/var/lib/postgresql/data” has group or world access
To return Harbor to functionality post-restore, I had to take the following steps:
Edit the database statefulSet: kubectl edit StatefulSet harbor-harbor-database -n harbor
Replace the command in the “change-permission-of-directory” initContainer from chown -R 999:999 /var/lib/postgresql/data to chmod -R 0700 /var/lib/postgresql/data
Save changes and bounce the database pod by running kubectl delete po -n harbor harbor-harbor-database-0
Bounce the remaining pods that are in CrashLoopBackOff (because they’re trying to connect to the database)
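After that edit, the relevant initContainer in the statefulSet looks roughly like this (a sketch; the image tag, volume name, and surrounding fields vary by chart version):

```yaml
initContainers:
  - name: change-permission-of-directory
    image: goharbor/harbor-db:v2.0.0    # illustrative tag
    # chmod (replacing the original chown) gives postgres the 0700 permissions it insists on
    command: ["/bin/sh", "-c", "chmod -R 0700 /var/lib/postgresql/data"]
    volumeMounts:
      - name: database-data             # illustrative volume name
        mountPath: /var/lib/postgresql/data
```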
Thanks to my friend and colleague Hemanth AVS for help with the podAnnotations syntax!
Login to Harbor Web GUI as an administrator. Navigate to Administration/Registries
Add Endpoint for local Harbor by clicking ‘New Endpoint’ and entering the following:
Name: local (or FQDN or whatever)
Endpoint URL: the actual URL for your harbor instance beginning with https and ending with :443
Access ID: username for an admin or user that at least has Project Admin permission to the target Projects/namespaces
Access Secret: Password for the account above
Verify Remote Cert: typically checked
Add Endpoint for Docker Hub by clicking ‘New Endpoint’ and entering the following:
Name: dockerhub (or something equally profound)
Endpoint URL: pre-populated
Access ID: username for your account at dockerhub
Access Secret: Password for the account above
Verify Remote Cert: typically checked
Notice that this is for general dockerhub, not targeting a particular repo.
Configure Replications for the Yelb Images
You may create replications for several images at once using a variety of filters, but I’m going to create a replication rule for each image we need. I think this makes it easier to identify a problem, removes the risk of replicating too much, and makes administration easier. Click ‘New Replication Rule’ and enter the following to create our first rule:
Replication Mode: Pull-based (because we’re pulling the image from DockerHub)
Source registry: dockerhub
Source Registry Filter – Name: mreferre/yelb-db
Source Registry Filter – Tag: 0.5
Source Registry Filter – Resource: pre-populated
Destination Namespace: yelb (or whatever Project you want the images saved to)
Trigger Mode: Select ‘Manual’ for a one-time sync or select ‘Scheduled’ if you want to ensure the image is replicated periodically. Note that the schedule format is cron with seconds, so 0 0 23 * * 5 would trigger the replication to run every Friday at 23:00:00. Scheduled replication makes sense when the tag filter is ‘latest’ for example
Override: leave checked to overwrite the image if it already exists
Enable rule: leave checked to keep the rule enabled
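For reference, the six-field schedule format mentioned under Trigger Mode breaks down like this:

```
# fields:  seconds minutes hours day-of-month month day-of-week
# example: 0       0       23    *            *     5
# => fires at 23:00:00 every Friday
```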
Add the remaining Replication Rules:
Note that redis is an official image, so we have to include the library/ prefix in the name filter (e.g., library/redis)