Adding trusted certs to nodes on TKGS 7.0 U2

A new feature added to TKGS as of 7.0 Update 2 is support for adding private SSL certificates to the “trust” on TKG cluster nodes.

This is very important as it finally provides a supported mechanism to use on-premises Harbor and other image registries.

It’s done by adding the encoded CAs to the “TkgServiceConfiguration”. The template for the TkgServiceConfiguration looks like this:

apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TkgServiceConfiguration
metadata:
  name: tkg-service-configuration
spec:
  defaultCNI: antrea
  proxy:
    httpProxy: http://<user>:<pwd>@<ip>:<port>

  trust:
    additionalTrustedCAs:
      - name: first-cert-name
        data: base64-encoded string of a PEM encoded public cert 1
      - name: second-cert-name
        data: base64-encoded string of a PEM encoded public cert 2

Notice that there are two new sections under spec; one for proxy and one for trust. This article is going to focus on trust for additional CAs.

If your registry uses a self-signed cert, you’ll just encode that cert itself. If you take advantage on an Enterprise CA or similar to sign your certs, you’d encoded and import the “signing”, “intermediate” and/or “root” CA.

Example

Let’s add the certificate for a standalone Harbor (not the built-in Harbor instance in TKGS, its certificate is already trusted)

Download the certificate by clicking the “Registry Certificate” link

Run base64 -i <ca file> to return the base64 encoded content:

Provide a simple name and copy and paste the encoded cert into the data value:

Apply the TkgServiceConfiguration

After setting up your file. Apply it to the Supervisor cluster:

kubectl apply -f ./TanzuServiceConfiguration.yaml

Notes

  • Existing TKG clusters will not automatically inherit the trust for the certificates
  • Clusters created after the TKGServiceConfiguration is applied will get the certificates
  • You can scale an existing TKG cluster to trigger a rollout with the certificates
  • You can verify the certificates exist by connecting through SSH to the nodes and locating the certs under /etc/ssl/certs:

Retrieving the Admin Password for Harbor Image Registry in Tanzu Kubernetes Grid Service

In TKGS on vSphere 7.0 through (at least) 7.0.1d, a Harbor Image Registry may be enabled for the vSphere Cluster (Under Configure|Namespaces| Image Registry). This feature currently (as of 7.0.1d) requires the Pod Service, which in turn requires NSX-T integration.

As of 7.0.1d, the self-signed certificate created for this instance of Harbor is added to the trust for nodes in TKG clusters, making it easier (possible?) to use images from Harbor.

When you login to harbor as a user, you’ll notice that the menu is very sparse. Only the ‘admin’ account can access the “Administration” menu.

To get logged in as the ‘admin’ account, we’ll need to retrieve the password from a secret for the harbor controller in the Supervisor cluster.

Steps:

  • SSH into the vCenter Server as root, type ‘shell’ to get to bash shell
  • Type ‘/usr/lib/vmware-wcp/decryptK8Pwd.py‘ to return information about the Supervisor Cluster. The results include the IP for the cluster as well as the node root password
  • While still in the SSH session on the vCenter Server, ssh into the Supervisor Custer node by entering ‘ssh root@<IP address from above>’. For the password, enter the PWD value from above.
  • Now, we have a session as root on a supervisor cluster control plane node.
  • Enter ‘kubectl get ns‘ to see a list of namespaces in the supervisor cluster. You’ll see a number of hidden, system namespaces in addition to those corresponding to the vSphere namespaces. Notice there is a namespace named “vmware-system-registry” in addition to one named “vmware-system-registry-#######”. The namespace with the number is where Harbor is installed.
  • Run ‘kubectl get secret -n vmware-system-registry-######‘ to get a list of secrets in the namespace. Locate the secret named “harbor-######-controller-registry”.
  • Run this to return the decoded admin password: kubectl get secret -n vmware-system-registry-###### harbor-######-controller.data.harborAdminPassword}' | base64 -d | base64 -d
  • In the cases I seen so far, the password is about 16 characters long, if it’s longer than that, you may not have decoded it entirely. Note that the value must be decoded twice.
  • Once you’ve saved the password, enter “exit” three times to get out of the ssh sessions.

Notes

  • Don’t manipulate the authentication settings
  • The process above is not supported; VMware GSS will not help you complete these steps
  • Some features may remain disabled (vulnerability scanning for example)
  • As admin, you may configure registries and replication (although it’s probably unsupported with this built-in version of Harbor for now)

Use Helm to deploy Harbor with Annotations for Velero

So, lets say you want to deploy an instance of Harbor to your “services” kubernetes cluster.  The cluster is protected by a scheduled Velero backup Velero pickup all resources in all namespaces by default, but we need to add an annotation to indicate a persistent volume that should be included in the backup.  Without this annotation, Velero will not include the PV in the backup.

First, let’s create a namespace we want to install Harbor to:
kubectl create ns harbor
Then, we’ll make sure helm has the chart for Harbor
helm repo add harbor https://helm.goharbor.io
helm repo update

Finally, we’ll install harbor
helm install harbor harbor/harbor --namespace harbor \
--set expose.type=loadBalancer,expose.tls.enabled=true,expose.tls.commonName=harbor.ragazzilab.com,\
externalURL=harbor.ragazzilab.com,harborAdminPassword=harbor,\
redis.podAnnotations."backup\.velero\.io/backup-volumes"=data,\
registry.podAnnotations."backup\.velero\.io/backup-volumes"=registry-data,\
trivy.podAnnotations."backup\.velero\.io/backup-volumes"=data,\
database.podAnnotations."backup\.velero\.io/backup-volumes"=database-data,\
chartmuseum.podAnnotations."backup\.velero\.io/backup-volumes"=chartmuseum-data,\
jobservice.podAnnotations."backup\.velero\.io/backup-volumes"=job-logs

Notice a few of the configurations we’re passing here:

  • expose.tls.commonName is the value that will be used by the gnerated TLS certificate
  • externalURL is the FQDN that we’ll use to reach Harbor (post deploy, you’ll get the loadBalancer IP and add the DNS record for it)
  • harborAdminPassword is the password assigned by default to the admin account – clearly this should be changed immediately
  • The next items are for the podAnnotations; the syntax was unexpectedly different.  Notice there’s a dot instead of an equals-sign between the key and the value.  Also notice that the dots in the value must be escaped.

Once Harbor is deployed, you can get the loadBalancer’s IP and point your browser at it.

Now, we can wait for the Velero backup job to run or kick off a one-off backup.

Not So Fast...

I noticed that Harbor did not start properly after restore.  This was because postgres in the database pod expects a specific set of permissions – which were apparently different as a result of the restore.  The log on the database pod only read FATAL: data directory “/var/lib/postgresql/data” has group or world access

To return Harbor to functionality post-restore, I had to take the following steps:

  1. Edit the database statefulSet: kubectl edit StatefulSet harbor-harbor-database -n harbor
  2. Replace the command in the “change-permission-of-directory” initContainer from chown -R 999:999 /var/lib/postgresql/data to chmod -R 0700 /var/lib/postgresql/data
  3. Save changes and bounce the database pod by running kubectl delete po -n harbor harbor-harbor-database-0
  4. Bounce the remaining pods that are in CrashLoopBackup (because they’re trying to connect to the database)

 

harborThanks to my friend and colleague Hemanth AVS for help with the podAnnotations syntax!

 

 

Replicating images from DockerHub to Harbor

Harbor LogoI found the documentation for actually replicating images from DockerHub to a local Harbor instance to be missing.  So here’s what I’ve found:

Objective: Replicate the images for the Yelb sample application to local Harbor repo

Set-up and Prereqs

  1. A local Harbor instance – I’ll be using an Enterprise PKS foundation with Harbor 1.10
  2. An account for DockerHub

Steps

      1. Login to Harbor Web GUI as an administrator. Navigate to Administration/Registries
      2. Add Endpoint for local Harbor by clicking ‘New Endpoint’ and entering the following:
        • Provider: harbor
        • Name: local (or FQDN or whatever)
        • Description: optional
        • Endpoint URL: the actual URL for your harbor instance beginning with https and ending with :443
        • Access ID: username for an admin or user that at least has Project Admin permission to the target Projects/namespaces
        • Access Secret: Password for the account above
        • Verify Remote Cert: typically checked
      3. Add Endpoint for Docker Hub by clicking ‘New Endpoint’ and entering the following:
        • Provider: docker-hub
        • Name: dockerhub (or something equally profound)
        • Description: optional
        • Endpoint URL: pre-populated/li>
        • Access ID: username for your account at dockerhub
        • Access Secret: Password for the account above
        • Verify Remote Cert: typically checked

        Notice that this is for general dockerhub, not targeting a particular repo.

      4. Configure Replications for the Yelb Images
        You may create replications for several images at once using a variety of filters, but I’m going to create a replication rule for each image we need. I think this makes it easier to identify a problem, removes the risk of replicating too much and makes administration easier. Click ‘New Replication Rule‘ enter the following to create our first rule:

        • Name: yelb-db-0.5
        • Description: optional
        • Replication Mode: Pull-based (because we’re pulling the image from DockerHub)
        • Source registry: dockerhub
        • Source Registry Filter – Name: mreferre/yelb-db
        • Source Registry Filter – Tag: 0.5
        • Source Registry Filter – Resource: pre-populated
        • Destination Namespace: yelb (or whatever Project you want the images saved to)
        • Trigger Mode: Select ‘Manual’ for a one-time sync or select ‘Scheduled’ if you want to ensure the image is replicated periodically. Note that the schedule format is cron with seconds, so 0 0 23 * * 5 would trigger the replication to run every Friday at 23:00:00. Scheduled replication makes sense when the tag filter is ‘latest’ for example
        • Override: leave checked to overwrite the image if it already exists
        • Enable rule: leave checked to keep the rule enabled
      5. Add the remaining Replication Rules:
        Name Name Filter Tag Filter Dest Namespace
        yelb-ui-latest mreferre/yelb-ui latest yelb
        yelb-appserver-latest mreferre/yelb-appserver latest yelb
        redis-4.0.2 library/redis 4.0.2 yelb

        Note that redis is an official image, so we have to include library/