Configuring Backup in Tanzu SQL with MySQL for Kubernetes

Backup & Restore

Prerequisite: a reachable S3 endpoint. It can be local or remote, but the pods must be able to resolve its name or IP. Create or select an existing bucket for your database backups. In this case, I have a minio instance running on-prem with a bucket named backup-mysql.

Create a secret for the S3 endpoint credentials. This account will need to be able to write to the database backup bucket. Here’s an example:

---
apiVersion: v1
kind: Secret
metadata:
  name: minio-creds
stringData:
  # S3 Credentials
  accessKeyId: "MYACCESSKEY"
  secretAccessKey: "MYSECRETKEY"

Create a MySQLBackupLocation. In the example below, we’re not using SSL with the minio endpoint, so I’m explicitly using port 80. More examples and details are in the product documentation. I like to keep the backups organized, so I’ll create a backup location for each instance and specify a bucketPath for each.

---
apiVersion: with.sql.tanzu.vmware.com/v1
kind: MySQLBackupLocation
metadata:
  name: backuplocation-mysql-ha
spec:
  storage:
    # For S3 or Minio:
    s3:
      bucket: "backup-mysql-ha"
      bucketPath: "/mysql-ha/"
      # region: "us-east-1"
      endpoint:  "http://minio.ragazzilab.com:80" # optional, default to AWS
      forcePathStyle: true
      secret:
        name: minio-creds

Test with a one-off backup. Create and apply a yaml like the following to request a backup without a schedule; this example backs up the mysql-ha instance to its corresponding backup location:

---
apiVersion: with.sql.tanzu.vmware.com/v1
kind: MySQLBackup
metadata:
  name: backup-mysql-ha-1off
spec:
  location:
    name: backuplocation-mysql-ha
  instance:
    name: mysql-ha

We can list the MySQLBackup objects to confirm that the backup completed successfully.
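A minimal sketch of that check from the command line, assuming the backup was applied to the mysql-instances namespace used elsewhere in this post. The function names are hypothetical, chosen for illustration; the lowercase kind name mysqlbackup is resolved by kubectl through the installed CRD.

```shell
# List the MySQLBackup objects in a namespace (defaults to mysql-instances).
list_backups() {
  kubectl get mysqlbackup -n "${1:-mysql-instances}"
}

# Show the status and events of a single backup.
describe_backup() {
  kubectl describe mysqlbackup "$1" -n "${2:-mysql-instances}"
}
```

For example, describe_backup backup-mysql-ha-1off shows the details of the one-off backup requested above.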

Create a Backup Schedule

Now that we’ve confirmed that the backup location and credentials work as expected, we should add a backup schedule. Here’s an example:

---
apiVersion: with.sql.tanzu.vmware.com/v1
kind: MySQLBackupSchedule
metadata:
  name: mysql-ha-daily
spec:
  backupTemplate:
    spec:
      location:
        name: backuplocation-mysql-ha
      instance:
        name:  mysql-ha
  schedule: "@daily"

Apply it with kubectl apply -n mysql-instances -f backupschedule-mysql-ha-daily.yaml

I found that (unlike Velero) applying the MySQLBackupSchedule does not immediately start a backup. At the scheduled time, however, a pod for the backup schedule is created to run the backup job. This pod remains in place to run subsequent backup jobs.

Backup Pods and created Backup objects

Lastly, regarding backups, keep in mind that the backup data on the S3 endpoint never expires; the backups will remain there until removed manually. This may be important if you have limited capacity.
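Since nothing expires on its own, it helps to know which backups are old enough to remove. Here is a minimal sketch, assuming scheduled backups are named <schedule>-YYYYMMDD-HHMMSS, like the mysql-ha-daily-20210708-000005 object used in the restore example in this post. The function name backups_older_than is hypothetical.

```shell
# Read backup names on stdin and print those whose embedded date (YYYYMMDD)
# is earlier than the cutoff given as the first argument.
# Names that do not end in -YYYYMMDD-HHMMSS (such as one-off backups) are skipped.
backups_older_than() {
  cutoff="$1"
  while IFS= read -r name; do
    stamp=$(printf '%s\n' "$name" | sed -n 's/.*-\([0-9]\{8\}\)-[0-9]\{6\}$/\1/p')
    if [ -n "$stamp" ] && [ "$stamp" -lt "$cutoff" ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Example use against a cluster (strips the resource prefix that -o name adds):
#   kubectl get mysqlbackup -n mysql-instances -o name | sed 's|.*/||' | backups_older_than 20210701
```

This only identifies candidates. The artifacts in the bucket still have to be removed with your S3 tooling, and I have not verified whether deleting a MySQLBackup object touches the bucket at all.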

Restore/Recover

From the docs:

MySQLRestores always restores to a new MySQL instance to avoid overwriting any data on an existing MySQL instance. The MySQL instance is created automatically when the restore is triggered. Tanzu MySQL for Kubernetes does not allow you to restore a backup to an existing MySQL instance. Although you can perform this manually by copying the MySQL data from the backup artifact onto an existing MySQL instance, VMware strongly discourages you from doing this because you might overwrite existing data on the MySQL instance.

So, we should not expect to restore directly to a running database instance. If we need to recover, we’ll create a new instance and restore the backup to it.

To create a restore, we’ll need the name of the MySQLBackup object to restore from and a name for the new database instance that the restore will create. We’ll put both into a yaml like the one below. Notice that we provide a spec for the new database. I wanted a loadbalancer for it, although we could instead repoint the existing loadbalancer to the new proxy nodes (for HA) or the new database node (for standalone).

---
apiVersion: with.sql.tanzu.vmware.com/v1
kind: MySQLRestore
metadata:
  name: restore-ha
spec:
  backup:
    name: mysql-ha-daily-20210708-000005
  instanceTemplate:
    metadata:
      name: restored-mysql-database
    spec:
      storageSize: 2Gi
      imagePullSecret: harbor
      serviceType: LoadBalancer
      highAvailability:
        enabled: true

Apply the yaml to create the restore: kubectl apply -n mysql-instances -f ./restore-ha.yaml

You should see a new database pending and a MySQLRestore object running:

Job is running and instance is pending
Restore job succeeded and there is a new mysql instance
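
The two states above can be observed from the command line. A minimal sketch, assuming the mysql-instances namespace from the apply command; the function names are hypothetical.

```shell
# Show the state of a MySQLRestore object (namespace defaults to mysql-instances).
restore_status() {
  kubectl get mysqlrestore "$1" -n "${2:-mysql-instances}"
}

# List the pods in the namespace; the pods of the restored instance
# show up here as they are created.
instance_pods() {
  kubectl get pods -n "${1:-mysql-instances}"
}
```

For the restore defined above, that would be restore_status restore-ha followed by instance_pods.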

Now, the choice is yours: copy data from the restored database back to the original, point the applications to the new database, or point the loadbalancer at the new database.

If you choose to repoint the existing load-balancer to the new database, here’s an example of how to do that:

kubectl patch service -n mysql-instances mysql-ha -p '{"spec":{"selector":{"app.kubernetes.io/instance": "restored-mysql-database"}}}'
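
After patching, it is worth confirming that the Service really targets the restored instance. A minimal sketch; the function name service_selector is hypothetical, and the output format of the selector depends on your kubectl version.

```shell
# Print the selector of a Service so you can confirm which instance it targets
# (namespace defaults to mysql-instances).
service_selector() {
  kubectl get service "$1" -n "${2:-mysql-instances}" -o jsonpath='{.spec.selector}'
}
```

Running service_selector mysql-ha should show app.kubernetes.io/instance set to restored-mysql-database if the patch was applied.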

Use Helm to deploy Harbor with Annotations for Velero

So, let’s say you want to deploy an instance of Harbor to your “services” kubernetes cluster.  The cluster is protected by a scheduled Velero backup.  Velero picks up all resources in all namespaces by default, but we need to add an annotation to indicate each persistent volume that should be included in the backup.  Without this annotation, Velero will not include the PV in the backup.

First, let’s create a namespace we want to install Harbor to:
kubectl create ns harbor
Then, we’ll make sure helm has the chart for Harbor:
helm repo add harbor https://helm.goharbor.io
helm repo update

Finally, we’ll install Harbor:
helm install harbor harbor/harbor --namespace harbor \
--set expose.type=loadBalancer,expose.tls.enabled=true,expose.tls.commonName=harbor.ragazzilab.com,\
externalURL=https://harbor.ragazzilab.com,harborAdminPassword=harbor,\
redis.podAnnotations."backup\.velero\.io/backup-volumes"=data,\
registry.podAnnotations."backup\.velero\.io/backup-volumes"=registry-data,\
trivy.podAnnotations."backup\.velero\.io/backup-volumes"=data,\
database.podAnnotations."backup\.velero\.io/backup-volumes"=database-data,\
chartmuseum.podAnnotations."backup\.velero\.io/backup-volumes"=chartmuseum-data,\
jobservice.podAnnotations."backup\.velero\.io/backup-volumes"=job-logs

Notice a few of the configurations we’re passing here:

  • expose.tls.commonName is the value that will be used by the generated TLS certificate
  • externalURL is the external address, based on the FQDN, that we’ll use to reach Harbor (post deploy, you’ll get the loadBalancer IP and add the DNS record for it)
  • harborAdminPassword is the password assigned by default to the admin account; clearly this should be changed immediately
  • The next items are for the podAnnotations; the syntax was unexpectedly different.  Notice that a dot, rather than an equals-sign, separates podAnnotations from the annotation key.  Also notice that the dots inside the annotation key must be escaped.
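
The escaping rule in the last bullet can be sketched as a small shell helper. This is an illustration only: the function names escape_set_key and pod_annotation_arg are hypothetical, and the output is meant to be passed to helm as one quoted --set argument.

```shell
# Escape every dot in an annotation key so that helm --set
# does not read the dots as separators between nested keys.
escape_set_key() {
  printf '%s\n' "$1" | sed 's/\./\\./g'
}

# Build one --set argument: <component>.podAnnotations.<escaped key>=<volume name>
pod_annotation_arg() {
  printf '%s.podAnnotations.%s=%s\n' "$1" "$(escape_set_key "$2")" "$3"
}

pod_annotation_arg redis backup.velero.io/backup-volumes data
# prints: redis.podAnnotations.backup\.velero\.io/backup-volumes=data
```

The result can then be used as --set "$(pod_annotation_arg redis backup.velero.io/backup-volumes data)", which is equivalent to the redis line in the install command above.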

Once Harbor is deployed, you can get the loadBalancer’s IP and point your browser at it.

Now, we can wait for the Velero backup job to run or kick off a one-off backup.
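
To kick off the one-off backup, something like the following should do. A minimal sketch, assuming the velero CLI is installed and pointed at the cluster; the function name and the backup name harbor-oneoff are hypothetical.

```shell
# Start a one-off Velero backup of a single namespace and wait for it to finish.
#   $1 = backup name (any name that is unique in your Velero install)
#   $2 = namespace to back up
backup_namespace_now() {
  velero backup create "$1" --include-namespaces "$2" --wait
}

# Example:
#   backup_namespace_now harbor-oneoff harbor
#   velero backup describe harbor-oneoff --details
```

The describe output with --details is where you can check that the annotated volumes were actually included in the backup.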

Not So Fast...

I noticed that Harbor did not start properly after restore.  This was because postgres in the database pod expects a specific set of permissions – which were apparently different as a result of the restore.  The log on the database pod only read FATAL: data directory “/var/lib/postgresql/data” has group or world access

To return Harbor to functionality post-restore, I had to take the following steps:

  1. Edit the database statefulSet: kubectl edit StatefulSet harbor-harbor-database -n harbor
  2. Replace the command in the “change-permission-of-directory” initContainer from chown -R 999:999 /var/lib/postgresql/data to chmod -R 0700 /var/lib/postgresql/data
  3. Save changes and bounce the database pod by running kubectl delete po -n harbor harbor-harbor-database-0
  4. Bounce the remaining pods that are in CrashLoopBackOff (because they’re trying to connect to the database)
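
Step 4 can be scripted rather than done pod by pod. A minimal sketch, assuming the default column layout of kubectl get pods (NAME, READY, STATUS, ...); the function name crashlooping_pods is hypothetical.

```shell
# Read the output of `kubectl get pods --no-headers` on stdin and
# print the names of the pods whose STATUS column is CrashLoopBackOff.
crashlooping_pods() {
  awk '$3 == "CrashLoopBackOff" { print $1 }'
}

# Example use against the harbor namespace:
#   kubectl get pods -n harbor --no-headers | crashlooping_pods | xargs kubectl delete pod -n harbor
```

If no pods are crash-looping, the xargs call runs kubectl delete pod with no names and simply reports an error; with GNU xargs, adding -r avoids that.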

 

Thanks to my friend and colleague Hemanth AVS for help with the podAnnotations syntax!