Use Helm to deploy Harbor with Annotations for Velero

So, let’s say you want to deploy an instance of Harbor to your “services” Kubernetes cluster.  The cluster is protected by a scheduled Velero backup.  Velero picks up all resources in all namespaces by default, but we need to add an annotation to indicate a persistent volume that should be included in the backup.  Without this annotation, Velero will not include the PV in the backup.
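
For illustration, this is what that annotation looks like when applied by hand to a running pod (the pod and volume names here are placeholders; the Helm chart settings below apply the same annotation for us):

kubectl -n harbor annotate pod <pod-name> backup.velero.io/backup-volumes=<volume-name>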

First, let’s create a namespace we want to install Harbor to:
kubectl create ns harbor
Then, we’ll make sure Helm has the chart repository for Harbor:
helm repo add harbor https://helm.goharbor.io
helm repo update

Finally, we’ll install Harbor:
helm install harbor harbor/harbor --namespace harbor \
--set expose.type=loadBalancer,expose.tls.enabled=true,expose.tls.commonName=harbor.ragazzilab.com,\
externalURL=harbor.ragazzilab.com,harborAdminPassword=harbor,\
redis.podAnnotations."backup\.velero\.io/backup-volumes"=data,\
registry.podAnnotations."backup\.velero\.io/backup-volumes"=registry-data,\
trivy.podAnnotations."backup\.velero\.io/backup-volumes"=data,\
database.podAnnotations."backup\.velero\.io/backup-volumes"=database-data,\
chartmuseum.podAnnotations."backup\.velero\.io/backup-volumes"=chartmuseum-data,\
jobservice.podAnnotations."backup\.velero\.io/backup-volumes"=job-logs

Notice a few of the configurations we’re passing here:

  • expose.tls.commonName is the common name that will be used on the generated TLS certificate
  • externalURL is the FQDN that we’ll use to reach Harbor (post deploy, you’ll get the loadBalancer IP and add the DNS record for it)
  • harborAdminPassword is the password assigned by default to the admin account – clearly this should be changed immediately
  • The next items are for the podAnnotations; the syntax was unexpectedly different.  Notice that the annotation key is joined to podAnnotations with a dot rather than another equals sign, and that the dots within the annotation key itself must be escaped with backslashes.

Once Harbor is deployed, you can get the loadBalancer’s IP and point your browser at it.
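
A quick way to watch the rollout and find that IP is with kubectl (the service name may vary slightly by chart version):

kubectl -n harbor get pods
kubectl -n harbor get svc harbor -o jsonpath='{.status.loadBalancer.ingress[0].ip}'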

Now, we can wait for the Velero backup job to run or kick off a one-off backup.
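
If you use the Velero CLI for a one-off backup, something like this should do it (the backup name is arbitrary):

velero backup create harbor-adhoc --include-namespaces harbor
velero backup describe harbor-adhoc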

Not So Fast...

I noticed that Harbor did not start properly after a restore.  This was because postgres in the database pod expects a specific set of permissions on its data directory – which were apparently changed as a result of the restore.  The log on the database pod only read FATAL: data directory “/var/lib/postgresql/data” has group or world access

To return Harbor to functionality post-restore, I had to take the following steps:

  1. Edit the database statefulSet: kubectl edit StatefulSet harbor-harbor-database -n harbor
  2. Replace the command in the “change-permission-of-directory” initContainer from chown -R 999:999 /var/lib/postgresql/data to chmod -R 0700 /var/lib/postgresql/data
  3. Save changes and bounce the database pod by running kubectl delete po -n harbor harbor-harbor-database-0
  4. Bounce the remaining pods that are in CrashLoopBackOff (because they’re trying to connect to the database).  A kubectl patch equivalent of steps 1 and 2 is sketched below.
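
As an alternative to editing the StatefulSet interactively, a patch along these lines should make the same change.  This is only a sketch: it assumes the change-permission-of-directory initContainer is the first initContainer in the pod template and that its command is a simple argument list, so check the index and the existing command/args in your deployment before applying it.

kubectl -n harbor patch statefulset harbor-harbor-database --type=json \
  -p='[{"op":"replace","path":"/spec/template/spec/initContainers/0/command","value":["chmod","-R","0700","/var/lib/postgresql/data"]}]'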

 

Thanks to my friend and colleague Hemanth AVS for help with the podAnnotations syntax!

 

 

Building a Concourse CI VM on Ubuntu

Recently, I’ve found myself needing a Concourse CI system. I struggled with the documentation on concourse.ci and couldn’t find any comprehensive build guides, and I knew for certain I wasn’t going to use VirtualBox.  So, having worked it out, I thought I’d share what I went through to get to a working system.

Starting Position
I discovered that the CentOS version I was using previously did not have a compatible Linux kernel: CentOS 7.2 ships kernel 3.10, while Concourse requires 3.19 or later.  So, I’m starting with a freshly-deployed Ubuntu Server 16.04 LTS this time.

Prep Ubuntu
Not a lot we have to do, but still pretty important:

  1. Make sure the port for Concourse is open

    sudo ufw allow 8080
    sudo ufw status

    sudo ufw disable

    I disabled the firewall on Ubuntu because it was preventing the Concourse worker and Concourse web from communicating.

  2. Update and make sure wget is installed

    apt-get update
    apt-get install wget

Postgresql
Concourse expects to use a postgresql database. I don’t have one standing by, so let’s install it.

  1. Pretty straightforward on Ubuntu too:

    apt-get install postgresql postgresql-contrib

    Enter y to install the bits.  On Ubuntu, we don’t have to take extra steps to configure the service.

  2. Ok, now we have to create an account and a database for concourse. First, let’s create the Linux account. I’m calling mine “concourse” because I’m creative like that.

    adduser concourse
    passwd concourse

  3. Next, we create the account (aka “role” or “user”) in postgres via the createuser command. In order to do this, we have to switch to the postgres account; do that with sudo:

    sudo -i -u postgres

    Now, while logged in as postgres, we can use the createuser command:

    createuser --interactive

    You’ll enter the name of the account, and answer a couple of special permissions questions.

  4. While still logged in as postgres, run this command to create a new database for concourse. I’m naming my database “concourse” – my creativity is legendary. Actually, I think it makes life easier if the role and database are named the same.

    createdb concourse

  5. Test by switching users to the concourse account and making sure it can run psql against the concourse database. While in psql, use this command to set the password for the account in postgres (a non-interactive equivalent of these postgres steps is sketched after this list):

    ALTER ROLE concourse WITH PASSWORD 'changeme';

  6. Type \q to exit psql
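
If you’d rather script the postgres setup than answer the interactive prompts, something along these lines should be roughly equivalent (the names and password are the examples used above – adjust them, and add flags such as --superuser if you answered yes to createuser’s permission questions):

sudo -u postgres createuser concourse
sudo -u postgres createdb -O concourse concourse
sudo -u postgres psql -c "ALTER ROLE concourse WITH PASSWORD 'changeme';"
psql "postgres://concourse:changeme@127.0.0.1/concourse" -c "SELECT 1;"   # quick connectivity test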

Concourse
Ok, we have a running postgresql service and an account to be used for concourse. Let’s go.

  1. Create a folder for concourse. I used /concourse, but you can use /var/lib/whatever/concourse if you feel like it.
  2. Download the binary from concourse.ci/downloads.html into your /concourse folder using wget or transfer via scp.
  3. Create a symbolic link named “concourse” to the file you downloaded and make it executable

    ln -s ./concourse_linux_amd64 ./concourse
    chmod +x ./concourse_linux_amd64

  4. Create keys for concourse

    cd /concourse

    mkdir -p keys/web keys/worker

    ssh-keygen -t rsa -f ./keys/web/tsa_host_key -N ""
    ssh-keygen -t rsa -f ./keys/web/session_signing_key -N ""
    ssh-keygen -t rsa -f ./keys/worker/worker_key -N ""
    cp ./keys/worker/worker_key.pub ./keys/web/authorized_worker_keys
    cp ./keys/web/tsa_host_key.pub ./keys/worker

  5. Create start-up script for Concourse. Save this as /concourse/start.sh:

    /concourse/concourse web \
    --basic-auth-username myuser \
    --basic-auth-password mypass \
    --session-signing-key /concourse/keys/web/session_signing_key \
    --tsa-host-key /concourse/keys/web/tsa_host_key \
    --tsa-authorized-keys /concourse/keys/web/authorized_worker_keys \
    --external-url http://192.168.103.81:8080 \
    --postgres-data-source postgres://concourse:changeme@127.0.0.1/concourse?sslmode=disable &

    /concourse/concourse worker \
    --work-dir /opt/concourse/worker \
    --tsa-host 127.0.0.1 \
    --tsa-public-key /concourse/keys/worker/tsa_host_key.pub \
    --tsa-worker-private-key /concourse/keys/worker/worker_key

    Several of these values should definitely be changed for your environment: --external-url uses the IP address of the VM it’s running on, and the username and password values in the --postgres-data-source connection string should reflect what you set up earlier. Note the trailing & on the web command so that the worker can start alongside it. Save the file and be sure to set it as executable (chmod +x ./start.sh)

  6. Run the script “./start.sh”. You should see several lines go by concerning worker-collectors and builder-reapers.
    • If you instead see a message about authentication, you’ll want to make sure that 1) the credentials in the script are correct and 2) the account’s password has actually been set in Linux and in postgres
    • If you instead see a message about the connection not accepting SSL, be sure that the connection string in the script includes “?sslmode=disable” after the database name
  7. Test by pointing a browser at the value you assigned to the external_url. You should see “no pipelines configured”.  You can login using the basic-auth username and password you specified in the startup script.

    Success!
  8. Back in your SSH session, you can kill it with <CTRL>+C

Finishing Up
Now we just have to make sure that concourse starts when the system reboots. I am certain that there are better/safer/more reliable ways to do this, but here’s what I did:
Use nano or your favorite text editor to add “/concourse/start.sh” to /etc/rc.local ABOVE the line that reads “exit 0”
Now, reboot your VM and retest the connectivity to the concourse page.
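
One of those better ways would be a small systemd unit instead of rc.local; here’s a minimal sketch, assuming the script lives at /concourse/start.sh:

cat <<'EOF' | sudo tee /etc/systemd/system/concourse.service
[Unit]
Description=Concourse CI
After=network.target postgresql.service

[Service]
Type=simple
ExecStart=/concourse/start.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable concourse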

Thanks

EMC ECS Community Edition project for how to start the script on boot.

Mitchell Anicas’ very helpful post on setting up postgres on Ubuntu.

Concourse.ci for some wholly inadequate documentation

Alfredo Sánchez for bringing the issue with Concourse and CentOS to my attention

Building a Concourse CI VM on CentOS

Recently, I’ve found myself needing a Concourse CI system. I struggled with the documentation on concourse.ci and couldn’t find any comprehensive build guides, and I knew for certain I wasn’t going to use VirtualBox.  So, having worked it out, I thought I’d share what I went through to get to a working system.

WARNING

It has been brought to my attention that CentOS does not have a compatible Linux kernel, so I’ve redone this post using Ubuntu instead.

Starting Position
I’m starting with a freshly-deployed CentOS 7 VM. I use Simon’s template build, so it comes up quickly and reliably.  Logged on as root.

Prep CentOS
Not a lot we have to do, but still pretty important:

  1. Open the firewall port for Concourse

    firewall-cmd --add-port=8080/tcp --permanent
    firewall-cmd --reload

    Optionally, you can open 5432 for postgres if you feel like it.

  2. Update and make sure wget is installed

    yum update
    yum install wget

Postgresql
Concourse expects to use a postgresql database. I don’t have one standing by, so let’s install it.

  1. Pretty straightforward on CentOS:

    yum install postgresql-server postgresql-contrib

    Enter y to install the bits.

  2. When that step is done, we’ll set it up with this command:

    sudo postgresql-setup initdb

  3. Next, we’ll update the postgresql config to allow passwords. Use your favorite editor to open /var/lib/pgsql/data/pg_hba.conf. We need to update the value in the method column for IPv4 and IPv6 connections from “ident” to “md5”, then save the file.
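
    For reference, the relevant entries end up looking roughly like this (the exact addresses may differ on your system):

    # before
    host    all             all             127.0.0.1/32            ident
    host    all             all             ::1/128                 ident
    # after
    host    all             all             127.0.0.1/32            md5
    host    all             all             ::1/128                 md5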
  4. Now, let’s start postgresql and set it to run automatically

    sudo systemctl start postgresql
    sudo systemctl enable postgresql

  5. Ok, now we have to create an account and a database for concourse. First, let’s create the Linux account. I’m calling mine “concourse” because I’m creative like that.

    adduser concourse
    passwd concourse

  6. Next, we create the account (aka “role” or “user”) in postgres via the createuser command. In order to do this, we have to switch to the postgres account; do that with sudo:

    sudo -i -u postgres

    Now, while logged in as postgres, we can use the createuser command:

    createuser --interactive

    You’ll enter the name of the account, and answer a couple of special permissions questions.

  7. While still logged in as postgres, run this command to create a new database for concourse. I’m naming my database “concourse” – my creativity is legendary. Actually, I think it makes life easier if the role and database are named the same.

    createdb concourse

  8. Test by switching users to the concourse account and making sure it can run psql against the concourse database. While in psql, use this command to set the password for the account in postgres:

    ALTER ROLE concourse WITH PASSWORD 'changeme';

  9. Type \q to exit psql

Concourse
Ok, we have a running postgresql service and an account to be used for concourse. Let’s go.

  1. Create a folder for concourse. I used /concourse, but you can use /var/lib/whatever/concourse if you feel like it.
  2. Download the binary from concourse.ci/downloads.html into your /concourse folder using wget or transfer via scp.
  3. Create a symbolic link named “concourse” to the file you downloaded and make it executable

    ln -s ./concourse_linux_amd64 ./concourse
    chmod +x ./concourse_linux_amd64

  4. Create keys for concourse

    cd /concourse

    mkdir -p keys/web keys/worker

    ssh-keygen -t rsa -f ./keys/web/tsa_host_key -N ""
    ssh-keygen -t rsa -f ./keys/web/session_signing_key -N ""
    ssh-keygen -t rsa -f ./keys/worker/worker_key -N ""
    cp ./keys/worker/worker_key.pub ./keys/web/authorized_worker_keys
    cp ./keys/web/tsa_host_key.pub ./keys/worker

  5. Create start-up script for Concourse. Save this as /concourse/start.sh:

    /concourse/concourse web \
    --basic-auth-username myuser \
    --basic-auth-password mypass \
    --session-signing-key /concourse/keys/web/session_signing_key \
    --tsa-host-key /concourse/keys/web/tsa_host_key \
    --tsa-authorized-keys /concourse/keys/web/authorized_worker_keys \
    --external-url http://192.168.103.81:8080 \
    --postgres-data-source postgres://concourse:changeme@127.0.0.1/concourse?sslmode=disable &

    /concourse/concourse worker \
    --work-dir /opt/concourse/worker \
    --tsa-host 127.0.0.1 \
    --tsa-public-key /concourse/keys/worker/tsa_host_key.pub \
    --tsa-worker-private-key /concourse/keys/worker/worker_key

    Several of these values should definitely be changed for your environment: --external-url uses the IP address of the VM it’s running on, and the username and password values in the --postgres-data-source connection string should reflect what you set up earlier. Note the trailing & on the web command so that the worker can start alongside it. Save the file and be sure to set it as executable (chmod +x ./start.sh)

  6. Run the script “./start.sh”. You should see several lines go by concerning worker-collectors and builder-reapers.
    • If you instead see a message about authentication, you’ll want to make sure that 1) the credentials in the script are correct, 2) the account’s password has actually been set in Linux and in postgres and 3) the pg_hba.conf file has been updated to use md5 instead of ident
    • If you instead see a message about the connection not accepting SSL, be sure that the connection string in the script includes “?sslmode=disable” after the database name
  7. Test by pointing a browser at the value you assigned to the external_url. You should see “no pipelines configured”

    Success!
  8. Back in your SSH session, you can kill it with <CTRL>+C

Finishing Up
Now we just have to make sure that concourse starts when the system reboots. I am certain that there are better/safer/more reliable ways to do this, but here’s what I did:

echo "/concourse/start.sh" >> /etc/rc.d/rc.local
chmod +x /etc/rc.d/rc.local

Now, reboot your VM and retest the connectivity to the concourse page.

Thanks

EMC ECS Community Edition project for how to start the script on boot.

Mitchell Anicas’ very helpful post on setting up postgres on CentOS.

Concourse.ci for some wholly inadequate documentation

Configuring Replicated vPostgres for vCAC 6.x

This is the second in my series for building a fully distributed vCAC deployment.  In this part, we’re building the vPostgres database server with replication for use with vCAC 6.x.

I’m using v9.2.6.0. The vCAC 6.0 Support Matrix says 9.2.4 is supported but the PDF version of the Installation and Configuration guide says 9.2.4 or higher is supported.  I originally wanted to use 9.3.2.0 because the documentation includes replication, but I’m unsure whether it’s officially supported with vCAC 6.x yet.  We’ll still configure replication though 🙂  I’m going to front-end the vPostgres nodes with a vCNS Edge Gateway load balancer so that in the case of a failure, we don’t have to reconfigure the vCAC appliance database connection.  Updated documentation shows that for vCAC 6.0, vPostgres v9.2.4 is supported, v9.2.6 and v9.3.4 were untested.  For vCAC 6.1, versions 9.2.4, 9.2.6 and 9.3.4 are supported.
Prerequisites:

  • Reserve IP addresses for the appliances and the virtual IP.
  • Add DNS entries for the IP addresses.  I used vpostgres1 and vpostgres2 for the appliances and vpostgres as the virtual/load-balanced name/address.

vPostgres setup steps

  1. Download the VMware vFabric Postgres Appliance from my.vmware.com.
  2. Deploy the vFabric Postgres Appliance from OVF twice.  I named them vPostgres1 and vPostgres2.  vPostgres1 will be the master and vPostgres2 will be the slave.
  3. Power on vPostgres1, browse to https://vpostgres1:5480, logon as root using the password you entered during deployment.
  4. Configure the hostname (eg: vpostgres1.ragazzi.lab) and timezone
  5. Browse to https://vpostgres1:8443, leave the default values, enter your password and click “Connect” to enter the Enterprise Manager (vpgdbem)

    Login to vpgdbem
  6. Click on localhost:5432/DB Login Users to list the existing users (just “postgres” so far)
  7. Click the green plus to add a new DB Login user.  In the properties, enter “vcac” (or whatever you want) as the name, check “Enable login”, do not check “Can create DB login users” and set the password.  Click OK to save.

    Create vcac user
  8. Click on localhost:5432 to display the overview and list of databases (just “postgres” so far)
  9. Click the green plus to Create a new database.  Just enter “vcacdb” (or similar) for the name, set the Owner to “vcac”, add a comment if you wish and click “OK” to save. Click the refresh button (blue ccw arrow) to refresh the list.

    Create vcacdb database
  10. Expand the Databases item under the treeview and select your new “vcacdb” database.  The database overview should load, displaying the uptime, size and more.

    Select the database
  11. Toward the right side of the window is a button labelled “Enter SQL”, click it.
  12. In the SQL Script area, type the following:

    CREATE EXTENSION "hstore";
    CREATE EXTENSION "uuid-ossp";

    SQL Statements
  13. Click “Execute” and check the Output|Messages area for “SQL query succeeded”
  14. Click the X to close the SQL window
  15. Complete steps 2-4 for vpostgres2. Do not configure any users or databases on vPostgres2.
  16. SSH into vpostgres1, logon as “postgres”, not root.
  17. Run this command to create a replication user named “replicate”:

    v9.2: /opt/vmware/vpostgres/current/share/create_replication_user replicate

    v9.3: /opt/vmware/vpostgres/current/scripts/create_replication_user replicate

    You’ll be prompted for a password and confirmation.

    Create “replicate” user
  18. SSH into vpostgres2 as “postgres”
  19. Run this command to configure vpostgres2 as a replica:

    /opt/vmware/vpostgres/current/share/run_as_replica -h 192.168.101.31 -b -W -U replicate

    Obviously, replace the IP address with that of your master vPostgres server. First you’ll be prompted for the password for the “replicate” user, then you’ll confirm the authenticity of the connection, then you’ll be prompted to enter the password for the postgres user on the master. Next, you’ll confirm that you want to enable WAL archiving on the primary/master by typing “yes” and lastly, you’ll confirm your intention to overwrite the data directory with the databases from the master. It’ll copy the tablespace over.

    Configure replica and confirm
  20. Run this command on vpostgres1 to verify the replication:

    /opt/vmware/vpostgres/current/share/show_replication_status

    Replication Status

Load-Balancer setup steps
I’m going to use the load-balancer feature in vCloud Networking and Security Edge Gateway. It’s not the most intelligent Load-Balancer ever, but it’s what I have.

  1. Using vCNS Manager, locate the appropriate edge gateway, click Actions|Manage to open it for editing
  2. On the Configure Tab, edit the interface that will listen on the virtual IP
  3. Edit the Subnet and add the Virtual IP. It’s probably not the primary IP. Save and publish those changes

    Add the virtual IP to the Edge Gateway
  4. On the Load Balancer tab, on the Pools page, click “Enable”, then “Publish Changes”

    Enable Load Balancer
  5. Click the green plus to add a load-balancing pool
  6. Enter a recognizable Name and Description, click “Next”
  7. On the Services step, check only TCP, set Balancing Method to “ROUND_ROBIN” and the Port to 5432. Click “Next”
  8. On the Health Check step, set it as shown. Click “Next” when done.
  9. On the Members step, click the green plus to add the IP addresses of your vPostgres servers to the pool. Add the primary/master vPostgres server with a weight of 1 or higher.  Add the slave/replica with a weight of 0 (zero).  This will ensure all of the traffic goes to the primary until it is changed in the event of a primary failure. Keep the TCP port and Monitor Port at 5432 for each. Click “Next” once all your members are added.
  10. Review the Ready to complete step and click “Finish” if it is all correct
  11. Click the Publish Changes Button before proceeding
  12. Click the “Virtual Servers” link, then the green plus to add a Virtual Server
  13. Enter a meaningful name and description, provide the Virtual IP address that you added to the edge earlier, select the Pool created in the steps above and Enable TCP on port 5432. Make sure the “Enabled” box is checked. Click “Add” then “Publish Changes”

Now, when you configure your vCAC Appliance, provide the host name that resolves to the virtual IP address.

Dealing with a failure
By default, the replica acts like a read-only copy of the database. It has a very short replication delay, so do not count on it to save you if you delete things from the primary.

When to promote a replica:

  • You’ve screwed up the network settings on the primary vPostgres node beyond repair; preventing vCAC from using it and replication from occurring
  • You’ve applied an update to the primary vPostgres node that broke it; preventing vCAC from using it and replication from occurring

When to NOT promote a replica

  • You deleted a bunch of stuff from vCAC. Too late! Those changes have already replicated
  • The physical host where the primary vPostgres virtual appliance was running has failed. Just wait for vSphere HA to bring it back online
  • You want to see it run active/active. It doesn’t do that. Relax

Recovery Procedure

  1. See if the primary/master node is up. If it is, stop here.
  2. Using the vCNS Manager web interface, edit the load-balancing pool, setting the weight for vpostgres1 (which has failed) to 0 (zero) and the weight for vpostgres2 (which we’re going to promote) to 1. Save and publish changes.
  3. SSH into the slave (vpostgres2) as postgres
  4. Run this command to promote the slave to master:

    /opt/vmware/vpostgres/current/share/promote_replica_to_primary

    The response will be “server promoting”

  5. If/When vpostgres1 comes back to life, you’ll need to configure it as a replica to vpostgres2. Do this by running the command from step 19 above, as sketched below.
  6. Now if you want to make vpostgres1 primary again, I strongly suggest you stop the vcac_service on your vCAC appliances. Then, you’ll just promote it like you did before and make vpostgres2 a replica again.
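
For step 5, here is a sketch of the re-replication command, run on vpostgres1 as the postgres user and assuming the same paths as above (substitute the IP address of vpostgres2):

    /opt/vmware/vpostgres/current/share/run_as_replica -h <IP-of-vpostgres2> -b -W -U replicate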