Configuring Replicated vPostgres for vCAC 6.x

This is the second in my series for building a fully distributed vCAC deployment.  In this part, we’re building the vPostgres database server with replication for use with vCAC 6.x.

I’m using v9.2.6.0. The vCAC 6.0 Support Matrix says 9.2.4 is supported but the PDF version of the Installation and Configuration guide says 9.2.4 or higher is supported.  I originally wanted to use 9.3.2.0 because the documentation includes replication, but I’m unsure whether it’s officially supported with vCAC 6.x yet.  We’ll still configure replication though 🙂  I’m going to front-end the vPostgres nodes with a vCNS Edge Gateway load balancer so that in the case of a failure, we don’t have to reconfigure the vCAC appliance database connection.  Updated documentation shows that for vCAC 6.0, vPostgres v9.2.4 is supported, v9.2.6 and v9.3.4 were untested.  For vCAC 6.1, versions 9.2.4, 9.2.6 and 9.3.4 are supported.
Prerequisites:

  • Reserve IP addresses for the appliances and the virtual IP.
  • Add DNS entries for the IP addresses.  I used vpostgres1 for vpostgres2 for the appliances and vpostgres as the virtual/load-balanced name/address.

vProgres setup Steps

  1. Download the VMware vFabric Postgres Appliance from my.vmware.com.
  2. Deploy the vFabric Postgres Appliance from OVF twice.  I named them vPostgres1 and vPostgres2.  vPostgres1 will be the master and vPostgres2 will be the slave.
  3. Power on vPostgres1, browse to https://vpostgres1:5480, logon as root using the password you entered during deployment.
  4. Configure the hostname (eg: vpostgres1.ragazzi.lab) and timezone
  5. Browse to https://vpostgres1:8443, leave the default values, enter your password and click “Connect” to enter the Enterprise Manager (vpgdbem)

    Login to vpgdbem
    Login to vpgdbem
  6. Click on localhost:5432/DB Login Users to list the existing users (just “postgres” so far)
  7. Click the green plus to add a new DB Login user.  In the properties, enter “vcac” (or whatever you want) as the name, check “Enable login”, do not check “Can create DB login users” and set the password.  Click OK to save.

    Create vcac user
    Create vcac user
  8. Click on localhost:5432 to display the overview and list of databases (just “postgres” so far)
  9. Click the green plus to Create a new database.  Just enter “vcacdb” (or similar) for the name, set the Owner to “vcac”, add a comment if you wish and click “OK” to save. Click the refresh button (blue ccw arrow) to refresh the list.

    Create vcacdb database
    Create vcacdb database
  10. Expand the Databases item under the treeview and select your new “vcacdb” database.  The database overview should load, displaying the uptime, size and more.

    Select the database
    Select the database
  11. Toward the right side of the window is a button labelled “Enter SQL”, click it.
  12. In the SQL Script area, type the following:

    CREATE EXTENSION "hstore";
    CREATE EXTENSION "uuid-ossp"

    SQL Statements
    SQL Statements
  13. Click “Execute” and check the Output|Messages area for SQL query succeeded
  14. Click the X to close the SQL window
  15. Complete steps 2-4 for vpostgres2. Do not configure any users or databases on vPostgres2.
  16. SSH into vpostgres1, logon as “postgres”, not root.
  17. Run this command to create a replication user named “replicate”:

    v9.2/opt/vmware/vpostgres/current/share/create_replication_user replicate

    v9.3/opt/vmware/vpostgres/current/scripts/create_replication_user replicate

    You’ll be prompted for a password and confirmation.

    Create "replicate" user
    Create “replicate” user
  18. SSH into on vpostgres2 as “postgres”
  19. Run this command to configure vpostgres2 as a replica:

    /opt/vmware/vpostgres/current/share/run_as_replica -h 192.168.101.31-b -W -U replicate

    Obviously, replace the red text with the IP address of the master vPostgres server. First you’ll be prompted for the password for the “replicate” user, then you’ll confirm the authenticity of the connection, then you’ll be prompted to enter the password for the postgres user on the master. Next, you’ll confirm that you want to enable WAL archiving on the primary/master by typing “yes” and lastly, you’ll confirm your intention to overwrite the data directory with the databases from the master. It’ll copy the tablespace over.

    Configure replica and confirm
    Configure replica and confirm
  20. Run this command on vpostgres1 to verify the replication:

    /opt/vmware/vpostgres/current/share/show_replication_status

    Replication Status
    Replication Status

Load-Balancer setup steps
I’m going to use the load-balancer feature in vCloud Networking and Security Edge Gateway. It’s not the most intelligent Load-Balancer ever, but it’s what I have.

  1. Using vCNS Manager, locate the appropriate edge gateway, click Actions|Manage to open it for editing
  2. On the Configure Tab, edit the interface that will listen on the virtual IP
  3. Edit the Subnet and add the Virtual IP. It’s probably not the primary IP. Save and publish those changes

    Add the virtual IP to the Edge Gateay
    Add the virtual IP to the Edge Gateay
  4. On the Load Balancer tab, on the Pools page, click “Enable”, then “Publish Changes”Enable Load Balancer
  5. Click the green plus to add a load-balancing pool
  6. Enter a recognizable Name and Description, click “Next”
  7. On the Services step, check only TCP, set Balancing Method to “ROUND_ROBIN” and the Port to 5432. Click “Next”
  8. On the Health Check step, set it as shown. Click “Next” when done.
  9. On the members step, click the green plus to add the IP address of you SSO servers to the pool. Add the primary/master vPostgress server with a weight of 1 or higher.  Add the slave/replica with a weight of 0 (zero).  This will ensure all of the traffic goes to the primary until it is changed in the event of a primary failure. Keep the TCP port and Monitor Port at 5432 for each. Click “Next” once all you members are added.
  10. Review the Ready to complete step and click “Finish” if it all correct
  11. Click the Publish Changes Button before proceeding
  12. Click the “Virtual Servers” link, then the green plus to add a Virtual Server
  13. Enter a meaningful name and description, provide the Virtual IP adddress that you added to the edge earlier, select the Pool created in the steps above and Enable TCP on port 5432. Make sure the “Enabled” box is checked. Click “Add” then “Publish Changes”

Now, when you configure your vCAC Appliance, provide the host name that resolves to the virtual IP address.vpostgres-vCAC1

Dealing with a failure
By default, the replica acts like a read-only copy of the database. It has a very short replication delay, so do not count on it to save you if you delete things from the primary.

When to promote a replica:

  • You’ve screwed up the network settings on the primary vPostgres node beyond repair; preventing vCAC from using it and replication from occurring
  • You’ve applied an update to the primary vPostgres node that broke it; preventing vCAC from using it and replication from occurring

When to NOT promote a replica

  • You deleted a bunch of stuff from vCAC. Too late! Those changes have already replicated
  • The physical host where the primary vPostgres virtual appliance was running has failed. Just wait for vSphere HA to being it back online
  • You want to see it run active/active. It does’t do that. relax

Recovery Procedure

  1. See if the primary/master node is up. If it is, stop here.
  2. Using the vCNS Manager web interface, edit the load-balancing pool, setting the weight for vpostgres1 (which has failed) to 0 (zero) and the weight for vpostgres2 (which we’re going to promote) to 1. Save and publish changes.
  3. SSH into the slave (vpostgres2) as postgres
  4. Run this command to promote the slave to master:

    /opt/vmware/vpostgres/current/share/promote_replica_to_primary

    The response will be “server promoting

  5. If/When vpostgres1 comes back to life, you’ll need to configure it as a replica to vpostgres2. Do this by running the command from step 19 above.
  6. Now if you want to make vpostgres1 primary again, I strongly suggest you stop the vcac_service on your vCAC appliances. Then, you’ll just promote it like you did before and make vpostgres2 a replica again.