So, you’ve installed PKS and created a PKS cluster. Excellent! Now what?
We want to use helm charts to deploy applications. Many of the charts use PersistentVolumes, so getting PVs set up is our first step.
There are a couple of complicating factors to be aware of when it comes to PVs in a multi-AZ/multi-vSphere-Cluster environment. First, you probably have cluster-specific datastores – particularly if you are using Pivotal Ready Architecture and VSAN. These datastores are not suitable for PersistentVolumes consumed by applications deployed to our Kubernetes cluster, because the worker VMs are spread across vSphere clusters and a volume on one cluster’s datastore is not reachable from hosts in the others. To work around this, we’ll need to provide some shared storage to each host in each cluster. Probably the simplest way to do this is with an NFS share.
Prerequisites:
Common datastore; NFS share or iSCSI
In production, you’ll want a production-quality fault-tolerant solution for NFS or iSCSI, like Dell EMC Isilon. For this proof-of-concept, I’m going to use an existing NFS server, create a volume and share it to the hosts in the three vSphere clusters where the PKS workload VMs will run. In this case, the NFS datastore is named “sharednfs” ’cause I’m creative like that. Make sure that your hosts have adequate permissions to the share. Using VMFS on iSCSI is supported, just be aware that you may need to cable-up additional NICs if yours are already consumed by N-VDS and/or VSAN.
Workstation Prep
We’ll need a handful of command-line tools, so make sure your workstation has the PKS CLI and kubectl CLI from Pivotal and you’ve downloaded and extracted Helm.
PKS Cluster
We’ll want to provision a cluster using the PKS CLI tool. This document assumes that your cluster was provisioned successfully, but nothing else has been done to it. For my environment, I configured the “medium” plan to use 3 Masters and 3 Workers in all three AZs, then created the cluster with the command
Where “pks1cl1” is replaced by your cluster’s name, “api.pks1.lab13.myenv.lab” is replaced by the FQDN of your PKS API server, “pksadmin” is replaced by the username with admin rights to PKS and “my_password” is replaced with that account’s password.
Procedure:
Create storageclass
Create storageclass spec yaml. Note that the file is named storageclass-nfs.yml and we’re naming the storage class itself “nfs”:
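The spec itself is not reproduced above, so here is a minimal sketch of what storageclass-nfs.yml could contain, written as a small Python script that prints the manifest as JSON (which kubectl accepts in place of YAML). Only the class name “nfs” and the datastore name “sharednfs” come from the text; the in-tree vSphere provisioner, the thin disk format and the default-class annotation are assumptions.

```python
import json


def storage_class(name="nfs", datastore="sharednfs", default=True):
    """Build a StorageClass manifest backed by a vSphere datastore.

    Assumptions (not from the original post): the in-tree
    kubernetes.io/vsphere-volume provisioner and thin-provisioned disks.
    """
    manifest = {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name, "annotations": {}},
        "provisioner": "kubernetes.io/vsphere-volume",
        "parameters": {"datastore": datastore, "diskformat": "thin"},
    }
    if default:
        # Marks this class as the cluster default, so charts that do not
        # name a storage class still get their volumes from it.
        manifest["metadata"]["annotations"][
            "storageclass.kubernetes.io/is-default-class"
        ] = "true"
    return manifest


# Print the manifest; redirect the output into storageclass-nfs.yml.
print(json.dumps(storage_class(), indent=2))
```

Because JSON is also valid YAML, the output can be saved under the file name used above and applied with kubectl as usual.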
Create a sample PVC (Persistent Volume Claim). Note that the file is named pvc-sample.yml, the PVC name is “pvc-sample” and it uses the “nfs” storageclass we created above. This step is not absolutely necessary, but will help confirm we can use the storage.
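As a rough illustration of the sample claim, the sketch below builds a PVC manifest in Python and prints it as JSON. The names “pvc-sample” and “nfs” are from the text; the 2Gi size and the ReadWriteOnce access mode are assumptions chosen for the example.

```python
import json


def sample_pvc(name="pvc-sample", storage_class="nfs", size="2Gi"):
    """Build a PersistentVolumeClaim that requests storage from a class.

    The size and access mode are illustrative assumptions.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "storageClassName": storage_class,
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


# Print the manifest; redirect the output into pvc-sample.yml.
print(json.dumps(sample_pvc(), indent=2))
```

Once the claim is created, it should show a status of Bound in the PVC listing if the storage class is working.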
Look for a version number in the output; note that it might take a few seconds for Tiller in the cluster to become ready.
Deploy sample helm chart
Update helm local chart repository. We do this so that we can be sure that helm can reach the public repo and to cache the latest information to our local repo.
helm repo update
If this step results in a certificate error, you may have to add the cert to the trusted certificates on the workstation.
Install helm chart with ingress enabled. Here, I’ve selected the Dokuwiki app. The command below will enable ingress, so we can access it via routable IP and it will use the default storageclass we configured earlier.
Edit – April 23 2019 – Passing the credentials in here makes connecting easier later.
Confirm that the app was deployed
helm list
kubectl get pods -n default
kubectl get services -n default
From the get services results, make a note of the external IP address – in the example above, it’s 192.13.6.73
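If you would rather pull the external address out programmatically, kubectl can print the service list as JSON with -o json. The sketch below shows one way to read the load-balancer addresses from that structure; the sample data is a trimmed, hypothetical stand-in for real output, and the service name “dokuwiki-example” is made up.

```python
import json


def external_ips(service_list):
    """Return {service name: [external IPs]} for services that have any."""
    found = {}
    for svc in service_list.get("items", []):
        # status.loadBalancer.ingress is absent or empty for ClusterIP services
        ingress = svc.get("status", {}).get("loadBalancer", {}).get("ingress") or []
        ips = [entry["ip"] for entry in ingress if "ip" in entry]
        if ips:
            found[svc["metadata"]["name"]] = ips
    return found


# Hypothetical, trimmed sample of "kubectl get services -n default -o json"
SAMPLE = json.loads("""
{
  "items": [
    {"metadata": {"name": "kubernetes"},
     "spec": {"type": "ClusterIP"},
     "status": {"loadBalancer": {}}},
    {"metadata": {"name": "dokuwiki-example"},
     "spec": {"type": "LoadBalancer"},
     "status": {"loadBalancer": {"ingress": [{"ip": "192.13.6.73"}]}}}
  ]
}
""")

print(external_ips(SAMPLE))
```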
Point a browser at the external address from the previous step and marvel at your success in deploying Dokuwiki via helm to Kubernetes! If you want to actually login to your Dokuwiki instance, first obtain the password for the user account with this command:
Then login with username “user” and that password.
Edit – 04/23/19 – Login with the username and password you included in the helm install command
Additional info
View Persistent Volume Claims with
kubectl get pvc -n default
This will list the PVCs and the volumes in the “default” namespace. Note the volume corresponds to the name of the VMDK on the datastore.
Load-Balancer
Notice that since we are leveraging the NSX-T Container Networking Interface and enabled the ingress when we installed dokuwiki, a load-balancer in NSX-T was automatically created for us to point to the application.
This took me some time to figure out; had to weed through a lot of documentation – some of which contradicted itself and quite a bit of trial-and-error. I hope this helps save someone time later!
Edit – 2/1/17 – Updated with OpenSSL configuration detail
Edit – 3/20/17 – Updated SubjectAltNames in config
Preparation
SSL Certificate. You’ll need the signed public cert for your URL (certnew.cer), the associated private key (pcf.key) and the public cert of the signing CA (root64.cer).
Download and install OpenSSL
Create a config file for your request – paste this into a text file:
[ req_distinguished_name ]
countryName = US
stateOrProvinceName = State
localityName = City
0.organizationName = Company Name
organizationalUnitName = PCF
commonName = *.pcf.domain.com
Replace the values in red with those appropriate for your environment. Be sure to specify the server name and IP address as the Virtual IP and its associated DNS record. Save the file as pcf.cfg. You’ll want to use the wildcard “base” name as the common name and the server name, as well as the *.system, *.apps, *.login.system and *.uaa.system Subject Alt Names.
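To make the Subject Alt Names concrete, here is a small sketch that builds the subjectAltName line for the config file from the wildcard base domain. The list of prefixes comes from the paragraph above; the function name is just for illustration, and “pcf.domain.com” is the placeholder from the sample config.

```python
def subject_alt_names(base="pcf.domain.com"):
    """Build the subjectAltName value covering the PCF wildcard domains."""
    prefixes = ["*", "*.system", "*.apps", "*.login.system", "*.uaa.system"]
    return ", ".join(f"DNS:{prefix}.{base}" for prefix in prefixes)


# Line to place in the v3 extensions section of pcf.cfg
print("subjectAltName = " + subject_alt_names())
```

Swap in your own base domain (pcf.ragazzilab.com in my lab, for instance) before pasting the result into the config.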
Use OpenSSL to create the Certificate Signing Request (CSR) for the wildcard PCF domain.
Use OpenSSL to convert the key to RSA (required for NSX to accept it)
openssl rsa -in pcf.key -out pcfrsa.key
Submit the CSR (pcf.csr) to your CA (Microsoft Certificate Services in my case), retrieve the certificate (certnew.cer) and certificate chain (certnew.p7b) base-64 encoded.
Double-click certnew.p7b to open certmgr. Export the CA certificate as Base-64 encoded X.509 to a file (root64.cer is the file name I use)
Networks. You’ll need to know what layer 3 networks the PCF components will use. In my case, I set up a logical switch in NSX and assigned the gateway address to the DLR. You should probably make this a /24 network, so there’s room to grow without reserving a ridiculous number of addresses. We’re going to carve up the address space a little, so make a note of the following:
Gateway and other addresses you typically reserve for network devices. (eg: first 9 addresses 1-9)
Address that will be assigned to the NSX load balancer. Just need one (eg: 10)
Addresses that will be used by the PCF Routers. At least two. These will be configured as members in the NSX Load Balancer Pool.
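The carve-up above can be sketched with Python’s ipaddress module. The 192.168.10.0/24 network is a made-up example, and placing the routers immediately after the load balancer address is an assumption; the first-nine and tenth-address split follows the examples in the list.

```python
import ipaddress


def plan_addresses(cidr, router_count=2):
    """Split the start of a /24 into device, load balancer and router addresses."""
    hosts = list(ipaddress.ip_network(cidr).hosts())  # .1 through .254 for a /24
    return {
        "network_devices": hosts[0:9],            # .1 through .9
        "load_balancer": hosts[9],                # .10
        "routers": hosts[10:10 + router_count],   # .11 onward (assumed placement)
    }


plan = plan_addresses("192.168.10.0/24")
print("Load balancer VIP:", plan["load_balancer"])
print("Router IPs:", ",".join(str(ip) for ip in plan["routers"]))
```

The comma-separated router list is the form you will need later when filling in the Router IPs in Elastic Runtime.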
DNS, IP addresses. PCF will use “system” and “apps” subdomains, plus whatever names you give any apps deployed. This takes some getting used to – not your typical application. Based on the certificate we created earlier, I recommend just creating a “pcf” subdomain. In my case, the network domain (using AD-DNS) is ragazzilab.com and I’ve created the following:
pcf.ragazzilab.com subdomain
*.pcf.ragazzilab.com A record for the IP address I’m going to assign to the NSX Load-Balancer
NSX
Assuming NSX is already installed and configured. Create or identify an existing NSX Edge that has an interface on the network where PCF will be / is deployed.
Assign the address we noted above to the interface under Settings|Interfaces
Under Settings|Certificates, add our SSL certificates
Click the Green Plus and select “CA Certificate”. Paste the content of the signing CA public certificate (root64.cer) into the Certificate Contents box. Click OK.
Click the Green Plus and select “Certificate”. Paste the content of the signed public cert (certnew.cer) into the Certificate Contents box and paste the content of the RSA private key (pcfrsa.key) into the Private Key box. Click OK.
Under Load Balancer, create an Application Profile. We need to ensure that NSX inserts the X-Forwarded-For HTTP header. To do that, NSX has to be able to decrypt the request, so we must provide the certificate information. I found that Pool Side SSL had to be enabled, using the same Service and CA Certificates.
Router Application Profile
Create the Service Monitor. What worked for me is a little different from what is described in the GoRouter project page. The key points are that we want to specify the user agent and look for a response of “ok” with a header of “200 OK”.
Service Monitor for PCF Router
Create the Pool. Set it to ROUND-ROBIN using the Service Monitor you just created. When adding the routers as members, be sure to set the port to 443, but the Monitor Port to 80.
Router Pool
Create the Virtual Server. Specify the Application Profile and default Pool we just created. Obviously, specify the correct IP Address.
Virtual Server Configuration
PCF – Ops Manager
Assuming you’ve already deployed the Ops Manager OVF, use the installation dashboard to edit the configuration for Ops Manager Director. I’m just going to highlight the relevant areas of the configuration here:
Networks. Under “Create Networks”, be sure that the Subnet specified has the correct values. Pay special attention to the reserved IP ranges. These should be the addresses of the network devices and the IP address assigned to the load-balancer. Do not include the addresses we intend to use for the routers though. Based on the example values above, we’ll reserve the first 10 addresses.
Ops Manager Network Config
Ops Manager Director will probably use the first/lowest address in range that is not reserved.
PCF – Elastic Runtime
Next, we’ll install Elastic Runtime. Again, I’ll highlight the relevant sections of the configuration.
Domains. In my case it’s System Domain = system.pcf.ragazzilab.com and Apps Domain = apps.pcf.ragazzilab.com
Networking.
Set the Router IPs to the addresses (comma-separated) you noted and added as members of the NSX load-balancer pool earlier.
Leave HAProxy IPs empty
Select the point-of-entry option for “external load balancer, and it can forward encrypted traffic”
Paste the content of the signed certificate (certnew.cer) into the Certificate PEM field. Paste the content of the CA public certificate (root64.cer) into the same field, directly under the certificate content.
Paste the content of the private key (pcf.key) into the Private Key PEM field.
Check “Disable SSL Certificate verification for this environment”.
Resource Config. Be sure that the number of Routers is at least 2 and equal to the number of IP addresses you reserved for them.
Troubleshooting
Help! The Pool Status is down when the Service Monitor is enabled.
This could occur if your routers are behaving differently from mine. Test the response by sending a request to one of the routers through curl and specifying the user agent as HTTP-Monitor/1.1
curl -v -A "HTTP-Monitor/1.1" "http://{IP of router}"
Testing router with curl
The value in the yellow box should go into the “Expected” field of the Service Monitor and the value in the red box should go into the “Receive” field. Note that you should not get a 404 response; if you do, check that the user agent is set correctly.
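The same check can be scripted. This is a minimal sketch using only the Python standard library: it sends a GET to a router with the monitor’s user agent and returns the status line text and the body, which are the two values the Service Monitor fields are filled from. Port 80 matches the Monitor Port used in the pool; the function name and defaults are illustrative assumptions.

```python
import http.client


def probe(host, port=80, user_agent="HTTP-Monitor/1.1", timeout=5):
    """Send GET / with the given User-Agent and return (status text, body)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/", headers={"User-Agent": user_agent})
        resp = conn.getresponse()
        body = resp.read().decode("utf-8", "replace").strip()
        return f"{resp.status} {resp.reason}", body
    finally:
        conn.close()


# Example call (replace the address with one of your router IPs):
# status, body = probe("192.168.10.11")
# print(status, repr(body))
```

If the status comes back as a 404, try the call again and confirm the user agent string is exactly what the Service Monitor sends.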
Notes
This works for me and I hope it works for you. If you have trouble or disagree, please let me know.
This is the third post in my series for building a fully distributed vCloud Automation Center deployment. In this post, we’ll configure vCenter Orchestrator (vCO) for High Availability using two nodes and a vCloud Networking and Security Edge Gateway as a Load Balancer. I’ll use the vCenter Orchestrator Appliance v5.5.1.0.1617225. I want to ensure that both vCO nodes return the same, organizationally-trusted SSL certificate, so we’ll configure that too.
Prerequisites
Database Server (ideally, it should be configured for high availability – I’ll be using a Microsoft SQL Server 2012 Failover Cluster)
Database for vCO
Credentials for database
Reserve IP addresses for two nodes and virtual IP
DNS records for both nodes and virtual IP (I’m using vcvco1 and vcvco2 for the appliance nodes and vcvco as the virtual)
Appropriate Identity Sources added to SSO
A vCO administrators security group with appropriate members
An Active Directory integrated Certificate Authority
Notes
In the steps below, text in red is not meant to be typed verbatim. You’ll replace the value with something relevant to your environment.
Configure database settings (MSSQL)
To ensure that multiple Orchestrator nodes can use the database without clashing, you’ll need to enable a couple of optional settings.
This can be done through script:
ALTER DATABASE [vcvCO] SET ALLOW_SNAPSHOT_ISOLATION ON;
GO
ALTER DATABASE [vcvCO] SET READ_COMMITTED_SNAPSHOT ON;
GO
Or through the SSMS GUI:
Enable Miscellaneous options for the vCO database
Deploy and configure the First Orchestrator Appliance
Using the vSphere or vSphere Web Client, deploy the appliance from OVF to an available HA cluster. I named mine vcvco1.
Adjust the resources if necessary and power on vcvco1.
Browse to https://vcvco1:5480, logon as root
Set the timezone, confirm the network settings and hostname. I set the hostname to vcvco, the cluster name. Log out of the VAMI.
Browse to https://vcvco1:8283, logon as vmware
Navigate to the Network section
(Optional) on the Network tab, set the IP address to the actual address. Leave the port numbers at default
On the SSL Trust Manager tab, type the URL to your SSO server (eg: https://vcsso.domain.local:7444) and click the Import button. Verify that the certificate information is correct and click Import to add it to the trust. Repeat this for your vCenter Server(s).
In the Authentication section, you can choose LDAP or SSO. I’m going to configure it for SSO. Enter your SSO hostname (eg: vcsso.domain.local). Click the Advanced Settings link to see and verify that the Token service and Admin service URLs are fully populated with the correct port number (7444). Enter the user name and password for an SSO administrator (eg: administrator@vsphere.local) in the appropriate boxes. Click the Register Orchestrator button. Wait for it…
Registered with SSO, but not configured
After the registration is confirmed, select the correct group in the vCO Admin – domain and group dropdown list. Then, click the Accept Orchestrator Configuration button.
In the Database section; again I’m using SQL Server, but you’d select what’s appropriate for your environment.
After the connection is made, click the link to Create the database tables, then Apply Changes.
On the Licenses section, enter the host name of the vCenter Server and credentials, then click Apply Changes.
Install any plugins you need (vCAC, ViPR, Powershell, etc) and restart the service to complete the plugin installation.
Create Package Signing Certificate
On the Server Certificate section, click the “Create a certificate database and self-signed server certificate” link. Enter vcvco.domain.local – that’s the load-balanced name, not the actual hostname – for the Common Name, set the organization, ou and country, then click Create.
Still in the Server Certificate section, click “Export a certificate signing request”. Save the vCO_SigningRequest.csr file to your system.
Log into the Microsoft CA certificate authority Web interface. By default, it is http://servername/CertSrv/.
Click the Request a certificate link. Click advanced certificate request.
Click the Submit a certificate request by using a base-64-encoded CMC or PKCS #10 file, or submit a renewal request by using a base-64-encoded PKCS #7 file link.
Open the certificate request (vCO_SigningRequest.csr) in notepad. Copy the content between -----BEGIN CERTIFICATE REQUEST----- and -----END CERTIFICATE REQUEST-----
Paste the copied content into the “Base-64-encoded certificate request” textarea. Select Web Server as the Certificate Template.
Click Submit to submit the request.
Click Base 64 encoded on the Certificate issued screen. Click the Download Certificate Chain link.
Save the package as C:\certs\certnew.p7b.
Double-click the p7b to open it in certmgr. Navigate to Certificates – Current User\C:\Certs\Certnew.p7b\Certificates.
Certs in P7b
You’ll see two certificates here (unless you have intermediate certificates, then you’ll have more).
Right-click the one for the vCO server, choose All Tasks|Export. Save the file as Base-64 encoded X.509 (.CER) as vco.crt
Right-click the one for the root CA server, choose All Tasks|Export. Save the file as Base-64 encoded X.509 (.CER) as root.cer. Close certmgr.
Before vCO will accept the CA-signed certificate, we have to import the root certificate. Launch the Orchestrator Client. You can use https://vcvco1.domain.local:8281/vco/client/client.jnlp
Log in to the vCO client as a member of the vCO Admins group
Login to vCO Client
In the client, launch Certificate Manager from Tools|Certificate Manager.
Under Known Certificates, click the “Import Certificate” button. Browse to and select root.cer that you saved earlier. Verify that the certificate details are correct and click the “Import Certificate” button to finish. Close or minimize the vCO Client.
Back on the Server Certificate section of the vCO configuration, click “Import a certificate signing request signed by a certificate authority”. Select the vco.crt file you saved and click import. If you get an error here, make sure you’ve imported the correct root (and any intermediate) cert into vCO.
Replace vCO Client certificate
Now, if you navigate to https://vcvco1.domain.local:8281/vco, you’ll see that the certificate is still untrusted. Let’s fix that. The certificate and key are stored with a specific alias and password; we’re going to replace them, but reuse the alias and password.
SSH into vcvco1 as root
Navigate to /etc/vco/app-server/security and make a copy of the jssecacerts keystore file
cd /etc/vco/app-server/security
cp ./jssecacerts ./jssecacerts.backup
Use keytool to delete the item with the “dunes” alias. The keystore password is “dunesdunes”
Shut down the first vCO appliance (vcvco1) to be safe
Clone vcvco1 to a new VM named vcvco2, be sure to update the hostname and IP address in the vApp Properties. (Although it doesn’t affect the guest OS in this case)
The cloned VM will retain the original IP address and hostname, so browse to https://vcvco1:5480, logon as root and set the correct IP address and hostname.
Once vcvco2 is on the correct IP address, you can power on vcvco1
Browse to https://vcvco2:8283, logon as vmware.
On the Network area, select the correct IP address and apply changes.
Configure the cluster
Cluster mode, both nodes up
Browse to the vCO Configuration web interface, https://vcvco1:8283. Logon as vmware.
Under Server Availability, select Cluster mode
Set the number of active nodes to 2, leave the heartbeat values at default unless you have a reason to change them. Click “Apply Changes”. Note that there will be times when you’ll have to set the number of active nodes to 1.
Under Startup Options, restart service. This may not be necessary, but in my case, the nodes were not listed until after I restarted the vCO service.
Repeat steps 1-4 on vcvco2
Preparing to load-balance
Note – this worked for me, YMMV
Using vCNS Manager, locate the appropriate edge gateway, click Actions|Manage to open it for editing
On the Configure Tab, edit the interface that will listen on the virtual IP
Edit the Subnet and add the Virtual IP. It’s probably not the primary IP. Save and publish those changes.
Add the virtual IP to the Edge Gateway
On the Load Balancer tab, on the Pools page, click “Enable”, then “Publish Changes”
Click the green plus to add a load-balancing pool
Enter a recognizable Name and Description, click “Next”.
Load Balancer Pool
On the Services step, check HTTPS, set Balancing Method to “ROUND_ROBIN” and the Port to 8281. Click “Next”.
Services (HTTPS:8281)
On the Health Check step, set it as shown. Click “Next” when done.
Health Check
On the members step, click the green plus to add the IP addresses of your vCO servers to the pool. I suggest keeping the weight for each at 1 while both nodes are active. There are times when you’ll want to make only one node active, though (details below). Keep the HTTPS port and Monitor Port at 8281 for each. Click “Next” once all your members are added.
vCO Members
Review the Ready to complete step and click “Finish” if it is all correct
Click the Publish Changes Button before proceeding
Click the “Virtual Servers” link, then the green plus to add a Virtual Server
vCO Virtual Server
Enter a meaningful name and description, provide the Virtual IP address that you added to the edge earlier, select the Pool created in the steps above and Enable HTTPS on port 8281. Set the Persistence Method to SSL_SESSION_ID and make sure the “Enabled” box is checked. Click “Add” then “Publish Changes”
Test by navigating to https://vcvco.domain.local:8281/vco and verifying that the certificate matches.
IMPORTANT UPDATE! – Repeat steps 7-14 above for TCP 8286 and 8287. Without these undocumented ports, neither the vCO client nor the vCAC appliance will connect to the vCO cluster.
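For the certificate test in the steps above, a quick way to confirm that both nodes and the virtual name hand out the same certificate is to compare fingerprints. This is a minimal sketch using the Python standard library; the host names in the usage comment are the ones from this post, and the helper names are made up for the example.

```python
import hashlib
import ssl


def fingerprint(pem):
    """Return the SHA-256 fingerprint (hex) of a PEM-encoded certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem)
    return hashlib.sha256(der).hexdigest()


def fetch_fingerprint(host, port=8281):
    """Fetch the certificate a server presents and fingerprint it.

    The certificate is retrieved without validation; we only compare it.
    """
    return fingerprint(ssl.get_server_certificate((host, port)))


def all_match(fingerprints):
    """True when every host in the mapping presented the same certificate."""
    return len(set(fingerprints.values())) == 1


# Example usage against the names used in this post:
# hosts = ["vcvco1.domain.local", "vcvco2.domain.local", "vcvco.domain.local"]
# prints = {h: fetch_fingerprint(h) for h in hosts}
# print(prints, all_match(prints))
```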
Additional steps
Put the two vCO nodes in a vApp and set them to start a few minutes apart to prevent both nodes from trying to initialize the database concurrently.
Use vApp to stagger the startup of the vCO nodes
Notes, Caveats and Warnings
When writing information to vCO, such as designing and importing new workflows, VMware requires that only one vCO node be active. I suggest that before you connect vCAC to vCO, you take the following steps:
Logon to vcvco1 configuration as vmware, set the number of active nodes under Server Availability to 1. Apply changes.
Logon to vcvco2 configuration as vmware, set the number of active nodes under Server Availability to 1. Apply changes.
Watch the Service Availability area, wait for it to indicate that one node is in standby. If you’re impatient as I am, you can restart the service on vcvco2. It should come up as standby. Record which node is RUNNING.
Logon to vCNS Manager, locate the appropriate Edge Gateway for the vcvco virtual server.
Edit the Load Balancer pool, leave the RUNNING node with a weight of 1, set all other nodes’ weight to zero
Once the workflows have been created and edited and you want to resume distribution of vCO jobs among the nodes, just reverse these changes, setting the active nodes to 2 and the weights to 1 for both nodes.
Do not connect the vCO client to the virtual address. In this case, only TCP 8281 is forwarded and the vCO client needs additional ports forwarded to the nodes. Other load-balancers/NAT devices may not have this issue.
This post may get some edits as I work through the rest of the vCAC distributed build.
I still have no idea why the certificate alias and password are “dunes”. UPDATE – The company that was bought by VMware that originally developed the product that is now vCO was named “Dunes”.
This is the second in my series for building a fully distributed vCAC deployment. In this part, we’re building the vPostgres database server with replication for use with vCAC 6.x.
I’m using v9.2.6.0. The vCAC 6.0 Support Matrix says 9.2.4 is supported but the PDF version of the Installation and Configuration guide says 9.2.4 or higher is supported. I originally wanted to use 9.3.2.0 because the documentation includes replication, but I’m unsure whether it’s officially supported with vCAC 6.x yet. We’ll still configure replication though 🙂
I’m going to front-end the vPostgres nodes with a vCNS Edge Gateway load balancer so that in the case of a failure, we don’t have to reconfigure the vCAC appliance database connection.
Updated documentation shows that for vCAC 6.0, vPostgres v9.2.4 is supported, v9.2.6 and v9.3.4 were untested. For vCAC 6.1, versions 9.2.4, 9.2.6 and 9.3.4 are supported.
Prerequisites:
Reserve IP addresses for the appliances and the virtual IP.
Add DNS entries for the IP addresses. I used vpostgres1 and vpostgres2 for the appliances and vpostgres as the virtual/load-balanced name/address.
vPostgres setup steps
Download the VMware vFabric Postgres Appliance from my.vmware.com.
Deploy the vFabric Postgres Appliance from OVF twice. I named them vPostgres1 and vPostgres2. vPostgres1 will be the master and vPostgres2 will be the slave.
Power on vPostgres1, browse to https://vpostgres1:5480, logon as root using the password you entered during deployment.
Configure the hostname (eg: vpostgres1.ragazzi.lab) and timezone
Browse to https://vpostgres1:8443, leave the default values, enter your password and click “Connect” to enter the Enterprise Manager (vpgdbem)
Login to vpgdbem
Click on localhost:5432/DB Login Users to list the existing users (just “postgres” so far)
Click the green plus to add a new DB Login user. In the properties, enter “vcac” (or whatever you want) as the name, check “Enable login”, do not check “Can create DB login users” and set the password. Click OK to save.
Create vcac user
Click on localhost:5432 to display the overview and list of databases (just “postgres” so far)
Click the green plus to Create a new database. Just enter “vcacdb” (or similar) for the name, set the Owner to “vcac”, add a comment if you wish and click “OK” to save. Click the refresh button (blue ccw arrow) to refresh the list.
Create vcacdb database
Expand the Databases item under the treeview and select your new “vcacdb” database. The database overview should load, displaying the uptime, size and more.
Select the database
Toward the right side of the window is a button labelled “Enter SQL”, click it.
Obviously, replace the red text with the IP address of the master vPostgres server. First you’ll be prompted for the password for the “replicate” user, then you’ll confirm the authenticity of the connection, then you’ll be prompted to enter the password for the postgres user on the master. Next, you’ll confirm that you want to enable WAL archiving on the primary/master by typing “yes” and lastly, you’ll confirm your intention to overwrite the data directory with the databases from the master. It’ll copy the tablespace over.
Configure replica and confirm
Run this command on vpostgres1 to verify the replication:
Load-Balancer setup steps
I’m going to use the load-balancer feature in vCloud Networking and Security Edge Gateway. It’s not the most intelligent Load-Balancer ever, but it’s what I have.
Using vCNS Manager, locate the appropriate edge gateway, click Actions|Manage to open it for editing
On the Configure Tab, edit the interface that will listen on the virtual IP
Edit the Subnet and add the Virtual IP. It’s probably not the primary IP. Save and publish those changes
Add the virtual IP to the Edge Gateway
On the Load Balancer tab, on the Pools page, click “Enable”, then “Publish Changes”
Click the green plus to add a load-balancing pool
Enter a recognizable Name and Description, click “Next”
On the Services step, check only TCP, set Balancing Method to “ROUND_ROBIN” and the Port to 5432. Click “Next”
On the Health Check step, set it as shown. Click “Next” when done.
On the members step, click the green plus to add the IP addresses of your vPostgres servers to the pool. Add the primary/master vPostgres server with a weight of 1 or higher. Add the slave/replica with a weight of 0 (zero). This will ensure all of the traffic goes to the primary until it is changed in the event of a primary failure. Keep the TCP port and Monitor Port at 5432 for each. Click “Next” once all your members are added.
Review the Ready to complete step and click “Finish” if it is all correct
Click the Publish Changes Button before proceeding
Click the “Virtual Servers” link, then the green plus to add a Virtual Server
Enter a meaningful name and description, provide the Virtual IP address that you added to the edge earlier, select the Pool created in the steps above and Enable TCP on port 5432. Make sure the “Enabled” box is checked. Click “Add” then “Publish Changes”
Now, when you configure your vCAC Appliance, provide the host name that resolves to the virtual IP address.
Dealing with a failure
By default, the replica acts like a read-only copy of the database. It has a very short replication delay, so do not count on it to save you if you delete things from the primary.
When to promote a replica:
You’ve screwed up the network settings on the primary vPostgres node beyond repair; preventing vCAC from using it and replication from occurring
You’ve applied an update to the primary vPostgres node that broke it; preventing vCAC from using it and replication from occurring
When to NOT promote a replica
You deleted a bunch of stuff from vCAC. Too late! Those changes have already replicated
The physical host where the primary vPostgres virtual appliance was running has failed. Just wait for vSphere HA to bring it back online
You want to see it run active/active. It doesn’t do that. Relax
Recovery Procedure
See if the primary/master node is up. If it is, stop here.
Using the vCNS Manager web interface, edit the load-balancing pool, setting the weight for vpostgres1 (which has failed) to 0 (zero) and the weight for vpostgres2 (which we’re going to promote) to 1. Save and publish changes.
If/When vpostgres1 comes back to life, you’ll need to configure it as a replica to vpostgres2. Do this by running the command from step 19 above.
Now if you want to make vpostgres1 primary again, I strongly suggest you stop the vcac_service on your vCAC appliances. Then, you’ll just promote it like you did before and make vpostgres2 a replica again.
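The weight changes in this procedure follow one simple rule: whichever node is currently the primary gets a weight of 1 and every other member gets 0. The sketch below only models that rule so you can see the before and after states; it does not talk to vCNS, and the function name is made up for the example.

```python
def pool_weights(members, active):
    """Return the pool weights that send all traffic to the active node."""
    if active not in members:
        raise ValueError(f"{active} is not a pool member")
    return {member: (1 if member == active else 0) for member in members}


members = ["vpostgres1", "vpostgres2"]
print("Normal operation:", pool_weights(members, "vpostgres1"))
print("After promoting the replica:", pool_weights(members, "vpostgres2"))
```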
I love the simplicity of the vCenter Server Appliance and the VMware Identity Appliance for vCAC, but neither offers a high availability option better than vSphere HA. There are use cases where you’d need your SSO service to offer better uptime and resilience. In addition, there are some SSL certificates to be configured and for that, we’ll follow the instructions in KB2034833, KB2061934 and KB2034181.
Notes, caveats, warnings
AFAIK, this will only work with vSphere 5.5. v5.1 handles SSO differently. I’m only using two nodes, if you have more, there will be extra steps. I do not have intermediate CAs, if you do, consult the KBs for the additional steps. I’m going to use a vCloud Networking and Security Edge Gateway as my load-balancer. It does not offer SSL offload like some other load-balancers do, so you may have to take extra steps to configure SSL offload.
Here’s what I have in mind; load-balanced SSO Servers
Prerequisites
Reserve the IP addresses for your actual SSO servers, plus the Virtual IP address.
Add A or CNAME records to your DNS for the SSO servers and the virtual IP.
The DNS name of the virtual IP is what the SSL certificate must match (vcsso in my case)
You should have an edge gateway already configured with an interface in the same networks as your virtual IP and actual SSO servers.
First SSO Server
I’m starting with two freshly deployed Windows Server 2008 R2 VMs, joined to the domain and named vcsso1 and vcsso2. On vcsso1, install the Single Sign-on service. Be sure the prerequisites are all ok.
On the deployment mode step, choose “vCenter Single Sign-On for your first vCenter Server”
Next, Next, Finish your way through the installation. You’ve set up an SSO server, YAY!
Second SSO Server
On the second server, vcsso2, also install the SSO service. We’re going to make a few different selections than we did on vcsso1 though. On the deployment mode step, here we’re going to select “vCenter Single Sign-On for an additional vCenter Server in an existing site”.
Next, we’re prompted for information about the first, or partner, SSO server.
We have to confirm that the information obtained from the first SSO server is correct, so click Continue.
Then we select the site name configured on the first SSO server. I named mine “Lab”, but you can leave yours as “Default-First-Site” or whatever makes sense for your environment.
From here, you’ll Next, Next, Finish your way to completion.
Generating the Cert
Prerequisites: Either the VMware ssl-certificate-updater-tool or OpenSSL Win32 v0.9.8
Log on to the first SSO server (vcsso1) and extract the VMware SSL certificate updater tool to C:\ssltool or similar. Create a folder named “C:\certs\sso”. Open notepad and paste the following:
[ req ]
default_bits = 2048
default_keyfile = rui.key
distinguished_name = req_distinguished_name
encrypt_key = no
prompt = no
string_mask = nombstr
req_extensions = v3_req
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = digitalSignature, keyEncipherment, dataEncipherment
extendedKeyUsage = serverAuth, clientAuth
subjectAltName = DNS:ServerShortName, IP:ServerIPAddress, DNS:server.domain.com, DNS:ServerIPAddress
[ req_distinguished_name ]
countryName = Country
stateOrProvinceName = State
localityName = City
0.organizationName = Company Name
organizationalUnitName = vCenterSSO
commonName = server.domain.com
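Replace the placeholder values (ServerShortName, ServerIPAddress, server.domain.com, Country, State, City and Company Name) with those appropriate for your environment. For the server name and IP address, be sure to use the Virtual IP and its associated DNS record, not those of an individual SSO server. Save the file as c:\certs\sso\sso.cfg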
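If you'd rather not hand-edit the template, here is a minimal Python sketch that fills in the same sso.cfg from a handful of values. The function name `render_sso_cfg` and the example values are my own placeholders, not anything from the VMware tooling; it only produces the text, you still save it to c:\certs\sso\sso.cfg yourself.

```python
from string import Template

# Mirrors the sso.cfg listing above; $-placeholders stand in for the values to replace.
SSO_CFG = Template("""\
[ req ]
default_bits = 2048
default_keyfile = rui.key
distinguished_name = req_distinguished_name
encrypt_key = no
prompt = no
string_mask = nombstr
req_extensions = v3_req
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = digitalSignature, keyEncipherment, dataEncipherment
extendedKeyUsage = serverAuth, clientAuth
subjectAltName = DNS:$short_name, IP:$ip, DNS:$fqdn, DNS:$ip
[ req_distinguished_name ]
countryName = $country
stateOrProvinceName = $state
localityName = $city
0.organizationName = $company
organizationalUnitName = vCenterSSO
commonName = $fqdn
""")


def render_sso_cfg(short_name, fqdn, ip, country, state, city, company):
    """Return the sso.cfg text for the virtual SSO name and IP (hypothetical helper)."""
    return SSO_CFG.substitute(
        short_name=short_name, fqdn=fqdn, ip=ip,
        country=country, state=state, city=city, company=company,
    )


# Example values are made up; use your virtual IP and its DNS record.
print(render_sso_cfg("vcsso", "vcsso.domain.local", "192.0.2.10",
                     "US", "State", "City", "Example Co"))
```

Keeping the short name, FQDN and IP as single parameters makes it harder to end up with a commonName that doesn't match the subjectAltName entries.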
At a command prompt, navigate to the folder containing openssl.exe (eg: C:\ssltool\tools\openssl). Run this command to create the key and certificate signing request (CSR):
Follow the steps in KB2062108 to create the appropriate certificate template in your Active Directory Certificate Authority.
Log into the Microsoft CA certificate authority Web interface. By default, it is http://servername/CertSrv/.
Click the Request a certificate link. Click advanced certificate request.
Click the Submit a certificate request by using a base-64-encoded CMC or PKCS #10 file, or submit a renewal request by using a base-64-encoded PKCS #7 file link.
Open the certificate request (rui.csr) in notepad. Copy the content between -----BEGIN CERTIFICATE REQUEST----- and -----END CERTIFICATE REQUEST-----
Paste the copied content into the “Base-64-encoded certificate request” textarea. Select VMware Certificate as the Certificate Template. See KB2062108 if you don’t have the “VMware Certificate” template
Click Submit to submit the request.
Click Base 64 encoded on the Certificate issued screen. Click the Download Certificate Chain link.
Save the package as C:\certs\certnew.p7b.
Double-click the p7b to open it in certmgr. Navigate to Certificates – Current User\C:\Certs\Certnew.p7b\Certificates.
Certs in P7b
You’ll see two certificates here (unless you have intermediate certificates, then you’ll have more).
Right-click the one for the SSO server, choose All Tasks|Export. Save the file as Base-64 encoded X.509 (.CER) to c:\certs\sso\rui.crt
Right-click the one for the root CA server, choose All Tasks|Export. Save the file as Base-64 encoded X.509 (.CER) to c:\certs\root.cer. Close certmgr.
Note: The certificate store password must be changeme and the key alias must be ssoserver. Do not change these parameters.
Install and Configure the Certificate
While logged on to the first SSO server (vcsso1) as an administrator, make sure this folder exists: C:\Program Files\Common Files\VMware vCenter Server - Java Components. If it doesn’t, you’ll need to check your SSO installation.
Open an elevated command prompt (as administrator) and enter the following:
SET JAVA_HOME=C:\Program Files\Common Files\VMware vCenter Server - Java Components
SET PATH=%PATH%;C:\Program Files\VMware\Infrastructure\VMware\CIS\vmware-sso;%JAVA_HOME%\bin
In the command prompt, cd to the folder containing openssl.exe (eg: C:\ssltool\tools\openssl)
Generate a subject hash from the certificate using this command:
This will return an 8-character hash. Record it; we’ll need it later.
On both SSO servers, create the folder C:\ProgramData\VMware\SSL
On both SSO servers, copy c:\certs\root.cer to C:\ProgramData\VMware\SSL renaming it to ca_certificates.crt
On both SSO servers, copy c:\certs\root.cer to C:\ProgramData\VMware\SSL again, this time renaming it to <subjecthash>.0 (replacing <subjecthash> with your hash value from above and appending dot zero)
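The three "on both SSO servers" steps above are easy to fat-finger, so here is a small Python sketch of them, assuming Python is available on the servers. `stage_root_cert` is a hypothetical helper of mine; it simply creates the folder and makes the two copies, and it refuses a hash that isn't 8 hex characters.

```python
import re
import shutil
from pathlib import Path


def stage_root_cert(root_cer, ssl_dir, subject_hash):
    """Copy the root cert into ssl_dir as ca_certificates.crt and <subjecthash>.0."""
    # The subject hash recorded earlier is 8 hex characters; reject anything else.
    if not re.fullmatch(r"[0-9a-fA-F]{8}", subject_hash):
        raise ValueError(f"unexpected subject hash: {subject_hash!r}")
    ssl_dir = Path(ssl_dir)
    ssl_dir.mkdir(parents=True, exist_ok=True)
    targets = [ssl_dir / "ca_certificates.crt", ssl_dir / f"{subject_hash}.0"]
    for target in targets:
        shutil.copyfile(root_cer, target)
    return targets
```

On each server this would be called along the lines of `stage_root_cert(r"c:\certs\root.cer", r"C:\ProgramData\VMware\SSL", "<subjecthash>")`, with your own hash value in place of the placeholder.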
Just on the first SSO server, paste the following into a text file named c:\certs\gc.properties. Replace the placeholder values (the server FQDN in uri and, if yours differs, the root certificate path in ssl) with appropriate values.
[service]
friendlyName=The group check interface of the SSO server
version=1.5
ownerId=
productId=product:sso
type=urn:sso:groupcheck
description=The group check interface of the SSO server
[endpoint0]
uri=https://SSOserver.domain.com:7444/sso-adminserver/sdk/vsphere.local
ssl=c:\certs\Root64.cer
protocol=vmomi
Paste the following into a text file named c:\certs\admin.properties. Replace the placeholder values (the server FQDN in uri and, if yours differs, the root certificate path in ssl) with appropriate values.
[service]
friendlyName=The administrative interface of the SSO server
version=1.5
ownerId=
productId=product:sso
type=urn:sso:admin
description=The administrative interface of the SSO server
[endpoint0]
uri=https://SSOserver.domain.com:7444/sso-adminserver/sdk/vsphere.local
ssl=c:\certs\Root64.cer
protocol=vmomi
Paste the following into a text file named c:\certs\sts.properties. Replace the placeholder values (the server FQDN in uri and, if yours differs, the root certificate path in ssl) with appropriate values.
[service]
friendlyName=STS for Single Sign On
version=1.5
ownerId=
productId=product:sso
type=urn:sso:sts
description=The Security Token Service of the Single Sign On server.
[endpoint0]
uri=https://SSOserver.domain.com:7444/sts/STSService/vsphere.local
ssl=c:\certs\Root64.cer
protocol=wsTrust
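The three properties files differ only in a few fields, so they can be generated from one template. This is a rough Python sketch under that assumption; `render_properties` and the `SERVICES` table are names I made up, and the field values are copied from the three listings above.

```python
from string import Template

PROPS = Template("""\
[service]
friendlyName=$friendly
version=1.5
ownerId=
productId=product:sso
type=$svc_type
description=$description
[endpoint0]
uri=https://$fqdn:7444$path
ssl=$root_cert
protocol=$protocol
""")

# Per-file differences, taken from the gc, admin and sts listings above.
SERVICES = {
    "gc": dict(friendly="The group check interface of the SSO server",
               svc_type="urn:sso:groupcheck",
               description="The group check interface of the SSO server",
               path="/sso-adminserver/sdk/vsphere.local", protocol="vmomi"),
    "admin": dict(friendly="The administrative interface of the SSO server",
                  svc_type="urn:sso:admin",
                  description="The administrative interface of the SSO server",
                  path="/sso-adminserver/sdk/vsphere.local", protocol="vmomi"),
    "sts": dict(friendly="STS for Single Sign On",
                svc_type="urn:sso:sts",
                description="The Security Token Service of the Single Sign On server.",
                path="/sts/STSService/vsphere.local", protocol="wsTrust"),
}


def render_properties(name, sso_fqdn, root_cert):
    """Return the text of <name>.properties for the given FQDN and root cert path."""
    return PROPS.substitute(fqdn=sso_fqdn, root_cert=root_cert, **SERVICES[name])
```

For example, `render_properties("sts", "SSOserver.domain.com", r"c:\certs\Root64.cer")` reproduces the sts.properties listing; swap in your own FQDN and certificate path.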
Next, we need the service ID for each of the three services SSO uses. To get these, run the following command, replacing the red text with the FQDN to your first SSO server:
Stop and restart the VMware Secure Token Service on both servers
Preparing to load-balance
Navigate to https://firstssoserver.domain.local:7444/sts/STSService/vsphere.local. Notice that the certificate gives an error, but look at the cert. The certificate should return the “common” name (in my case, “vcsso” instead of “vcsso1”). Repeat this for the second and subsequent SSO servers, verifying that they provide the same certificate.
Using vCNS Manager, locate the appropriate edge gateway, click Actions|Manage to open it for editing
On the Configure Tab, edit the interface that will listen on the virtual IP
Edit the Subnet and add the Virtual IP. It’s probably not the primary IP. Save and publish those changes
Add the virtual IP to the Edge Gateway
On the Load Balancer tab, on the Pools page, click “Enable”, then “Publish Changes”
Click the green plus to add a load-balancing pool
Enter a recognizable Name and Description, click “Next”
On the Services step, check HTTPS, set Balancing Method to “ROUND_ROBIN” and the Port to 7444. Click “Next”
On the Health Check step, set it as shown. Click “Next” when done.
On the members step, click the green plus to add the IP addresses of your SSO servers to the pool. I suggest keeping the weight for each at 1, unless you have a reason to send more requests to specific nodes. Keep the HTTPS port and Monitor Port at 7444 for each. Click “Next” once all your members are added.
Review the Ready to complete step and click “Finish” if it is all correct
Click the Publish Changes button before proceeding
Click the “Virtual Servers” link, then the green plus to add a Virtual Server
Enter a meaningful name and description, provide the Virtual IP address that you added to the edge earlier, select the Pool created in the steps above and Enable HTTPS on port 7444. Set the Persistence Method to SSL_SESSION_ID and make sure the “Enabled” box is checked. Click “Add” then “Publish Changes”
Test by navigating to https://ssovirtual.domain.local:7444/lookupservice/sdk and https://ssovirtual.domain.local:7444/sts/STSService/vsphere.local, verifying that the certificates match.
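Eyeballing certificates in a browser works, but comparing fingerprints is less error-prone. Here is a minimal Python sketch, assuming a workstation that can reach port 7444 on each node and on the virtual IP. The helper names are mine; `ssl.get_server_certificate` fetches the certificate without validating it, which is what we want here since we only care whether the servers present the same one.

```python
import hashlib
import ssl


def fetch_pem(host, port=7444):
    """Fetch the certificate a server presents, as PEM, without validating it."""
    return ssl.get_server_certificate((host, port))


def fingerprint(pem):
    """SHA-256 hex digest of the DER form of a PEM certificate."""
    return hashlib.sha256(ssl.PEM_cert_to_DER_cert(pem)).hexdigest()


def same_certificate(pems):
    """True when every PEM in the iterable is the same certificate."""
    return len({fingerprint(p) for p in pems}) == 1
```

Usage would look something like `same_certificate(fetch_pem(h) for h in ("vcsso1.domain.local", "vcsso2.domain.local", "ssovirtual.domain.local"))`, with your own host names. A current Python build may refuse to negotiate with an older server's TLS settings, so treat a connection error as inconclusive rather than as a mismatch.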
YAY, load-balanced SSO with matching SSL certs!
One more thing….
Using your favorite web browser, navigate to https://ssovirtual.domain.local:7444/websso/SAML2/Metadata/vsphere.local. You’ll be prompted to download and save an XML file named vsphere.download. Now open the XML file in notepad or Notepad++. First, make sure you received a readable XML file. Second, notice that the EntitiesDescriptor/EntityDescriptor entityID property is server-specific. We’ll need both servers to respond with the same information.
<EntitiesDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata" xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion" xmlns:vmes="http://vmware.com/schemas/attr-names/2012/04/Extensions" Name="vsphere.local" validUntil="2014-08-12T23:54:04Z">
<Extensions>
<vmes:ExportedOn>2014-08-11T23:54:04Z</vmes:ExportedOn>
<vmes:ExportedBy>Exported by VMware Identity Server (c) 2012</vmes:ExportedBy>
</Extensions>
<EntityDescriptor entityID="https://VCSSO1.domain.local:7444/websso/SAML2/Metadata/vsphere.local">
<IDPSSODescriptor WantAuthnRequestsSigned="false" protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
<KeyDescriptor xmlns:ds="http://www.w3.org/2000/09/xmldsig#" use="signing">
<ds:KeyInfo>
<snip...>
Warning: This is not in a VMware KB, and may not be the best way to do it. Having the value in the EntitiesDescriptor/EntityDescriptor entityID property match the FQDN is going to be very important in the near future. Trust me.
On each server, open C:\ProgramData\VMware\CIS\cfg\vmware-sso\hostname.txt. It only contains the resolved hostname, so update it to the virtual hostname (vcsso.ragaazzi.lab in my case) and save the file.
Retrieve the XML file from https://ssovirtual.domain.local:7444/websso/SAML2/Metadata/vsphere.local again, open it and confirm that it contains the virtual hostname.
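To confirm the change without reading the XML by eye, a short Python sketch can pull the host out of each entityID and compare it with the virtual hostname. The function names are my own; the namespace is the one shown in the metadata excerpt above, and the comparison is case-insensitive because `urlparse` lower-cases the host.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"


def entity_hosts(metadata_xml):
    """Return the hostname found in each EntityDescriptor entityID."""
    root = ET.fromstring(metadata_xml)
    return [urlparse(e.get("entityID", "")).hostname
            for e in root.iter(f"{{{MD_NS}}}EntityDescriptor")]


def uses_virtual_host(metadata_xml, virtual_fqdn):
    """True when every entityID points at the virtual hostname."""
    hosts = entity_hosts(metadata_xml)
    return bool(hosts) and all(h == virtual_fqdn.lower() for h in hosts)
```

Feed it the contents of the downloaded file, for example `uses_virtual_host(open("vsphere.download").read(), "vcsso.domain.local")`, once per retrieval; it should come back False before the hostname.txt edit and True afterwards.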
Conclusion
This was such a lengthy post that I considered breaking it up, but there was no good break-point. Thanks for sticking with it. This is mostly for my own benefit, but hopefully you’ll find it helpful too.