Adding trusted certs to nodes on TKGS 7.0 U2

A new feature added to TKGS as of 7.0 Update 2 is support for adding private SSL certificates to the “trust” on TKG cluster nodes.

This is very important as it finally provides a supported mechanism to use on-premises Harbor and other image registries.

It’s done by adding the encoded CAs to the “TkgServiceConfiguration”. The template for the TkgServiceConfiguration looks like this:

apiVersion: run.tanzu.vmware.com/v1alpha1
kind: TkgServiceConfiguration
metadata:
  name: tkg-service-configuration
spec:
  defaultCNI: antrea
  proxy:
    httpProxy: http://<user>:<pwd>@<ip>:<port>

  trust:
    additionalTrustedCAs:
      - name: first-cert-name
        data: base64-encoded string of a PEM encoded public cert 1
      - name: second-cert-name
        data: base64-encoded string of a PEM encoded public cert 2

Notice that there are two new sections under spec; one for proxy and one for trust. This article is going to focus on trust for additional CAs.

If your registry uses a self-signed cert, you’ll just encode that cert itself. If you take advantage of an Enterprise CA or similar to sign your certs, you’d encode and import the “signing”, “intermediate” and/or “root” CA.

Example

Let’s add the certificate for a standalone Harbor (not the built-in Harbor instance in TKGS; its certificate is already trusted).

Download the certificate by clicking the “Registry Certificate” link

Run base64 -i <ca file> to return the base64 encoded content:

Provide a simple name and copy and paste the encoded cert into the data value:
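
For illustration, a minimal sketch of what the trust block might look like once filled in (the name “harbor-ca” and the truncated data value are placeholders, not literal values):

spec:
  trust:
    additionalTrustedCAs:
      - name: harbor-ca
        data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0t...   # single-line output of base64 -i <ca file>, truncated here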

Apply the TkgServiceConfiguration

After setting up your file, apply it to the Supervisor cluster:

kubectl apply -f ./TanzuServiceConfiguration.yaml

Notes

  • Existing TKG clusters will not automatically inherit the trust for the certificates
  • Clusters created after the TKGServiceConfiguration is applied will get the certificates
  • You can scale an existing TKG cluster to trigger a rollout with the certificates
  • You can verify the certificates exist by connecting through SSH to the nodes and locating the certs under /etc/ssl/certs:
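    For example, a rough check from a node (the node IP below is a placeholder; on TKGS clusters the node SSH user is typically vmware-system-user, and the exact cert file name may vary):

    ssh vmware-system-user@<node-ip>
    ls /etc/ssl/certs | grep -i first-cert-name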

Using Helm and Dynamic PersistentVolumes with Multi-AZ PKS on vSphere

So, you’ve installed PKS and created a PKS cluster.  Excellent!  Now what?

We want to use helm charts to deploy applications.  Many of the charts use PersistentVolumes, so getting PVs set up is our first step.

There are a couple of complicating factors to be aware of when it comes to PVs in a multi-AZ/multi-vSphere-Cluster environment.  First, you probably have cluster-specific datastores – particularly if you are using Pivotal Ready Architecture and VSAN.  These datastores are not suitable for PersistentVolumes consumed by applications deployed to our Kubernetes cluster.  To work around this, we’ll need to provide some shared block storage to each host in each cluster.  Probably the simplest way to do this is with an NFS share.

Prerequisites:

Common datastore; NFS share or iSCSI

In production, you’ll want a production-quality fault-tolerant solution for NFS or iSCSI, like Dell EMC Isilon. For this proof-of-concept, I’m going to use an existing NFS server, create a volume and share it to the hosts in the three vSphere clusters where the PKS workload VMs will run.  In this case, the NFS datastore is named “sharednfs” ’cause I’m creative like that.  Make sure that your hosts have adequate permissions to the share.  Using VMFS on iSCSI is supported, just be aware that you may need to cable-up additional NICs if yours are already consumed by N-VDS and/or VSAN.

Workstation Prep

We’ll need a handful of command-line tools, so make sure your workstation has the PKS CLI and Kubectl CLI from Pivotal and you’ve downloaded and extracted Helm.

PKS Cluster
We’ll want to provision a cluster using the PKS CLI tool.  This document assumes that your cluster was provisioned successfully, but nothing else has been done to it.  For my environment, I configured the “medium” plan to use 3 Masters and 3 Workers in all three AZs, then created the cluster with the command

pks create-cluster pks1cl1 --external-hostname cl1.pks1.lab13.myenv.lab --plan "medium" --num-nodes "3"


Logged-in
Make sure you’re logged into the Kubernetes cluster. In PKS, the easiest way to do this is via the PKS cli:

pks login -a api.pks1.lab13.myenv.lab -u pksadmin -p my_password --skip-ssl-validation
pks cluster pks1cl1
pks get-credentials pks1cl1
kubectl config use-context pks1cl1
kubectl get nodes -o wide

Where “pks1cl1” is replaced by your cluster’s name, “api.pks1.lab13.myenv.lab” is replaced by the FQDN of your PKS API server, “pksadmin” is replaced by the username with admin rights to PKS and “my_password” is replaced with that account’s password.

Procedure:

  1. Create storageclass
    • Create storageclass spec yaml. Note that the file is named storageclass-nfs.yml and we’re naming the storage class itself “nfs”:
      kind: StorageClass
      apiVersion: storage.k8s.io/v1
      metadata:
        name: nfs
        annotations:
          storageclass.kubernetes.io/is-default-class: "true"
      provisioner: kubernetes.io/vsphere-volume
      parameters:
        diskformat: thin
        datastore: sharednfs
        fstype: ext3
      

    • Apply the yml with kubectl

      kubectl create -f storageclass-nfs.yml

    • Create a sample PVC (Persistent Volume Claim). Note that the file is named pvc-sample.yml, the PVC name is “pvc-sample” and it uses the “nfs” storageclass we created above. This step is not absolutely necessary, but will help confirm we can use the storage.
      kind: PersistentVolumeClaim
      apiVersion: v1
      metadata:
        name: pvc-sample
        annotations:
          volume.beta.kubernetes.io/storage-class: nfs
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
        storageClassName: nfs
      
    • Apply the yml with kubectl

      kubectl create -f pvc-sample.yml


      If you’re watching vSphere closely, you’ll see a VMDK created in the kubevols folder of the NFS datastore

    • Check that the PVC was created with

      kubectl get pvc

      and

      kubectl describe pvc pvc-sample

    • Remove sample PVC with

      kubectl delete -f pvc-sample.yml

  2. Configure Helm and Tiller
    • Create Service Account for tiller with
      apiVersion: v1
      kind: ServiceAccount
      metadata:
        name: tiller
        namespace: kube-system
      ---
      apiVersion: rbac.authorization.k8s.io/v1beta1
      kind: ClusterRoleBinding
      metadata:
        name: tiller
      roleRef:
        apiGroup: rbac.authorization.k8s.io
        kind: ClusterRole
        name: cluster-admin
      subjects:
        - kind: ServiceAccount
          name: tiller
          namespace: kube-system
      
    • Apply the service account yml with Kubectl

      kubectl create -f rbac-config.yml

    • Initialize helm and tiller with

      helm init --service-account tiller

    • Check that tiller is ready

      helm version


      Look for version numbers for both the client and the server; note that it might take a few seconds for tiller in the cluster to become ready.

  3. Deploy sample helm chart
    • Update helm local chart repository. We do this so that we can be sure that helm can reach the public repo and to cache the latest information in our local repo.

      helm repo update


      If this step results in a certificate error, you may have to add the cert to the trusted certificates on the workstation.

    • Install helm chart with ingress enabled. Here, I’ve selected the Dokuwiki app. The command below will enable ingress, so we can access it via routable IP and it will use the default storageclass we configured earlier.

      helm install --name dokuwiki \
      --set ingress.enabled="true",dokuwikiUsername=admin,dokuwikiPassword=password \
      stable/dokuwiki

      Edit – April 23 2019 – Passing the credentials in here makes connecting easier later.

    • Confirm that the app was deployed
      helm list
      kubectl get pods -n default
      kubectl get services -n default


      From the get services results, make a note of the external IP address – in the example above, it’s 192.13.6.73

    • Point a browser at the external address from the previous step and marvel at your success in deploying Dokuwiki via helm to Kubernetes!
      If you want to actually login to your Dokuwiki instance, first obtain the password for the user account with this command:

      kubectl get secret -n default dokuwiki-dokuwiki \
      -o jsonpath="{.data.dokuwiki-password}" | base64 --decode

      Then login with username “user” and that password.

       

      Edit – 04/23/19 – Login with the username and password you included in the helm install command

  4. Additional info
    • View Persistent Volume Claims with

      kubectl get pvc -n default


      This will list the PVCs and the volumes in the “default” namespace. Note the volume corresponds to the name of the VMDK on the datastore.

    • Load-Balancer
      Notice that since we are leveraging the NSX-T Container Networking Interface and enabled the ingress when we installed dokuwiki, a load-balancer in NSX-T was automatically created for us to point to the application.
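
      If you want to see that object from the Kubernetes side, the ingress created by the chart will show the address the NSX-T load-balancer is answering on (a quick check, not a required step):

      kubectl get ingress -n default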

This took me some time to figure out; I had to weed through a lot of documentation (some of which contradicted itself) and quite a bit of trial-and-error. I hope this helps save someone time later!

Removing NSX-T VIBs from ESXi hosts

I’d wanted to revert my environment from (an incomplete install of) NSX-T v2.0 back to NSX for vSphere v6.3.x, but found that the hosts would not complete preparation.  The logs indicated that something was “claimed by multiple non-overlay vibs”.

Error in esxupdate.log

I found that the hosts still had the NSX-T VIBs loaded, so to remove them, here’s what I did:

  1. Put the host in maintenance mode.  This is necessary to “de-activate” the VIBs that may be in use
  2. Login to the host via SSH
  3. Run

    /etc/init.d/netcpad stop
    /etc/init.d/nsx-ctxteng stop remove
    /etc/init.d/nsx-da stop remove
    /etc/init.d/nsx-datapath stop remove
    /etc/init.d/nsx-exporter stop remove
    /etc/init.d/nsx-hyperbus stop remove
    /etc/init.d/nsx-lldp stop remove
    /etc/init.d/nsx-mpa stop remove
    /etc/init.d/nsx-nestdb stop remove
    /etc/init.d/nsx-platform-client stop remove
    /etc/init.d/nsx-sfhc stop remove
    /etc/init.d/nsx-support-bundle-client stop remove
    /etc/init.d/nsxa stop remove
    /etc/init.d/nsxcli stop remove

  4. Run this all in one line; note that the order of the vibs is important

    esxcli software vib remove -n nsx-ctxteng -n nsx-hyperbus -n nsx-platform-client -n nsx-nestdb -n nsx-aggservice -n nsx-da -n nsx-esx-datapath -n nsx-exporter -n nsx-host -n nsx-lldp -n nsx-mpa -n nsx-netcpa -n nsx-python-protobuf -n nsx-sfhc -n nsx-support-bundle-client -n nsxa -n nsxcli -n nsx-common-libs -n nsx-metrics-libs -n nsx-nestdb-libs -n nsx-rpc-libs -n nsx-shared-libs -n nsx-python-gevent -n nsx-python-greenlet

  5. Reboot the host.
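
After the reboot, you can confirm that no NSX VIBs remain with a quick check (not part of any official removal procedure):

    esxcli software vib list | grep -i nsx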

A downside to VVols

I picked up a Dell Equallogic PS6000 for my homelab.  Updated it to the latest firmware and discovered it’s capable of VVols.  Yay!  I created a container and (eventually) migrated nearly everything to it.  Seriously, every VM except  Avamar VE.  Started creating and destroying VMs; DRS is happily moving VMs among the hosts.

UNTIL (dun dun dun)

The Equallogic VSM, running the VASA storage provider gets stuck during a vMotion.  Hmm, I notice that all of the powered-off VMs now have a status of “inaccessible”.  On the hosts, the VVol “datastore” is inaccessible.  

Ok, that’s bad.  Thank goodness for Cormac Hogan’s post about this issue.  It boils down to a chicken-and-egg problem.  vCenter relies on the VASA provider to supply information about the VVol.  If the VASA provider resides on the VVols, there’s no apparent way to recover it.  There’s no datastore to find the vmx and re-register, the connections to the VVols are based on the VM, so if it’s not running, there’s no connection to it.

To resolve, I had to create a new instance of the Equallogic VSM, re-register it with vCenter, re-register it as a VASA provider and add the Equallogic group.  Thankfully, the array itself is the source-of-truth for the VVol configuration, so the new VSM picked it up seamlessly.

So your options are apparently to place the VSM/VASA provider on a non-VVol or build a new one every time it shuts down.  Not cool.

 

Configuring vCenter Orchestrator Appliance for High Availability

I don't get it either
Dunes? Dunes.

UPDATED 09/07/14

This is the third post in my series for building a fully distributed vCloud Automation Center deployment. In this post, we’ll configure vCenter Orchestrator (vCO) for High Availability using two nodes and a vCloud Networking and Security Edge Gateway as a Load Balancer.  I’ll use the vCenter Orchestrator Appliance v5.5.1.0.1617225.  I want to ensure that both vCO nodes return the same, organizationally-trusted SSL certificate, so we’ll configure that too.

Prerequisites

  • Database Server (ideally, it should be configured for high availability – I’ll be using a Microsoft SQL Server 2012 Failover Cluster)
  • Database for vCO
  • Credentials for database
  • Reserve IP addresses for two nodes and virtual IP
  • DNS records for both nodes and virtual IP (I’m using vcvco1 and vcvco2 for the appliance nodes and vcvco as the virtual)
  • Appropriate Identity Sources added to SSO
  • A vCO administrators security group with appropriate members
  • An Active Directory integrated Certificate Authority

Notes

In the steps below, text in red is not meant to be typed verbatim.  You’ll replace the value with something relevant to your environment.

Configure database settings (MSSQL)

To ensure that multiple Orchestrator nodes can use the database without clashing, you’ll need to enable a couple of optional settings.

This can be done through script:

ALTER DATABASE [vcvCO] SET ALLOW_SNAPSHOT_ISOLATION ON;
GO
ALTER DATABASE [vcvCO] SET READ_COMMITTED_SNAPSHOT ON;
GO

Or through the SSMS GUI:

Enable Miscellaneous options for the vCO database
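
If you want to confirm the settings took effect, a quick query against sys.databases will show both values (assuming the database is named vcvCO as above):

SELECT name, snapshot_isolation_state_desc, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = 'vcvCO';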

Deploy and configure the First Orchestrator Appliance

  1. Using the vSphere or vSphere Web Client, deploy the appliance from OVF to an available HA cluster.  I named mine vcvco1.
  2. Adjust the resources if necessary and power on vcvco1.
  3. Browse to https://vcvco1:5480, logon as root
  4. Set the timezone, confirm the network settings and hostname.  I set the hostname to vcvco, the cluster name. Log out of the VAMI.
  5. Browse to https://vcvco1:8283, logon as vmware
  6. Navigate to the Network section
  7. (Optional) on the Network tab, set the IP address to the actual address.  Leave the port numbers at default
  8. On the SSL Trust Manager tab, type the URL to your SSO server (eg:  https://vcsso.domain.local:7444) and click the Import button.  Verify that the certificate information is correct and click Import to add it to the trust.  Repeat this for your vCenter Server(s).
  9. In the Authentication Section, you can choose LDAP or SSO.  I’m going to configure it for SSO.  Enter your SSO hostname (eg: vcsso.domain.local).  Click the Advanced Settings link to see and verify that the Token service and Admin service URLs are fully populated with the correct port number (7444).  Enter the user name and password for an SSO administrator (eg: administrator@vsphere.local) in the appropriate boxes.  Click the Register Orchestrator button. Wait for it….
    Registered with SSO, but not configured
  10. After the registration is confirmed, select the correct group in the vCO Admin – domain and group dropdown list. Then, click the Accept Orchestrator Configuration button.
  11. In the Database section, enter your database type and connection details; again, I’m using SQL Server, but you’d select what’s appropriate for your environment.
  12. After the connection is made, click the link to Create the database tables, then Apply Changes.
  13. On the Licenses section, enter the host name of the vCenter Server and credentials, then click Apply Changes.
  14. Install any plugins you need (vCAC, ViPR, Powershell, etc) and restart the service to complete the plugin installation.

Create Package Signing Certificate

  1. On the Server Certificate section, click the “Create a certificate database and self-signed server certificate” link.  Enter vcvco.domain.local – that’s the load-balanced name, not the actual hostname – for the Common Name, set the organization, ou and country, then click Create.
  2. Still in the Server Certificate section, click “Export a certificate signing request”.  Save the vCO_SigningRequest.csr file to your system.
  3. Log into the Microsoft CA certificate authority Web interface. By default, it is http://servername/CertSrv/.
  4. Click the Request a certificate link. Click advanced certificate request.
  5. Click the Submit a certificate request by using a base-64-encoded CMC or PKCS #10 file, or submit a renewal request by using a base-64-encoded PKCS #7 file link.
  6. Open the certificate request (vCO_SigningRequest.csr) in notepad. Copy the content between —–BEGIN CERTIFICATE REQUEST—– and —–END CERTIFICATE REQUEST—–
  7. Paste the copied content into the “Base-64-encoded certificate request” textarea. Select Web Server as the Certificate Template.
  8. Click Submit to submit the request.
  9. Click Base 64 encoded on the Certificate issued screen. Click the Download Certificate Chain link.
  10. Save the package as C:\certs\certnew.p7b.
  11. Double-click the p7b to open it in certmgr.  Navigate to Certificates – Current User\C:\Certs\Certnew.p7b\Certificates.
    Certs in P7b
  12. You’ll see two certificates here (unless you have intermediate certificates, then you’ll have more).
  13.  Right-click the one for the vCO server, choose All Tasks|Export.  Save the file as Base-64 encoded X.509 (.CER) as vco.crt
  14. Right-click the one for root CA server, choose All Tasks|Export.  Save the file as Base-64 encoded X.509 (.CER) as root.cer.  Close certmgr.
  15. Before vCO will accept the CA-signed certificate, we have to import the root certificate.  Launch the Orchestrator Client.  You can use https://vcvco1.domain.local:8281/vco/client/client.jnlp
  16. Login to the vCO client as a member of the vCO Admins group
    Login to vCO Client
  17. In the client, launch Certificate Manager from Tools|Certificate Manager.
  18. Under Known Certificates, click the “Import Certificate” button.  Browse to and select root.cer that you saved earlier.  Verify that the certificate details are correct and click the “Import Certificate” button to finish. Close or minimize the vCO Client.
  19. Back on the Server Certificate section of the vCO configuration, click “Import a certificate signing request signed by a certificate authority”.  Select the vco.crt file you saved and click import.  If you get an error here, make sure you’ve imported the correct root (and any intermediate) cert into vCO.

Replace vCO Client certificate

Now, if you navigate to https://vcvco1.domain.local:8281/vco, you’ll see that the certificate is still untrusted.  Let’s fix that.  The certificate and key are stored with a specific alias and password; we’re going to replace them, but reuse the alias and password.

  1. SSH into vcvco1 as root
  2. Navigate to /etc/vco/app-server/security and make a copy of the jssecacerts keystore file

    cd /etc/vco/app-server/security
    cp ./jssecacerts ./jssecacerts.backup

  3. Use keytool to delete the item with the “dunes” alias. The keystore password is “dunesdunes”

    keytool -keystore ./jssecacerts -delete -alias dunes -storepass dunesdunes

    Delete "dunes" alias
  4. Use keytool to create a CSR. The certreq alias must be “dunes”.  Export the CSR to a file named vcvcoreq.csr

    keytool -keystore ./jssecacerts -storepass dunesdunes -certreq -alias dunes -file vcvcoreq.csr

    Create CSR named vcvco.csr
  5. Use filezilla or SFTP again to retrieve the csr
  6. Just like we did for the package signing certificate, submit a new request to your CA.
  7. This time, just download the certificate (not the certificate chain) in DER format instead of base64.  Save the file as vcoDER.cer.
  8. Use filezilla or SFTP to copy vcoDER.cer to /etc/vco/app-server/security on vcvco1.  (you can actually place it anywhere, but this makes sense)
  9. Using keytool again, import the CA-signed cert into the keystore. The password is still “dunesdunes”.

    keytool -keystore ./jssecacerts -storepass dunesdunes -importcert -alias dunes -keypass dunesdunes -file ./vcoDER.cer

    Import the cert
  10. Restart the vCO services

    service vco-server restart
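
To confirm the new certificate is being served, you can check from any machine with openssl (a quick sanity check, not an official step; swap in your own hostname):

    openssl s_client -connect vcvco1.domain.local:8281 </dev/null 2>/dev/null | openssl x509 -noout -subject -issuer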

Prepare Second Orchestrator Appliance

  1.  Shut down the first vCO appliance (vcvco1) to be safe
  2. Clone vcvco1 to a new VM named vcvco2, be sure to update the hostname and IP address in the vApp Properties. (Although it doesn’t affect the guest OS in this case)
  3. The cloned VM will retain the original IP address and hostname, so browse to https://vcvco1:5480, logon as root and set the correct IP address and hostname.
  4. Once vcvco2 is on the correct IP address, you can power on vcvco1
  5. Browse to https://vcvco2:8283, logon as vmware.
  6. On the Network area, select the correct IP address and apply changes.

Configure the cluster

Cluster mode, both nodes up

  1. Browse to the vCO Configuration web interface, https://vcvco1:8283.  Logon as vmware.
  2. Under Server Availability, select Cluster mode
  3. Set the number of active nodes to 2, leave the heartbeat values at default unless you have a reason to change them. Click “Apply Changes”.  Note that there will be times when you’ll have to set the number of active nodes to 1.
  4. Under Startup Options, restart service.  This may not be necessary, but in my case, the nodes were not listed until after I restarted the vCO service.
  5. Repeat steps 1-4 on vcvco2

 

Preparing to load-balance
Note – this worked for me, YMMV

  1. Using vCNS Manager, locate the appropriate edge gateway, click Actions|Manage to open it for editing
  2. On the Configure Tab, edit the interface that will listen on the virtual IP
  3. Edit the Subnet and add the Virtual IP. It’s probably not the primary IP. Save and publish those changes.
    Add the virtual IP to the Edge Gateway
  4. On the Load Balancer tab, on the Pools page, click “Enable”, then “Publish Changes”
  5. Click the green plus to add a load-balancing pool
  6. Enter a recognizable Name and Description, click “Next”.
    Load Balancer Pool
  7. On the Services step, check HTTPS, set Balancing Method to “ROUND_ROBIN” and the Port to 8281. Click “Next”.
    Services (HTTPS:8281)
  8. On the Health Check step, set it as shown. Click “Next” when done.
    Health Check
  9. On the members step, click the green plus to add the IP address of your vCO servers to the pool. I suggest keeping the weight for each at 1 while both nodes are active.  There are times when you’ll want to make one node active though (details below).  Keep the HTTPS port and Monitor Port at 8281 for each. Click “Next” once all your members are added.
    vCO Members
  10. Review the Ready to complete step and click “Finish” if it is all correct
  11. Click the Publish Changes Button before proceeding
  12. Click the “Virtual Servers” link, then the green plus to add a Virtual Server
    vCO Virtual Server
  13. Enter a meaningful name and description, provide the Virtual IP address that you added to the edge earlier, select the Pool created in the steps above and Enable HTTPS on port 8281. Set the Persistence Method to SSL_SESSION_ID and make sure the “Enabled” box is checked. Click “Add” then “Publish Changes”
  14. Test by navigating to https://vcvco.domain.local:8281/vco and verifying that the certificate matches.
  15. IMPORTANT UPDATE! – Repeat steps 7-14 above for TCP 8286 and 8287.  Without these undocumented ports, neither the vCO client nor the vCAC appliance will connect to the vCO cluster.

Additional steps
Put the two vCO nodes in a vApp, set them to start a few minutes apart to prevent both nodes from trying to initialize the database concurrently.

Use vApp to stagger the startup of the vCO nodes

 

Notes, Caveats and Warnings

When writing information to vCO, such as designing and importing new workflows, VMware requires that only one vCO node be active.  I suggest that before you connect vCAC to vCO, you take the following steps:

  1. Logon to vcvco1 configuration as vmware, set the number of active nodes under Server Availability to 1.  Apply changes.
  2. Logon to vcvco2 configuration as vmware, set the number of active nodes under Server Availability to 1.  Apply changes.
  3. Watch the Service Availability area and wait for it to indicate that one node is in standby. If you’re as impatient as I am, you can restart the service on vcvco2.  It should come up as standby.  Record which node is RUNNING.
  4. Logon to vCNS Manager, locate the appropriate Edge Gateway for the vcvco virtual server.
  5. Edit the Load Balancer pool, leave the RUNNING node with a weight of 1, set all other nodes’ weight to zero

Once the workflows have been created and edited and you want to resume distribution of vCO jobs among the nodes, just reverse these changes, setting the active nodes to 2 and the weights to 1 for both nodes.

Do not connect the vCO client to the virtual address.  In this case, only TCP 8281 is forwarded and the vCO client needs additional ports forwarded to the nodes.  Other load-balancers/NAT devices may not have this issue.

This post may get some edits as I work through the rest of the vCAC distributed build.

I still have no idea why the certificate alias and password are “dunes” and “dunesdunes”.  UPDATE – The company that was bought by VMware that originally developed the product that is now vCO was named “Dunes”.

References 

Work with vCO over SSL

VMKB2058674

vCO 5.5.1 release notes

Configuring Highly Available vCenter SSO with SSL certificates

*** UPDATE 12/18/14 ***
Instead of this blog, I strongly suggest you use and follow the
Configuring VMware vCenter SSO High Availability for VMware vRealize Automation Technical Whitepaper. It is somewhat more comprehensive and authoritative. For VMware documentation, it’s really good.
*** UPDATE ***

I love the simplicity of the vCenter Server Appliance and the VMware Identity Appliance for vCAC, but neither offers a high availability option better than vSphere HA. There are use cases where you’d need your SSO service to offer better uptime and resilience. In addition, there are some SSL certificates to be configured, and for that we’ll follow the instructions in KB2034833, KB2061934 and KB2034181.

Notes, caveats, warnings
AFAIK, this will only work with vSphere 5.5; v5.1 handles SSO differently. I’m only using two nodes; if you have more, there will be extra steps. I do not have intermediate CAs; if you do, consult the KBs for the additional steps. I’m going to use a vCloud Networking and Security Edge Gateway as my load-balancer.  It does not offer SSL offload like some other load-balancers do, so you may have to take extra steps to configure SSL offload.

Here’s what I have in mind; load-balanced SSO Servers

Prerequisites
Reserve the IP addresses for your actual SSO servers, plus the Virtual IP address.
Add A or CNAME records to your DNS for the SSO servers and the virtual IP.
The DNS name of the virtual IP is what the SSL certificate must match (vcsso in my case)
You should have an edge gateway already configured with an interface in the same networks as your virtual IP and actual SSO servers.

First SSO Server

I’m starting with two freshly deployed Windows Server 2008 R2 VMs, joined to the domain and named vcsso1 and vcsso2.  On vcsso1, install the Single Sign-on service.  Be sure the prerequisites are all ok.

vCenter SSO Prereqs

On the deployment mode step, choose “vCenter Single Sign-On for your first vCenter Server”.
First SSO Server

Next, Next, Finish your way through the installation. You’ve set up an SSO server, YAY!

Second SSO Server

On the second server, vcsso2, also install the SSO service.  We’re going to make a few different selections than we did on vcsso1 though.  On the deployment mode step, here we’re going to select “vCenter Single Sign-On for an additional vCenter Server in an existing site”.
Second SSO

 

Next, we’re prompted for information about the first, or partner, SSO server.
SSO Partner

 

We have to confirm that the information obtained from the first SSO server is correct, so click Continue.
Certificate Verification

Then we select the site name configured on the first SSO server.  I named mine “Lab”, but you can leave yours as “Default-First-Site” or whatever makes sense for your environment.
Select Site Name

 

From here, you’ll Next, Next, Finish your way to completion.

Generating the Cert
Prerequisites: Either the VMware ssl-certificate-updater-tool or OpenSSL Win32 v0.9.8

  1. Log on to the first SSO server (vcsso1), extract the VMware SSL certificate updater tool to C:\ssltool or similar.  Create folders named “C:\certs\sso“.  Open notepad and paste the following:

    [ req ]
    default_bits = 2048
    default_keyfile = rui.key
    distinguished_name = req_distinguished_name
    encrypt_key = no
    prompt = no
    string_mask = nombstr
    req_extensions = v3_req


    [ v3_req ]
    basicConstraints = CA:FALSE
    keyUsage = digitalSignature, keyEncipherment, dataEncipherment
    extendedKeyUsage = serverAuth, clientAuth
    subjectAltName = DNS:ServerShortName, IP:ServerIPAddress, DNS:server.domain.com, DNS:ServerIPAddress

    [ req_distinguished_name ]
    countryName = Country
    stateOrProvinceName = State
    localityName = City
    0.organizationName = Company Name
    organizationalUnitName = vCenterSSO
    commonName = server.domain.com
  2. Replace the values in red with those appropriate for your environment. Be sure to specify the server name and IP address as the Virtual IP and its associated DNS record. Save the file as c:\certs\sso\sso.cfg
  3. At a command prompt, navigate to the folder containing openssl.exe (eg: C:\ssltool\tools\openssl). Run this command to create the key and certificate site request (CSR):

    openssl req -new -nodes -out c:\certs\sso\rui.csr -keyout c:\certs\sso\rui-orig.key -config c:\certs\sso\sso.cfg

    Generate CSR
  4. In the same command prompt, run this to change the key to the necessary type.


    openssl rsa -in c:\certs\sso\rui-orig.key -out c:\certs\sso\rui.key

  5. Follow the steps in KB2062108 to create the appropriate certificate template in your Active Directory Certificate Authority.
  6. Log into the Microsoft CA certificate authority Web interface. By default, it is http://servername/CertSrv/.
  7. Click the Request a certificate link. Click advanced certificate request.
  8. Click the Submit a certificate request by using a base-64-encoded CMC or PKCS #10 file, or submit a renewal request by using a base-64-encoded PKCS #7 file link.
  9. Open the certificate request (rui.csr) in notepad. Copy the content between —–BEGIN CERTIFICATE REQUEST—– and —–END CERTIFICATE REQUEST—–
  10. Paste the copied content into the “Base-64-encoded certificate request” textarea. Select VMware Certificate as the Certificate Template. See KB2062108 if you don’t have the “VMware Certificate” template
  11. Click Submit to submit the request.
  12. Click Base 64 encoded on the Certificate issued screen. Click the Download Certificate Chain link.
  13. Save the package as C:\certs\certnew.p7b.
  14. Double-click the p7b to open it in certmgr.  Navigate to Certificates – Current User\C:\Certs\Certnew.p7b\Certificates.

    Certs in P7b
  15. You’ll see two certificates here (unless you have intermediate certificates, then you’ll have more).
  16.  Right-click the one for the SSO server, choose All Tasks|Export.  Save the file as Base-64 encoded X.509 (.CER) to c:\certs\sso\rui.crt
  17. Right-click the one for root CA server, choose All Tasks|Export.  Save the file as Base-64 encoded X.509 (.CER) to c:\certs\root.cer.  Close certmgr.
  18. Generate the pfx by running this command:

    openssl pkcs12 -export -in c:\certs\sso\rui.crt -inkey c:\certs\sso\rui.key -certfile c:\certs\root.cer -name "ssoserver" -passout pass:changeme -out c:\certs\sso\ssoserver.p12

    Note: The certificate store password must be changeme and the key alias must be ssoserver. Do not change these parameters.

Install and Configure the Certificate

  1. While logged on to the first SSO server (vcsso1) as an administrator, make sure this folder exists: C:\Program Files\Common Files\VMware vCenter Server - Java Components.  If it doesn’t, you’ll need to check your SSO installation.
  2. Open an elevated command prompt (as administrator) and enter the following

    SET JAVA_HOME=C:\Program Files\Common Files\VMware vCenter Server - Java Components
    SET PATH=%PATH%;C:\Program Files\VMware\Infrastructure\VMware\CIS\vmware-sso;%JAVA_HOME%\bin

  3. In the command prompt, cd to the folder containing openssl.exe (eg: C:\ssltool\tools\openssl)
  4. Generate a subject hash from the certificate using this command:

    openssl x509 -subject_hash -noout -in c:\certs\root.cer

    This will return an 8-character hash. Record it; we’ll need it later.

  5. On both SSO servers, create the folder C:\ProgramData\VMware\SSL
  6. On both SSO servers, copy c:\certs\root.cer to C:\ProgramData\VMware\SSL renaming it to ca_certificates.crt
  7. On both SSO servers, copy c:\certs\root.cer to C:\ProgramData\VMware\SSL again, this time renaming it to <subjecthash>.0 (replacing <subjecthash> with your hash value from above and appending dot zero)
  8. Just on the first SSO server, paste the following into a text file named c:\certs\gc.properties. Replace the red text with appropriate values.

    [service]
    friendlyName=The group check interface of the SSO server
    version=1.5
    ownerId=
    productId=product:sso
    type=urn:sso:groupcheck
    description=The group check interface of the SSO server


    [endpoint0]
    uri=https://SSOserver.domain.com:7444/sso-adminserver/sdk/vsphere.local
    ssl=c:\certs\root.cer
    protocol=vmomi

  9. Paste the following into a text file named c:\certs\admin.properties. Replace the red text with appropriate values.

    [service]
    friendlyName=The administrative interface of the SSO server
    version=1.5
    ownerId=
    productId=product:sso
    type=urn:sso:admin
    description=The administrative interface of the SSO server


    [endpoint0]
    uri=https://SSOserver.domain.com:7444/sso-adminserver/sdk/vsphere.local
    ssl=c:\certs\root.cer
    protocol=vmomi

  10. Paste the following into a text file named c:\certs\sts.properties. Replace the red text with appropriate values.

    [service]
    friendlyName=STS for Single Sign On
    version=1.5
    ownerId=
    productId=product:sso
    type=urn:sso:sts
    description=The Security Token Service of the Single Sign On server.


    [endpoint0]
    uri=https://SSOserver.domain.com:7444/sts/STSService/vsphere.local
    ssl=c:\certs\root.cer
    protocol=wsTrust

  11. Next, we need the service ID for each of the three services SSO uses. To get these, run the following command, replacing the red text with the FQDN to your first SSO server:

    ssolscli.cmd listServices https://vcsso1.domain.local:7444/lookupservice/sdk

    SSO Services
  12. The service ID for each service should be saved to a file. Use quickedit to copy the service id for each and echo it to a file:

    Echo service ID to files
  13. Update the Group Check service:

    ssolscli updateService -d https://ssoserver.domain.com:7444/lookupservice/sdk -u SSO_administrator -p password -si c:\certs\gc_id -ip c:\certs\gc.properties

  14. Update the Admin service:

    ssolscli updateService -d https://ssoserver.domain.com:7444/lookupservice/sdk -u SSO_administrator -p password -si c:\certs\admin_id -ip c:\certs\admin.properties

  15. Update the STS service:

    ssolscli updateService -d https://ssoserver.domain.com:7444/lookupservice/sdk -u SSO_administrator -p password -si c:\certs\sts_id -ip c:\certs\sts.properties

  16. Copy the new SSL files to C:\ProgramData\VMware\CIS\runtime\VMwareSTS\conf on both/all SSO servers:


    copy C:\certs\SSO\ssoserver.p12 C:\ProgramData\VMware\CIS\runtime\VMwareSTS\conf\ssoserver.p12
    copy C:\certs\Root.cer C:\ProgramData\VMware\CIS\runtime\VMwareSTS\conf\ssoserver.crt
    copy C:\certs\SSO\rui.key C:\ProgramData\VMware\CIS\runtime\VMwareSTS\conf\ssoserver.key

  17. Stop and restart the VMware Secure Token Service on both servers
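    From an elevated command prompt, that can be done like this (assuming the default service display name):

    net stop "VMware Secure Token Service"
    net start "VMware Secure Token Service"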

Preparing to load-balance

  1. Navigate to http://firstssoserver.domain.local:7444/sts/STSService/vsphere.local and notice that the certificate gives an error, but look at the cert. The certificate should return the “common” name (in my case, “vcsso” instead of “vcsso1”). Repeat this for the second and subsequent SSO servers, verifying that they provide the same certificate
  2. Using vCNS Manager, locate the appropriate edge gateway, click Actions|Manage to open it for editing
  3. On the Configure Tab, edit the interface that will listen on the virtual IP
  4. Edit the Subnet and add the Virtual IP. It’s probably not the primary IP. Save and publish those changes

    Add the virtual IP to the Edge Gateway
  5. On the Load Balancer tab, on the Pools page, click “Enable”, then “Publish Changes”
  6. Click the green plus to add a load-balancing pool
  7. Enter a recognizable Name and Description, click “Next”
  8. On the Services step, check HTTPS, set Balancing Method to “ROUND_ROBIN” and the Port to 7444. Click “Next”
  9. On the Health Check step, set it as shown. Click “Next” when done.
  10. On the members step, click the green plus to add the IP address of your SSO servers to the pool. I suggest keeping the weight for each at 1, unless you have a reason to send more requests to specific nodes. Keep the HTTPS port and Monitor Port at 7444 for each. Click “Next” once all your members are added.
  11. Review the Ready to complete step and click “Finish” if it is all correct
  12. Click the Publish Changes Button before proceeding
  13. Click the “Virtual Servers” link, then the green plus to add a Virtual Server
  14. Enter a meaningful name and description, provide the Virtual IP address that you added to the edge earlier, select the Pool created in the steps above and Enable HTTPS on port 7444. Set the Persistence Method to SSL_SESSION_ID and make sure the “Enabled” box is checked. Click “Add” then “Publish Changes”
  15. Test by navigating to https://ssovirtual.domain.local:7444/lookupservice/sdk and https://ssovirtual.domain.local:7444/sts/STSService/vsphere.local and verifying that the certificates match.
  16. YAY, load-balanced SSO with matching SSL certs!

One more thing….

Using your favorite web browser, navigate to http://ssovirtual.domain.local:7444/websso/SAML2/Metadata/vsphere.local; you’ll be prompted to download and save an XML file named vsphere.download. Now open the XML file in notepad or Notepad++.  First, make sure you received a readable XML file.  Second, notice that the EntitiesDescriptor/EntityDescriptor entityID property is server-specific.  We’ll need both servers to respond with the same information.

<EntitiesDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata" xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion" xmlns:vmes="http://vmware.com/schemas/attr-names/2012/04/Extensions" Name="vsphere.local" validUntil="2014-08-12T23:54:04Z">
<Extensions>
<vmes:ExportedOn>2014-08-11T23:54:04Z</vmes:ExportedOn>
<vmes:ExportedBy>Exported by VMware Identity Server (c) 2012</vmes:ExportedBy>
</Extensions>
<EntityDescriptor entityID="https://VCSSO1.domain.local:7444/websso/SAML2/Metadata/vsphere.local">
<IDPSSODescriptor WantAuthnRequestsSigned="false" protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
<KeyDescriptor xmlns:ds="http://www.w3.org/2000/09/xmldsig#" use="signing">
<ds:KeyInfo>
<snip...>

Warning: This is not in a VMware KB, and may not be the best way to do it. Having the value in the EntitiesDescriptor/EntityDescriptor entityID property match the FQDN is going to be very important in the near future. Trust me.

  1. On each server, open C:\ProgramData\VMware\CIS\cfg\vmware-sso\hostname.txt. It only contains the resolved hostname, so update it to the virtual hostname (vcsso.ragaazzi.lab in my case) and save the file
  2. Retrieve the XML file from http://ssovirtual.domain.local:7444/websso/SAML2/Metadata/vsphere.local again, open it, and confirm that it contains the virtual hostname

Conclusion
This was such a lengthy post, I considered breaking it up, but there was no good break-point. Thanks for sticking with it. This is mostly for my own benefit, hopefully you’ll find it helpful too.

Deploy vCenter Log Insight Windows Agent using GPO

The VMware documentation covers this, but I thought I’d add my “insight” (get it? har har)

Prerequisites

  1. Get Microsoft Orca.  It’s part of the SDK found here.
  2. Get the vCenter Log Insight Windows Agent from my.vmware.com if you haven’t already
  3. An appropriate network location for GPO-delivered installers containing the vCLI Agent MSI

Orca Steps

  1. Launch Orca, choose file|open and select the vCLI Agent MSI (VMware-vCenter-Log-Insight-Agent-2.0.3-1879692_1.msi in this case)

    vCLI MSI open in Orca
  2. Within Orca, click Transform|New Transform

    Create new transform
  3. Click the “Property” table to load its rows

    Property Table
  4. Right-click under the populated rows and choose “Add Row”
  5. In the “Add Row” dialog, enter the property as “SERVERHOST” and the value as the FQDN of your vCenter Log Insight server, click OK.  Notice the new record has a green box around it?  That means it’ll be included in the transform file.

    Add SERVERHOST Property
  6. On the menu, choose Transform|Generate Transform.  Put it in the same folder as the vCLI Agent MSI, give it a descriptive name.
  7. Once the mst is saved, you can close Orca.
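
If you’d like to sanity-check the property before building the GPO, the same SERVERHOST value can be passed directly to msiexec on a test machine (the FQDN and log path below are placeholders):

    msiexec /i VMware-vCenter-Log-Insight-Agent-2.0.3-1879692_1.msi SERVERHOST=loginsight.mydomain.local /qn /l*v c:\temp\vcli-agent-install.log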

GPO Steps

  1. Create a new or open an existing GPO in Group Policy Management Editor
  2. Expand Computer Configuration|Policies|Software Settings, right-click Software installation and choose New|Package

    Add software installation package to GPO
  3. Select the vCLI MSI on the network share, select the “Advanced” option on the Deploy Software dialog, click OK.

    Choose the Advanced Option
  4. On the VMware vCenter Log Insight Agent Properties dialog, navigate to the Modifications tab.
  5. Click the “Add” button and select the mst you created earlier. Click OK.  Make other changes to the package as appropriate for your environment and click OK to save.

    Modifications Tab
  6. Like all other GPOs, link it to the appropriate OU(s) containing the computers you want the agent deployed on.

NSX-v and vCNS Coexistence

It may or may not be apparent, but NSX for vSphere is in many ways the next version of vCNS. In my lab, I’ve attempted to keep vCNS while adding NSX to the same vCenter server. The license key or configuration apparently overlaps; if vShield Manager boots up first, NSX indicates that it’s not licensed. If NSX manager boots first, the vShield Manager states that it’s not licensed.

I did not, but should have, performed an upgrade from vCNS to NSX and now will have to add NSX Edge Gateways to replace the vShield Edges.

Handy VMKB for SRM & VR 5.1

VMKB 1009562  has a lot of good information, I’m not going to repeat it here, but it is a great resource for determining what network ports have to be open between what devices when using SRM & vSphere Replication.

Also, this diagram is surprisingly complicated… (reminds me of a dream-catcher)

SRM & VR ports

Expanding a VMDK for OpenFiler

In my lab, I have an OpenFiler 2.99.1 VM running on the physical host providing storage via iSCSI to my virtual hosts.

Increasing the size of the VMDK used by the OpenFiler VM does not equate to more storage shared by the OpenFiler. I banged my head against the wall for a few hours figuring it out; here’s how I did it.

  1. Expand VMDK
  2. Download GParted Live CD
  3. Stop anything consuming storage provided by OpenFiler
  4. Shut Down OpenFiler VM
  5. Boot OpenFiler from GParted Live CD
  6. Create additional LVM2 PV in the unused storage
  7. Apply changes
  8. Unmount Gparted ISO, reboot OpenFiler
  9. In the OpenFiler Web Interface, navigate to Volume Groups
  10. Add new PV to the Volume Group
  11. Navigate to Manage Volumes
  12. Select the VG, Edit the Volume, enter the new size (the same as the volume group’s total space, in my case)
  13. Restart iSCSI service
  14. In vSphere, view the properties of the iSCSI datastore to increase its size

What a pain, why is this necessary?
There is apparently an uncorrected bug in OpenFiler in that it will not create additional partitions on a block device. Attempting to create the PV/partition from the CLI using parted doesn’t work either; it won’t accept the cylinder values I provide and instead tries to make the volume half as big as requested. If someone knows why this is and how to correct it, please comment.
In the future, if my OpenFiler needs more storage to share, I’ll just add a new VMDK, create the PV on it, add it to the Volume Group and increase the volume that way.
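
For reference, that future approach would look roughly like this from the OpenFiler shell, assuming the new VMDK shows up as /dev/sdb and the volume group is named vg0 (your device, VG and volume names will differ, and the same steps can be done in the web UI):

    pvcreate /dev/sdb                            # initialize the new disk as an LVM physical volume
    vgextend vg0 /dev/sdb                        # add it to the existing volume group
    lvextend -l +100%FREE /dev/vg0/iscsi_vol     # grow the logical volume backing the iSCSI LUN
    # then restart the iSCSI service and grow the datastore in vSphere, as in steps 13-14 above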