BOSH Stemcell 3541.2 breaks Concourse 3.9.0

Looks like there was a breaking change in stemcell v3541.2: the default umask is now set to 077.  If this stemcell is used with a BOSH-deployed Concourse v3.9.0, resource checking fails with a “permission denied” error.
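To see why a 077 umask can lead to “permission denied”, here is a minimal sketch. It is not taken from the stemcell or from Concourse, and the show_mode helper is a hypothetical name. It creates a file under two different umasks and prints the resulting permission bits.

```shell
#!/bin/sh
# Hypothetical helper: print the permission bits a new file gets under a given umask.
show_mode() {
    tmpdir=$(mktemp -d)
    (
        # The subshell keeps the umask change from leaking into the caller.
        umask "$1"
        touch "$tmpdir/probe"
    )
    # The first 10 characters of `ls -l` are the file type and permission bits.
    ls -l "$tmpdir/probe" | cut -c1-10
    rm -rf "$tmpdir"
}

show_mode 022   # a common default
show_mode 077   # the default reported for stemcell 3541.2
```

Under 077 the file is readable and writable by its owner only, so a process running as a different user that needs to read it will be refused.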

Note that Pivotal (as of Feb 22 2018) has not updated their stemcells to 3541.x; their latest is still in the 3468 chain.



Resolutions for 2018

12/23/2017

If I put it here, I’m much more likely to follow-through.  Like many, I work best under some pressure.  Here is a list of what I want to do differently (with regard to technology) next year.

  1. Do more blogging.  I can make a ton of excuses for not blogging as much this year.  I love sharing what I’ve learned; the more new stuff I learn, the more I share.  So….
  2. Do more for NSX for vSphere and NSX-T.  I feel strongly that SDN is critical to the future of how datacenters operate.  NSX is the logical leader in this space and will only grow in interest.  There is still a tendency to replicate what was done with pre-SDN technology and I’d like to see modern ways to solve problems while finding and pushing the limits of what can be done in SDN.
  3. Do more with containers and PKS.  The technologies that Pivotal provides are cutting edge.  Containers and applications-as-code methods are already growing and will define the datacenter of the future.  Just as we stopped thinking of hardware servers as single-purpose a few years ago, we’ll embrace multiple workloads within a VM.
  4. Do more coding.  I love concourse and pipelines, but have a lot to learn.  Let’s find the limits of BOSH and pipelines.  Can we not only deploy, but automate the operation and maintenance of a PaaS solution?
  5. Do more coding.  I feel that as we move to “applications-as-code”, it’s important to understand what that means to developers and operators.  What sort of problems become irrelevant in this approach?  What molehills become mountains?

Hope to see you next year!


Removing NSX-T VIBs from ESXi hosts

10/31/2017

I’d wanted to revert my environment from (an incomplete install of) NSX-T v2.0 back to NSX for vSphere v6.3.x, but found that the hosts would not complete preparation.  The logs indicated that something was “claimed by multiple non-overlay vibs”.

Error in esxupdate.log

I found that the hosts still had the NSX-T VIBs loaded, so to remove them, here’s what I did:

  1. Put the host in maintenance mode.  This is necessary to “de-activate” the VIBs that may be in use
  2. Login to the host via SSH
  3. Run

    /etc/init.d/netcpad stop

  4. Run this all in one line; note that the order of the VIBs is important

    esxcli software vib remove -n nsx-ctxteng -n nsx-hyperbus -n nsx-platform-client -n nsx-nestdb -n nsx-aggservice -n nsx-da -n nsx-esx-datapath -n nsx-exporter -n nsx-host -n nsx-lldp -n nsx-mpa -n nsx-netcpa -n nsx-python-protobuf -n nsx-sfhc -n nsx-support-bundle-client -n nsxa -n nsxcli -n nsx-common-libs -n nsx-metrics-libs -n nsx-nestdb-libs -n nsx-rpc-libs -n nsx-shared-libs

  5. Reboot the host
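Because the order of the VIBs matters, it can help to keep the list in one place and generate the command from it. The following is only a sketch: the variable and function names are hypothetical, and it just prints the command (a dry run) so you can review it before pasting it on the host.

```shell
#!/bin/sh
# VIB names in the same order as the one-line command above; order matters.
NSX_VIBS="nsx-ctxteng nsx-hyperbus nsx-platform-client nsx-nestdb nsx-aggservice \
nsx-da nsx-esx-datapath nsx-exporter nsx-host nsx-lldp nsx-mpa nsx-netcpa \
nsx-python-protobuf nsx-sfhc nsx-support-bundle-client nsxa nsxcli \
nsx-common-libs nsx-metrics-libs nsx-nestdb-libs nsx-rpc-libs nsx-shared-libs"

# Hypothetical helper: build the removal command without running it.
build_remove_cmd() {
    cmd="esxcli software vib remove"
    for vib in $NSX_VIBS; do
        cmd="$cmd -n $vib"
    done
    printf '%s\n' "$cmd"
}

# Dry run: print the command for review.
build_remove_cmd
```

Printing instead of executing is a deliberate choice here; nothing is removed until you run the printed line yourself on a host that is already in maintenance mode.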

Building Stand-Alone BOSH and Concourse

07/17/2017

This should be the last “how to install concourse” post.  With this, I think I’ve covered all the interesting ways to install it.  Using BOSH is by far my favorite approach.  After this, I hope to post more about using concourse and pipelines.


There are three phases to this deployment:

  1. BOSH-start – We’ll set up an ubuntu VM to create the BOSH director from.  We’ll be using BOSH v2 and not bosh-init
  2. BOSH Director – This does all the work for us, but has to be instructed how to connect to vSphere
  3. Concourse – We’ll use a deployment manifest in BOSH to deploy concourse

I took the approach that – where possible – I would manually download the files and transfer them to the target, rather than having the install process pull the files down automatically.  In my case, I went through a lot of trial-and-error, so I did not want to pull down the files every time.  In addition, I’d like to get a feel for what a self-contained (no Internet access) solution would look like. BTW, concourse requires Internet access in order to get to docker hub for a container to run its pipelines.

Starting position

Make sure you have the following:

  • Working vSphere environment with some available storage and compute capacity
  • At least one network on a vSwitch or Distributed vSwitch with available IP addresses
  • Account for BOSH to connect to vSphere with permissions to create folders, resource pools, and VMs
  • An Ubuntu  VM template.  Mine is 16.04 LTS
  • PuTTY, Win-SCP or similar tools


  1. Deploy a VM from your Ubuntu template.  Give it a name (I call mine BOSH-start) and an IP address, then power it on.  In my case, I’m logged in with my own account to avoid using root unless necessary.
  2. Install dependencies:
    sudo apt-get install -y build-essential zlibc zlib1g-dev ruby ruby-dev openssl \
    libxslt-dev libxml2-dev libssl-dev libreadline6 libreadline6-dev \
    libyaml-dev libsqlite3-dev sqlite3
  3. Download BOSH CLI v2, make it executable and move it to the path.  Get the latest version of the BOSH v2 CLI here.
    chmod +x ~/Downloads/bosh-cli-*
    sudo mv ~/Downloads/bosh-cli-* /usr/local/bin/bosh

BOSH Director

  1. Git Director templates
    mkdir ~/bosh-1
    cd ~/bosh-1
    git clone
  2. Create a folder and use bosh to create the environment.  This command will create several “state” files and our BOSH director with the information you provide.  Replace the placeholder values in angle brackets with your own.
    bosh create-env bosh-deployment/bosh.yml \
        --state=state.json \
        --vars-store=creds.yml \
        -o bosh-deployment/vsphere/cpi.yml \
        -o bosh-deployment/vsphere/resource-pool.yml \
        -o bosh-deployment/misc/dns.yml \
        -v internal_dns=<DNS Servers ex: [,]> \
        -v director_name=<name of BOSH director. eg:boshdir> \
        -v internal_cidr=<CIDR for network ex:> \
        -v internal_gw=<Gateway Address> \
        -v internal_ip=<IP Address to assign to BOSH director> \
        -v network_name="<vSphere vSwitch Port Group>" \
        -v vcenter_dc=<vSphere Datacenter> \
        -v vcenter_ds=<vSphere Datastore> \
        -v vcenter_ip=<IP address of vCenter Server> \
        -v vcenter_user=<username for connecting to vCenter Server> \
        -v vcenter_password=<password for that account> \
        -v vcenter_templates=<location for templates ex:/BOSH/templates> \
        -v vcenter_vms=<location for VM.  ex:/BOSH/vms> \
        -v vcenter_disks=<folder on datastore for bosh disks.  ex:bosh-1-disks> \
        -v vcenter_cluster=<vCenter Cluster Name> \
        -v vcenter_rp=<Resource Pool Name>

    One note here: if you do not add the lines for dns.yml and internal_dns, your BOSH director will use a default DNS server and won’t be able to find anything internal.  The create-env command will take a little while to download the bits and set up the Director for you.

  3. Connect to Director.  The following commands will create an alias for the new BOSH environment named “bosh-1”. Replace <BOSH Director IP> with the IP of your BOSH Director from the create-env command:
    # Configure local alias
    bosh alias-env bosh-1 -e <BOSH Director IP> --ca-cert <(bosh int ./creds.yml --path /director_ssl/ca)
    export BOSH_CLIENT=admin
    export BOSH_CLIENT_SECRET=`bosh int ./creds.yml --path /admin_password`
    bosh -e bosh-1 env
  4. Next we’ll need a “cloud config”.  This indicates to BOSH Director how to configure the CPI for interaction with vSphere.  You can find examples and details here.  For expediency, what I ended up with is below. As usual, you’ll want to update the placeholder values to match your environment.  Save this file as ~/bosh-1/cloud-config.yml on the BOSH-start VM
    - name: z1
        - name: <vSphere Datacenter Name>
        - clusters: 
          - <vSphere Cluster Name>: {resource_pool: <Resource Pool in that cluster>}
        address: <IP of FQDN of vCenter Server>
        user: <account to connect to vSphere with>
        password: <Password for that account>
        default_disk_type: thin
        enable_auto_anti_affinity_drs_rules: false
        - name: <vSphere Datacenter Name>
          vm_folder: /BOSH/vms
          template_folder: /BOSH/templates
          disk_path: prod-disks
          datastore_pattern: <regex filter for datastores to use ex: '\AEQL-THICK0\d' >
          persistent_datastore_pattern: <regex filter for datastores to use ex: '\AEQL-THICK0\d' >
          - <vSphere Cluster Name>: {resource_pool: <Resource Pool in that cluster>}
    vm_types:
    - name: default
      cloud_properties:
        cpu: 2
        ram: 4096
        disk: 16_384
    - name: large
      cloud_properties:
        cpu: 2
        ram: 8192
        disk: 32_768
    disk_types:
    - name: default
      disk_size: 16_384
      cloud_properties:
        type: thin
    - name: large
      disk_size: 32_768
      cloud_properties:
        type: thin
    networks:
    - name: default
      type: manual
      subnets:
      - range: <network CIDR where to place VMs ex:>
        reserved: <reserved range in that CIDR ex:[] >
        gateway: <gateway address for that network>
        az: z1
        dns: <DNS Server IPs ex: [,] >
        cloud_properties:
          name: <name of port group to attach created VMs to>
    compilation:
      workers: 5
      reuse_compilation_vms: true
      az: z1
      vm_type: large
      network: default
  5. Update Cloud Config with our file:
    bosh -e bosh-1 update-cloud-config ./cloud-config.yml

    This is surprisingly fast.  You should now have a functional BOSH Director.
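The create-env step above takes a long list of -v values, and a single forgotten one means another round of trial-and-error. As a rough sketch (the function name and the idea of keeping the values in shell variables are my assumptions, not part of BOSH), you could check that everything is set before building the command:

```shell
#!/bin/sh
# Names mirror the -v flags passed to `bosh create-env` above.
REQUIRED_VARS="internal_dns director_name internal_cidr internal_gw internal_ip \
network_name vcenter_dc vcenter_ds vcenter_ip vcenter_user vcenter_password \
vcenter_templates vcenter_vms vcenter_disks vcenter_cluster vcenter_rp"

# Hypothetical helper: print the names of required variables that are unset or empty.
missing_vars() {
    missing=""
    for name in $REQUIRED_VARS; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    # Drop the leading space before printing.
    printf '%s\n' "${missing# }"
}

# Prints an empty line when everything is set.
missing_vars
```

If the output is empty, each variable can be passed along as -v name="$name"; otherwise the printed names tell you what still needs a value.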


Let’s deploy something with BOSH!


  • Copy the URLs for the Concourse and Garden runC BOSH releases from here
  • Copy the URL for the latest Ubuntu Trusty stemcell for vSphere from here
  1. Upload Stemcell.  You’ll see it create a VM with a name beginning with “sc” in vSphere
    bosh -e bosh-1 upload-stemcell <URL to stemcell>
  2. Upload Garden runC release to BOSH
    bosh -e bosh-1 upload-release <URL to garden-runc tgz>
  3. Upload Concourse release to BOSH
    bosh -e bosh-1 upload-release <URL to concourse tgz>
  4. A BOSH deployment must have a stemcell, a release and a manifest.  You can get a concourse manifest from here, or start with the one I’m using.  You’ll notice that a lot of the values here must match those in our cloud-config.  Save the concourse manifest as ~/concourse.yml
    name: concourse
    releases:
    - name: concourse
      version: latest
    - name: garden-runc
      version: latest
    stemcells:
    - alias: trusty
      os: ubuntu-trusty
      version: latest
    instance_groups:
    - name: web
      instances: 1
      # replace with a VM type from your BOSH Director's cloud config
      vm_type: default
      stemcell: trusty
      azs: [z1]
      networks: [{name: default}]
      jobs:
      - name: atc
        release: concourse
        properties:
          # replace with your CI's externally reachable URL, e.g.
          # replace with username/password, or configure GitHub auth
          basic_auth_username: myuser
          basic_auth_password: mypass
          postgresql_database: &atc_db atc
      - name: tsa
        release: concourse
        properties: {}
    - name: db
      instances: 1
      # replace with a VM type from your BOSH Director's cloud config
      vm_type: large
      stemcell: trusty
      # replace with a disk type from your BOSH Director's cloud config
      persistent_disk_type: default
      azs: [z1]
      networks: [{name: default}]
      jobs:
      - name: postgresql
        release: concourse
        properties:
          databases:
          - name: *atc_db
            # make up a role and password
            role: atc_db
            password: mypass
    - name: worker
      instances: 1
      # replace with a VM type from your BOSH Director's cloud config
      vm_type: default
      stemcell: trusty
      azs: [z1]
      networks: [{name: default}]
      jobs:
      - name: groundcrew
        release: concourse
        properties: {}
      - name: baggageclaim
        release: concourse
        properties: {}
      - name: garden
        release: garden-runc
        properties:
          garden:
            listen_network: tcp
    update:
      canaries: 1
      max_in_flight: 1
      serial: false
      canary_watch_time: 1000-60000
      update_watch_time: 1000-60000

    A couple of notes:

    • The Worker instance will need plenty of space, especially if you’re planning to use PCF Pipeline Automation, as it’ll have to download the massive binaries from PivNet. You’ll want to make sure that you have a sufficiently large vm type defined in your cloud config and assigned as worker in the Concourse manifest
  5. Now, we have everything we need to deploy concourse.  Notice that we’re using BOSH v2 and the deployment syntax is a little different than in BOSH v1.  This command will create a handful of VMs, compile a bunch of packages and push them to the VMs.  You’ll need a couple of extra IPs for the compilation VMs; these will go away after the deployment is complete.
    bosh -e bosh-1 -d concourse deploy ./concourse.yml
  6. Odds are that you’ll have to make adjustments to the cloud-config and deployment manifest.  If so, you can easily apply updates to the cloud-config with the bosh update-cloud-config command.
  7. If the deployment is completely hosed up and you need to remove it, you can do so with
    bosh -e bosh-1 -d concourse stop &&  bosh -e bosh-1 -d concourse deld

Try it out

  1. Get the IP address of the web instance by running
    bosh -e bosh-1 vms

    From the results, identify the IP address of the web instance:

  2. Point your browser to http://<IP of web instance>:8080
  3. Click Login, select the “main” team and log in with the username and password (myuser and mypass in the example) you used in the manifest
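If you would rather not pick the address out of the table by eye, something like the following could do it. This is a sketch under assumptions: web_ip is a hypothetical helper, and it assumes the row for the web instance starts with “web/” and contains exactly one IPv4-looking field.

```shell
#!/bin/sh
# Hypothetical helper: read `bosh vms` output on stdin and print the first
# IPv4-looking field on the row whose instance name starts with "web/".
web_ip() {
    awk '$1 ~ /^web\// {
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/) { print $i; exit }
        }
    }'
}

# Made-up sample row, only to show the shape of the input.
printf 'web/0000 running z1 10.0.0.5 vm-0000 default\n' | web_ip
```

Against a real Director you would pipe the command from step 1 into it, for example: bosh -e bosh-1 -d concourse vms | web_ip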



Getting started with BOSH Backup and Restore – Pt.1 Backup

06/26/2017

Starting with a working PCF 1.11 deployment, a random linux VM and the BOSH Backup and Restore bits, let’s try it out!


  • We’ll perform two types of backup jobs using BBR; one against the BOSH director and one against the Elastic Runtime deployment. The command and parameters are different between the jobs.
  • BBR stores the backup data in subfolders where the executable is run
  • Tiles other than Elastic Runtime (CF) may be backed up with BBR later, but as of late June 2017, they do not have the BBR scripts in place.
  • If you don’t turn on MySQL backups and the Backup Prepare Node in Elastic Runtime, the CF deployment backup job will fail in that it cannot find the backup scripts for the MySQL database
  • I’m using a CentOS VM in the environment as the jumpbox to run BBR.  You’ll want to make sure that the jumpbox is able to reach the BOSH director on TCP ports 22 and 25555.


  1. Prepare PCF
    • Logon to Ops Manager
    • Click the “Pivotal Elastic Runtime” tile
    • Assuming you’re using the internal MySQL, click “Internal MySQL” on the Settings tab
    • Under Automated Backups Configuration, select “Enable automated backups from MySQL to an S3 bucket or other S3-compatible file store”.  Right here, you’re thinking, “but I don’t have an S3 server or account or whatever”.  That’s ok, just fake it.  Put bogus values in the fields and an unreachable date (like February 31st).  Click Save.

      Bogus S3 info

    • Under Resource Config, make sure the Backup Prepare Node instance count is 1 (or more?).  Click Save
    • Return to the Installation Dashboard and Apply Changes
  2. Get the BBR credentials.
    • Logon to Ops Manager
    • Click the “Ops Manager Director” tile
    • Click the “Credentials” tab
    • Click the “Link to Credential” link beside “Bbr Ssh Credentials”

      BBR Director Backup Credential

    • The page that loads will display a yml-type file with the PEM-encoded Private and Public Keys.  Select and copy from “-----BEGIN RSA PRIVATE KEY-----” through “-----END RSA PRIVATE KEY-----”.
    • Paste this into a text editor.  In my case, on Windows, the content used a literal “\n” to indicate new-line rather than an actual newline.  So, to convert it, I used Notepad++ to replace “\\n” with “\n” in the Extended Search Mode.

      Using Notepad++

    • The username that BBR will use for the director job is “bbr”
    • Back on the “Credentials” tab of Ops Manager Director, click “Link to Credential” beside “Uaa Bbr Client Credentials”
    • On the page that loads, note that the identity is “bbr_client” and record the password value. This will be used for the BBR deployment job(s)
    • Back on the “Credentials” tab of Ops Manager Director, click “Link to Credential” beside “Director Credentials”
    • On the page that loads, note that the identity is “director” and record the password value.  You’ll need this to login to BOSH in order to get the deployment name next
  3. Get the deployment name
    • Open an SSH session to the Ops Manager, logging on as ubuntu
    • Run this:

      uaac target --ca-cert /var/tempest/workspaces/default/root_ca_certificate https://DIRECTOR-IP-ADDRESS:8443

      bosh --ca-cert /var/tempest/workspaces/default/root_ca_certificate target DIRECTOR-IP-ADDRESS

      Logon as “director” with the password saved earlier

    • Run this:

      bosh deployments

    • In the results, copy the deployment name that begins with “cf-“. (eg: cf-67afe56410858743331)
  4. Prepare the jumpbox
    • Logon with a privileged account
    • Using SCP or similar, copy “/var/tempest/workspaces/default/root_ca_certificate” from Ops Manager to the jump box
    • Copy the bbr-0.1.2.tar file to the jumpbox
    • Extract it – tar -xvf bbr-0.1.2.tar
    • Make sure you have plenty of space on the jumpbox.  In my case, I mounted a NFS share and ran BBR from the mount point.
    • Copy <extracted files>/release/bbr to the root folder where you want the backups to reside.
    • Save the PEM-encoded RSA Private Key from above to the jumpbox, making a note of its path and filename.  I just stuck it in the same folder as the bbr executable.
    • Make sure you can connect to the BOSH director via ssh
      ssh -i <path to private key file> bbr@<BOSH Director IP>
  5. Director Backup
    • On the jumpbox, navigate to where you placed the bbr executable.  Remember that it will create a time-stamped subfolder here and dump all the backups into it.
    • Run this, replacing ./private.key with the correct path to the private key file and <BOSH Director IP> with the BOSH Director IP address:

      Director Pre-check

      ./bbr director --private-key-path ./private.key --username bbr --host <BOSH Director IP> pre-backup-check

    • Check that the pre-check results indicate that the director can be backed up
    • Run this to perform the backup (same as before, just passing the “backup” subcommand instead of the “pre-backup-check” subcommand):

      Director Backup

      ./bbr director --private-key-path ./private.key --username bbr --host <BOSH Director IP> backup
    • Wait a while for the backup to complete
  6. What’d it do?
    • Backed up BOSH director database to bosh-0-director.tar
    • Dumped credhub database to bosh-0-credhub.tar
    • Dumped uaa database to bosh-0-uaa.tar
    • Backed up the BOSH director blobstore to bosh-0-blobstore.tar
    • Saved the blobstore metadata to a file named metadata
  7. Elastic Runtime Backup
    • On the jumpbox, navigate to where you placed the bbr executable.  Remember that it will create a time-stamped subfolder here and dump all the backups into it.
    • Run this, replacing the example values with the IP/FQDN of your BOSH director, the password for the bbr_client account retrieved from Ops Manager, the Elastic Runtime deployment name and the path to the root_ca_certificate copied from the Ops Manager:

      Deployment Pre-check

      ./bbr deployment --target <BOSH Director IP or FQDN> --username bbr_client --password abc123 --deployment cf-abcdef123456 --ca-cert ./root_ca_certificate pre-backup-check

    • Check that the pre-check results indicate that the deployment can be backed up
    • Run this to perform the backup (same as before, just passing the “backup” subcommand instead of the “pre-backup-check” subcommand):

      Deployment Backup

      ./bbr deployment --target <BOSH Director IP or FQDN> --username bbr_client --password abc123 --deployment cf-abcdef123456 --ca-cert ./root_ca_certificate backup

    • Wait a while for the backup to complete
  8. What’d it do this time?
    • Backed up the MySQL Cloud Controller Database to mysql-artifact.tar
    • Backed up uaa to uaa-0-uaa.tar (this is different from the UAA backup performed against the director)
    • Backed up the blobstore (in my case, from the internal NFS server) to nfs_server-0-blobstore-backup.tar
    • Saved the blobstore metadata to a file named metadata
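The newline conversion from step 2 (done there with Notepad++) can also be done on the jumpbox itself. Here is a minimal sketch, assuming the copied key text uses a literal backslash followed by “n” where a line break belongs; unescape_newlines is a hypothetical helper name and the key body below is made up.

```shell
#!/bin/sh
# Hypothetical helper: turn each literal backslash-n sequence into a real newline.
unescape_newlines() {
    awk '{ gsub(/\\n/, "\n"); print }'
}

# Made-up key body, only to show the transformation.
printf '%s\n' '-----BEGIN RSA PRIVATE KEY-----\nMADE-UP-KEY-BODY\n-----END RSA PRIVATE KEY-----' \
    | unescape_newlines
```

In practice you would redirect the output to the private key file that bbr reads, and restrict its permissions (for example with chmod 600) so ssh accepts it.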




Making a slight change in direction

05/02/2017

My blog started as a way to remind myself how to do something, so I don’t have to rediscover it every time.  I hoped others would find it helpful too.  Until recently, I’ve been heavily involved in Dell EMC’s Enterprise Hybrid Cloud solution (EHC), but have decided to shift my focus to Dell EMC’s Native Hybrid Cloud solution (NHC) and related technologies.  It’s an interesting challenge and presents a lot to learn.  NHC is a much younger solution than EHC and I’m looking forward to applying what we learned through EHC to it.

This only means that I’ll probably post a lot less VMware-related stuff and more about BOSH, Concourse, Cloud Foundry, PCF, Kubernetes and similar technologies.  I’m excited about learning about this platform and figuring out what works (and what doesn’t).

Building a Concourse CI VM on Ubuntu

04/18/2017

Recently, I’ve found myself needing a Concourse CI system. I struggled with the documentation and couldn’t find any comprehensive build guides.  I knew for certain I wasn’t going to use VirtualBox.  So, having worked it out, I thought I’d share what I went through to get to a working system.

Starting Position
Discovered that the CentOS version I was using previously did not have a compatible Linux kernel version.  CentOS 7.2 uses kernel 3.10, Concourse requires 3.19+.  So, I’m starting with a freshly-deployed Ubuntu Server 16.04 LTS this time.
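A quick way to avoid repeating my mistake is to check the kernel version before installing anything. The sketch below is an illustration only; kernel_at_least is a hypothetical helper that compares the major and minor numbers from uname -r against the 3.19 minimum mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel release string is at least MAJOR.MINOR.
# $1: release string (as printed by `uname -r`), $2: required major, $3: required minor
kernel_at_least() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    # Strip any non-numeric suffix such as "-rc1".
    minor=${minor%%[!0-9]*}
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

if kernel_at_least "$(uname -r)" 3 19; then
    echo "kernel $(uname -r) looks new enough for Concourse"
else
    echo "kernel $(uname -r) is older than 3.19"
fi
```

Only the first two version components are compared, which is enough for a 3.10 versus 3.19 decision.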

Prep Ubuntu
Not a lot we have to do, but still pretty important:

  1. Make sure port for concourse is open

    sudo ufw allow 8080
    sudo ufw status

    sudo ufw disable

    I disabled the firewall on ubuntu because it was preventing the concourse worker and concourse web from communicating.

  2. Update and make sure wget is installed

    apt-get update
    apt-get install wget

Concourse expects to use a postgresql database, I don’t have one standing by, so let’s install it.

  1. Pretty straightforward on Ubuntu too:

    apt-get install postgresql postgresql-contrib

    Enter y to install the bits.  On Ubuntu, we don’t have to take extra steps to configure the service.

  2. Ok, now we have to create an account and a database for concourse. First, let’s create the linux account. I’m calling mine “concourse” because I’m creative like that.

    adduser concourse
    passwd concourse

  3. Next, we create the account (aka “role” or “user”) in postgres via the createuser command. In order to do this, we have to switch to the postgres account, do that with sudo:

    sudo -i -u postgres

    Now, while in as postgres we can use the createuser command

    createuser --interactive

    You’ll enter the name of the account, and answer a couple of special permissions questions.

  4. While still logged in as postgres, run this command to create a new database for concourse. I’m naming my database “concourse” – my creativity is legendary. Actually, I think it makes life easier if the role and database are named the same

    createdb concourse

  5. Test by switching users to the concourse account and making sure it can run psql against the concourse database.  While in psql, use this command to set the password for the account in postgres

    ALTER ROLE concourse WITH PASSWORD 'changeme';

  6. Type \q to exit psql

Ok, we have a running postgresql service and an account to be used for concourse. Let’s go.

  1. Create a folder for concourse. I used /concourse, but you can use /var/lib/whatever/concourse if you feel like it.
  2. Download the binary into your /concourse folder using wget, or transfer it via scp.
  3. Create a symbolic link named “concourse” to the file you downloaded and make it executable

    ln -s ./concourse_linux_amd64 ./concourse
    chmod +x ./concourse_linux_amd64

  4. Create keys for concourse

    cd /concourse

    mkdir -p keys/web keys/worker

    ssh-keygen -t rsa -f ./keys/web/tsa_host_key -N ''
    ssh-keygen -t rsa -f ./keys/web/session_signing_key -N ''
    ssh-keygen -t rsa -f ./keys/worker/worker_key -N ''
    cp ./keys/worker/ ./keys/web/authorized_worker_keys
    cp ./keys/web/ ./keys/worker

  5. Create start-up script for Concourse. Save this as /concourse/

    /concourse/concourse web \
    --basic-auth-username myuser \
    --basic-auth-password mypass \
    --session-signing-key /concourse/keys/web/session_signing_key \
    --tsa-host-key /concourse/keys/web/tsa_host_key \
    --tsa-authorized-keys /concourse/keys/web/authorized_worker_keys \
    --external-url \
    --postgres-data-source postgres://concourse:changeme@

    /concourse/concourse worker \
    --work-dir /opt/concourse/worker \
    --tsa-host \
    --tsa-public-key /concourse/keys/worker/ \
    --tsa-worker-private-key /concourse/keys/worker/worker_key

    The items in red should definitely be changed for your environment. “external_url” uses the IP address of the VM it’s running on, and the username and password values in the postgres-data-source should reflect what you set up earlier. Save the file and be sure to set it as executable (chmod +x ./

  6. Run the script “./”. You should see several lines go by concerning worker-collectors and builder-reapers.
    • If you instead see a message about authentication, you’ll want to make sure that 1) the credentials in the script are correct, 2) the account has not had its password set in linux or in postgres
    • If you instead see a message about the connection not accepting SSL, be sure that the connection string in the script includes “?sslmode=disable” after the database name
  7. Test by pointing a browser at the value you assigned to the external_url. You should see “no pipelines configured”.  You can login using the basic-auth username and password you specified in the startup script.


  8. Back in your SSH session, you can kill it with <CTRL>+C
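Both troubleshooting notes in step 6 come down to the connection string passed to --postgres-data-source. As a sketch of how its parts fit together (pg_dsn is a hypothetical helper, and the 127.0.0.1 host is my assumption for a postgres service running on the same VM), it could be assembled like this:

```shell
#!/bin/sh
# Hypothetical helper: assemble a postgres connection URL with SSL disabled.
# $1: role, $2: password, $3: host, $4: database
pg_dsn() {
    printf 'postgres://%s:%s@%s/%s?sslmode=disable\n' "$1" "$2" "$3" "$4"
}

# Role, password and database name as used earlier in this post; host is assumed.
pg_dsn concourse changeme 127.0.0.1 concourse
```

A password containing characters such as “@” or “/” would need to be URL-encoded first; this sketch does not handle that.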

Finishing Up
Now we just have to make sure that concourse starts when the system reboots. I am certain that there are better/safer/more reliable ways to do this, but here’s what I did:
Use nano or your favorite text editor to add “/concourse/” to /etc/rc.local ABOVE the line that reads “exit 0”
Now, reboot your VM and retest the connectivity to the concourse page.


EMC ECS Community Edition project for how to start the script on boot.

Mitchell Anicas’ very helpful post on setting up postgres on Ubuntu. for some wholly inadequate documentation

Alfredo Sánchez for bringing the issue with Concourse and CentOS to my attention