Single Touch, Production Ready ESXi Provisioning with Ansible

TL;DR: The source code and documentation for what follows are on my GitHub here: https://github.com/bryansullins/baremetalesxi. For the full story and explanations, keep reading and enjoy!

A successful automated installation of ESXi

The Story

I got my start in IT Infrastructure Engineering at a very large and well-known hardware company that also offered IT Services. I was part of a team that built and maintained an IaaS Cloud for Healthcare Providers.

We rolled out ESXi hosts by the enclosure. That’s 16-32 ESXi hosts at a time, and when I joined the team we did it . . . manually.

Manually.

. . . From the ground up. . . . From rack and stack, to the firmware updates, to getting them into vCenter and production ready.

This was around 2014. We automated some of it and it got a lot better (I can’t take all the credit). Getting to single/zero touch for ESXi provisioning was still super tough, but we did what we could.

With the more recent releases of Ansible, single touch (possibly zero touch) is now a reality.

What Does Single Touch Mean?

Let’s all make sure we know what I mean when I use the phrase “Single Touch”:

“Bare Metal Single Touch ESXi Rollout (Provisioning) – The user launches the automation once (input is acceptable: hostname, IP, etc.), but once executed, the ESXi host will go from powered off with no OS to fully configured in vCenter, in its destination cluster, production ready.”

As defined by me, Bryan Sullins

How It Works: A Breakdown – Part 1 – ESXi Automated Installation for a Custom-Built ISO

At the top of this post, you will see the GitHub link for what follows, but in case you are lazy like me, here it is again: https://github.com/bryansullins/baremetalesxi.

I will break this down into two parts. Part 1 is the ESXi bare metal installation. This is an automated kickstart install from a self-contained ISO that carries all of the installation parameters, with no need for PXE or TFTP. You will need to make the ISO available from a webserver, but that is all handled in the playbook.

To understand how this has to be done, you need some knowledge of kickstart, which goes a bit beyond this post. There are many ways to make a kickstart file available for an installation, but remember, I wanted to make this ISO a standalone, self-contained automated install. If you were to do this manually, you would need to:

  1. Mount the ESXi ISO provided by the vendor.
  2. Copy all files out to a staging directory.
  3. Create a kickstart file with IP info, etc. You can also do scripting for additional setup: (esxcli commands, etc.).
  4. Tarball the kickstart file with your choice of name (bmks.tgz) and copy it into the root of the ISO.
  5. Edit both boot.cfg files (one for Legacy Boot and one for UEFI) to reference the kickstart file and append bmks.tgz to the list of tarballs that must be extracted (an abridged illustration follows this list).
  6. Burn all files back into the now-customized ISO.
  7. Boot from Virtual Media, etc.
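
To make step 5 concrete, the boot.cfg change boils down to two lines: point kernelopt at the kickstart file and append the kickstart tarball to the module list. This is an abridged illustration, not a literal copy of any build’s file (module lists and the default kernelopt vary by ESXi version, and the efi/boot copy also carries a prefix= line):

  Before (abridged):
  kernelopt=runweasel
  modules=/b.b00 --- /jumpstrt.gz --- ... --- /imgpayld.tgz

  After (abridged):
  kernelopt=ks=file://etc/vmware/weasel/ks.cfg
  modules=/b.b00 --- /jumpstrt.gz --- ... --- /imgpayld.tgz --- /bmks.tgz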

For a detailed description of the process, see William Lam’s post on the subject, which was integral to getting this done.

You will note in the playbook I am doing what I describe above:

## /opt/baremetal is the staging directory.
  - name: Mounting source directory from official production ESXi ISO . . . copying over build files . . . backing up defaults . . .
    shell: |
      mkdir /mnt/{{ esxi_hostname }}
      mount -o loop -t iso9660 /opt/esxiisosrc/{{ src_iso_file }} /mnt/{{ esxi_hostname }}/
      mkdir /opt/baremetal/{{ esxi_hostname }}
      mkdir /opt/baremetal/temp/{{ esxi_hostname }}
      mkdir -p /opt/baremetal/temp/{{ esxi_hostname }}/etc/vmware/weasel
      cp -r /mnt/{{ esxi_hostname }}/* /opt/baremetal/{{ esxi_hostname }}/
      umount /mnt/{{ esxi_hostname }}
      mv /opt/baremetal/{{ esxi_hostname }}/boot.cfg /opt/baremetal/{{ esxi_hostname }}/boot.cfg.orig
      mv /opt/baremetal/{{ esxi_hostname }}/efi/boot/boot.cfg /opt/baremetal/{{ esxi_hostname }}/efi/boot/boot.cfg.orig
  
## The following two tasks will make the custom iso bootable by both legacy and UEFI implementations:    
  - name: Copying custom boot.cfg to root directory . . .
    copy:
      src: files/{{ esxi_build }}/boot.cfg
      dest: /opt/baremetal/{{ esxi_hostname }}
      owner: root
      group: root
      mode: '0744'

  - name: Copying custom UEFI boot.cfg to root efi directory . . .
    copy:
      src: files/{{ esxi_build }}/efi/boot/boot.cfg
      dest: /opt/baremetal/{{ esxi_hostname }}/efi/boot
      owner: root
      group: root
      mode: '0744'

## Additional options can be appended after the "reboot" at the end of the content section, such as scripted esxcli commands, etc.
  - name: Creating kickstart file with proper automation contents . . .
    copy:
      force: true
      dest: /opt/baremetal/temp/{{ esxi_hostname }}/etc/vmware/weasel/ks.cfg
      content: |
        vmaccepteula
        clearpart --firstdisk=local --overwritevmfs
        install --firstdisk=local --overwritevmfs
        rootpw --iscrypted {{ encrypted_root_password }}
        network --bootproto=static --addvmportgroup=1 --vlanid={{ vlan_id }} --ip={{ host_management_ip }} --netmask={{ net_mask }} --gateway={{ gate_way }} --nameserver="#.#.#.#,#.#.#.#" --hostname={{ esxi_hostname }}
        reboot 
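
As the comment above that task notes, anything scripted can ride along after the reboot line. Purely as an illustration (this is not part of the ks.cfg the playbook writes), a %firstboot section with an esxcli command might look like this:

  %firstboot --interpreter=busybox
  # Illustration only: suppress the ESXi Shell warning banner on first boot
  esxcli system settings advanced set -o /UserVars/SuppressShellWarning -i 1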

  - name: Scripting commands to tarball the kickstart file and make the proper iso . . .
    shell: |
      chmod ugo+x /opt/baremetal/temp/{{ esxi_hostname }}/etc/vmware/weasel/ks.cfg
      cd /opt/baremetal/temp/{{ esxi_hostname }}
      tar czvf bmks.tgz *
      chmod ugo+x /opt/baremetal/temp/{{ esxi_hostname }}/bmks.tgz
      cp /opt/baremetal/temp/{{ esxi_hostname }}/bmks.tgz /opt/baremetal/{{ esxi_hostname }}/
      cd /opt/baremetal/{{ esxi_hostname }}
      
  - name: Creating bootable iso from all files . . .
    shell: >
      mkisofs
      -relaxed-filenames
      -J
      -R
      -b isolinux.bin
      -c boot.cat
      -no-emul-boot
      -boot-load-size 4
      -boot-info-table
      -eltorito-alt-boot
      -e efiboot.img
      -boot-load-size 1
      -no-emul-boot
      -o /opt/baremetal/{{ esxi_hostname }}.iso
      /opt/baremetal/{{ esxi_hostname }}/

  - name: Moving created iso to webserver . . .
    shell: |
      mv /opt/baremetal/{{ esxi_hostname }}.iso /usr/share/nginx/html/isos/
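
One addition worth considering (not in the repo as written, just a hypothetical guardrail) is an existence check between those two tasks, so a failed mkisofs run stops the play before the lights-out management ever tries to boot anything:

  - name: Checking that the custom iso was actually created . . .
    stat:
      path: /opt/baremetal/{{ esxi_hostname }}.iso
    register: custom_iso

  - name: Failing early if the custom iso is missing . . .
    fail:
      msg: "No ISO found at /opt/baremetal/{{ esxi_hostname }}.iso - check the mkisofs task."
    when: not custom_iso.stat.exists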

One important coding “teachable moment” here is the difference above between:

shell: |

and

shell: >

The “|” symbol starts a literal block: newlines are preserved, so each line below it runs as its own command.

The “>” symbol starts a folded block: newlines are folded into spaces, so the lines below it are joined into a single one-line command (handy for a long command with many options, like the mkisofs call above).
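
A quick toy illustration of the same distinction (not from the repo):

  - name: Literal block - each line runs as its own command . . .
    shell: |
      echo "first command"
      echo "second command"

  - name: Folded block - the lines are joined into one command . . .
    shell: >
      echo
      "this all becomes"
      "a single echo command"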

The last part of this simply boots the host from the created ISO on the nginx webserver, and the lights-out management does the rest of the work:

# Can also use the Dell/EMC iDRAC Repo . . .
  - name: Booting once using the custom built iso . . .
    hpilo_boot:
      host: "{{ ilo_ip }}"
      login: admin
      password: "{{ ilo_password }}"
      media: cdrom
      image: http://#.#.#.#/isos/{{ esxi_hostname }}.iso # <- Your webserver url should go here.
    delegate_to: localhost

After this, Ansible will simply wait 16 minutes, then we move on to Part 2: ESXi host configuration in vCenter.
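
That wait is nothing fancy; a minimal sketch of that kind of delay uses the pause module (a wait_for against the new host’s SSH or HTTPS port would be a tighter, less arbitrary alternative):

  - name: Waiting for the automated ESXi installation to complete . . .
    pause:
      minutes: 16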

Part 2: ESXi Host Configuration in vCenter

The rest of this is “the easy part”. With the release of Ansible 2.8, the vmware modules that shipped with it made Single Touch a reality. Now it’s a simple matter of including the vmware modules that do what you need for your standard configuration. I won’t include all of the code here; it’s self-explanatory. But let’s take the first example, vmware_host:

  - name: Adding ESXi host "{{ esxi_hostname }}.yourdomain.here" to vCenter . . .
    vmware_host:
      hostname: "{{ vcenter_hostname }}"
      username: "administrator@vsphere.local"
      password: "{{ vcenter_password }}"
      datacenter_name: "{{ datacenter_name }}"
      cluster_name: "{{ cluster_name }}"
      esxi_hostname: "{{ esxi_hostname }}.yourdomain.here"
      esxi_username: "root"
      esxi_password: "{{ esxi_password }}"
      state: present
      validate_certs: false
    delegate_to: localhost

Here, you have the module, denoted as vmware_host. The rest is the authentication information and required parameters the vmware_host module needs. One pretty common idea in “Ansible-world” is the idea of “state”: state: present means the host will be added to vCenter (and state: absent would remove it).

The vCenter and ESXi portion of the playbook:

  1. Adds the host to vCenter, into the specified cluster.
  2. Adds the license key to the host.
  3. Adds vmnic1 to Standard vSwitch0.
  4. Changes some Advanced Settings, including the syslog loghost.
  5. Restarts syslog (required to save the syslog settings).
  6. Sets Power Management to “High Performance”.
  7. Adds a VMkernel port group for the vMotion interface.
  8. Adds a vMotion VMkernel port with the proper IP address.
  9. Configures NTP, starts the NTP service, and sets it to start at boot (sketched below).
  10. Adds vmnic2 and vmnic3 to the vDS.
  11. Stops the ESXi Shell service and sets it to disabled at boot (idempotent).
  12. Stops the SSH service and sets it to disabled at boot (idempotent).
  13. Takes the host out of Maintenance Mode (idempotent; also sketched below).
And that’s it! Easy-peasy, right? Questions? Hit me up on twitter @RussianLitGuy or email me at bryansullins@thinkingoutcloud.org. I would love to hear from you.
