Managing Infrastructure with Octopus and Terraform

31 May 2017

Terraform is an awesome tool. It gives fantastic control over infrastructure and massively helps avoid any issues when migrating apps across environments, as you know exactly what infrastructure will be deployed.

However, as my Terraform config has grown arms and legs I’ve had to restructure (several times). The project I’m currently working on has a few distinct groups of infrastructure that will be deployed with varying cadences. These are (I’m on Azure, so I’m using Azure resource names):

1. Core infrastructure elements that will change very rarely, if at all. e.g. Resource Group, Storage Containers.
2. Network specific infrastructure that will change infrequently. e.g. Virtual Network Gateway, Network Security Rules.
3. Application specific infrastructure that will change frequently. e.g. Virtual Machines.

It’s fairly obvious, but worth pointing out, that each infrastructure area depends on those with a slower cadence: 3 relies on 2 and 1, and 2 relies on 1 (you can’t deploy a VM without setting up the network first).

I’m still cutting my teeth with Terraform and every day brings new challenges so the details I outline here could well be out of date shortly after I publish! I will try to keep this article up to date while maintaining the history.

Final note before I get into the guts of the article is that I’m using Terraform 0.9.3.

State Files

A lot of the “best practice” and “lessons learned” articles I read (e.g. here, here and here) advocate using a separate folder per environment to facilitate having separate state files per environment. This saves accidentally running a terraform apply in Production rather than Test (not good!).

As well as having separate state files per environment, I want a separate state file per infrastructure area (1, 2 and 3 above). Having separate state files per area means that I can make use of Terraform’s remote state data sources to share output values between areas - but not between environments. E.g.

data "terraform_remote_state" "network" {
  backend = "consul"
  config {
    path = "state/dev/network.tfstate"
  }
}

This is an important distinction as I do want to be able to query the ID of a NIC created in the Network area when creating a VM. However, I definitely don’t want to be able to query the ID of a dev NIC from the QA environment.
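
To make the NIC example concrete, here is a minimal sketch in Terraform 0.9-style syntax of how one area might publish a value and another consume it. The output name nic_id, the resource name azurerm_network_interface.app, the module input and the relative module path are all hypothetical, chosen only for illustration; the data source is the one declared above.

```hcl
# Network area, output.tf: publish the NIC id so it ends up in
# state/dev/network.tfstate. The resource it points at is assumed to be
# declared elsewhere in the network config (hypothetical name).
output "nic_id" {
  value = "${azurerm_network_interface.app.id}"
}

# App area, main.tf: read the value back through the
# terraform_remote_state data source named "network" shown above.
# In Terraform 0.9 outputs are attributes directly on the data source.
module "vm" {
  source = "../../../modules/vm" # hypothetical path to the vm module
  nic_id = "${data.terraform_remote_state.network.nic_id}"
}
```

Because the state path contains the environment name, the app area can only ever see the network outputs of the environment whose path it was given.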

It is also necessary due to the way I’ve split up the resources. If I run (2) and it’s using the same state file as (1) then it’ll want to destroy everything in (1) as it won’t find those Terraform config files.

Organisation

I played with having a folder per environment (e.g. dev, QA, and production) but I found it got unwieldy pretty quickly, with a lot of duplication. Ultimately all that differs between the environments are Terraform variable values. For example VM size - use a ‘Standard A7’ in production but a ‘Standard A2’ in QA.
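
As a rough sketch of what "only the variable values differ" might look like, assuming a hypothetical variable called vm_size and assuming Standard_A2 and Standard_A7 are the Azure identifiers for the sizes mentioned above:

```hcl
# variables.tf - declared once and shared by every environment.
variable "vm_size" {
  description = "Azure VM size for the application servers"
}

# terraform.tfvars - a default value committed to Git. Whatever deploys the
# config replaces this per environment, e.g. "Standard_A7" for production.
vm_size = "Standard_A2"
```

The resource declarations themselves stay identical across environments; only the value on the last line changes.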

We already use Octopus Deploy for application deployment and I thought it would be a good fit for deploying Terraform infrastructure changes, as it is good at managing environment-specific configuration (those Terraform variable values that differ per environment). I’ll give an overview of how I’ve accomplished this, as I’ve found Octopus and Terraform work quite well together.

The project is contained within a single Git repository. An example of the folder structure is:

modules/
  network/
    - variables.tf
    - output.tf
    - securitygroup.tf
    - ip.tf
  storage/
  vm/
  ../
src/
 UK South/
   core/
    - backend.tf
    - main.tf
    - output.tf
    - terraform.tfvars
    - variables.tf
   environment/
   app/
 EU West/
   core/
   environment/
   app/

modules/

The modules folder contains standard Terraform modules. In general each module contains:

variables.tf - variables required by this module
output.tf - any values output by this module
{resourceType}.tf - resource declarations. May be multiple per module. E.g. network.tf, disk.tf, vm.tf.

Having consistent files within each module makes it a lot easier to grok what is going on. I don’t have hard and fast rules about when to split modules down into smaller modules; I just do it when it feels correct or it makes sense.
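
A minimal sketch of that three-file layout, using the network module as the example. The variable names, the "main" resource label and the output name are assumptions for illustration rather than the actual module contents:

```hcl
# modules/network/variables.tf - inputs the module needs from its caller.
variable "name" {}
variable "location" {}
variable "resource_group_name" {}

# modules/network/securitygroup.tf - one resource type per file.
resource "azurerm_network_security_group" "main" {
  name                = "${var.name}-nsg"
  location            = "${var.location}"
  resource_group_name = "${var.resource_group_name}"
}

# modules/network/output.tf - values handed back to the caller.
output "security_group_id" {
  value = "${azurerm_network_security_group.main.id}"
}
```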

src/

As well as modules I have what I call “Entry Points”. These are essentially the glue that stitches together multiple modules to form the actual infrastructure that is required.

Similar to how we structure code repositories, all the juicy stuff is within an ‘src’ folder. ‘src’ then contains region-specific folders, as we have infrastructure deployed to various regions in the world. Regions could be managed in the same way environments are managed, via config, but having distinct entry points per region is more flexible should we need to modify region-specific infrastructure.

Within each region folder is a folder per entry point, and these match the different levels of release cadence described above.

Within each entry point I have the same general files:

backend.tf - contains definition of backend configuration, where remote-state is to be stored.
variables.tf - any variables required by this ‘entry point’
terraform.tfvars - default values for variables.
output.tf - any values output by this entry point. Queryable via remote-state.
main.tf - the glue - declaration of modules, data sources etc.
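
Put together, an entry point such as core might look roughly like the sketch below. The Consul address, the state path, the module inputs and the account_name output are all hypothetical. One thing worth knowing is that a backend block cannot contain interpolation, so its values have to be literal text by the time terraform init runs.

```hcl
# backend.tf - where this entry point's state is stored.
# No "${...}" interpolation is allowed in here, so per-environment values
# must be written into the file before init.
terraform {
  backend "consul" {
    address = "consul.example.com:8500" # hypothetical address
    path    = "state/dev/core.tfstate"  # hypothetical path
  }
}

# main.tf - the glue: call modules and pass variables through.
module "storage" {
  source              = "../../../modules/storage"
  location            = "${var.location}"
  resource_group_name = "${var.resource_group_name}"
}

# output.tf - anything other entry points need to read via remote state.
output "storage_account_name" {
  value = "${module.storage.account_name}" # assumes the module exposes this
}
```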

Octopus Deploy

Disclaimer: I don’t work for Octopus Deploy and they’re not paying me to write this - I just found it handy for managing configuration files!

Overview

A very high level overview of Octopus Deploy, which can be skipped if you’re familiar with it, is:

A single unit in Octopus is a project.
A project has a lifecycle, which defines the environments it can be deployed to (Dev - QA - Production).
A project has a “process” which is a series of steps (Setup DB user, deploy web application, create log folder etc).
Each project can have variables, variables can be scoped to environment(s) and/or step(s).
When you want to deploy something you create a project release.
The unit of work for a release is an “Octopack” (which is a NuGet package).
This unit of work is set in stone as it moves across environments, all that changes between environments is the variable values.

In the deployment pipeline Octopus sits alongside the CI server, in my case TeamCity. The CI server creates and uploads the Octopack and then creates a release whenever code is pushed to the master branch (and the build is successful, obviously!).

Why Octopus?

As well as the configuration management aspect mentioned above, we are a small team so using Terraform in conjunction with Octopus allows for much greater transparency in what is happening. As all releases can be tracked back to a Git commit/branch it is easy to see how the infrastructure environment has changed over time, and exactly what infrastructure is in each environment. If in the future process dictates that only users above level X or in role Y can change the Production infrastructure, e.g. for compliance or financial reasons, then we can have the deployment process defined in Octopus and all that needs to happen is X or Y log in and hit the “Promote to Production” button.

Octopus makes managing variables across environments very easy, meaning across environments I can easily alter things like:

Credentials. You can have completely separate credentials for different environments.
Backend configuration. This can ensure you use a different Consul instance or S3 bucket per environment.
Variable values. I can iterate through any Octopus variables and update values in the terraform.tfvars file.

This means that I can have a single folder, per entry point, containing my Terraform config files rather than multiple environment specific folders per entry point.

Variables

A convention I’ve followed is for all Octopus variables to be named the same as those they are replacing in Terraform. For example, if I had the following terraform.tfvars file:

disk_name = "vm_data_disk"

location = "North Europe"

disk_size_GB = "128"

Then I would have those same 3 variables located in Octopus: disk_name, location and disk_size_GB.

I wrote a simple Octopus step template to iterate through all vars in a tfvars file and replace the value with any found in Octopus. I’ll create a PR to the Octopus community step repository once I’m fully happy with it.

Octopus has the concept of sensitive variables which is ideal for storing stuff like Azure credentials.

Storing the variables in Octopus, and outside of Git, means that non-developers can plug values in too, as it is done via a web UI. This can be handy for stuff like service account credentials or subnet ranges: members of other teams can specify the values themselves.

Process

I’ve created an Octopus project per entry point and each follows the same general flow. When creating the Octopack NuGet package I add all configuration files, including modules and entry points within the Git repo. The process is then:

Set credentials. Values are read from Octopus variables and either exported as environment variables or used to populate a provider block.
Download and extract Octopack. This leaves me with the folder structure sitting on disk which mimics the Git repository.
Download Terraform executable. This is downloaded as a package to allow me to have different projects on different versions of Terraform (not required yet but gives flexibility moving forward).
Update terraform.tfvars with any values from Octopus. At this point I will have variable values specific to the environment I am deploying to.
Update backend.tf with any values from Octopus. This will ensure that I’m using the correct remote state for current environment.
terraform init -backend=true -get=true -force-copy -lock=true -input=false. -input=false will cause a failure should any variables be missing (Octopus isn’t interactive!)
terraform plan -out=project-name.plan. I save the plan for use in the later apply step.
Manual review. This is to review what changes will be made. The job is paused until a user chooses to proceed or stop.
terraform apply project-name.plan.
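
For the "Set credentials" step, the provider route might look something like this minimal sketch. The file name provider.tf is hypothetical, and the four variables are assumed to be filled in from Octopus sensitive variables via the tfvars update step rather than committed to Git:

```hcl
# variables.tf - credentials arrive as variables; no defaults are committed.
variable "subscription_id" {}
variable "client_id" {}
variable "client_secret" {}
variable "tenant_id" {}

# provider.tf (hypothetical file name) - hand the values to the Azure provider.
provider "azurerm" {
  subscription_id = "${var.subscription_id}"
  client_id       = "${var.client_id}"
  client_secret   = "${var.client_secret}"
  tenant_id       = "${var.tenant_id}"
}
```

The environment variable route avoids writing secrets to disk at all: the azurerm provider reads ARM_SUBSCRIPTION_ID, ARM_CLIENT_ID, ARM_CLIENT_SECRET and ARM_TENANT_ID when the corresponding arguments are left out of the provider block.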

Still Outstanding

I’m at a stage where this is all new infrastructure and it’s fairly well bedded in on Staging so next step is Production. However, in the future I will need to make updates and things will go wrong so I will need a way to leverage the rest of Terraform, namely taint and destroy.

To be honest this is where the grand plan may come tumbling down, I’ll need to get my thinking cap on…