Working with Terraform Modules: Why and How They Can Keep You DRY
If you spent a whole day trying to talk without repeating yourself, you'd have failed by your second "a" without even realising. It's pretty hard, right? Thankfully, the concept of Don't Repeat Yourself (DRY) only applies to Terraform here, but it can be just as hard without a few key tips.
Parent and Child Modules #
One of the important concepts that can be relatively hard to understand in Terraform is how modules themselves work.
It's a slightly messy relationship, but it does capture the lack of interaction between child modules, and how the parent is essentially responsible for everything. Here's a quick set of snippets to show how they interact.
Root Module - main.tf
module "web-tier" {
  source        = "./modules/web-tier"
  instance_type = var.instance_type
}
The instance_type variable is what will need to be defined in our root module's variables file.
Root Module - variables.tf
variable "instance_type" {
  type        = string
  description = "Enter the Instance type you want to use"
}
So… How do we then use this in the web-tier module? With another variable of course!
web-tier module - variables.tf
variable "instance_type" {
}
Now finally we can use this!
web-tier module - main.tf
resource "aws_instance" "test_machine" {
  # ami and other required arguments omitted for brevity
  instance_type = var.instance_type
}
Okay… but what if we want to use this in another module? With an output!
web-tier module - outputs.tf
output "instance_public_ip" {
  value = aws_instance.test_machine.public_ip
}
You can then reference this in the root module with module.web-tier.instance_public_ip.
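As a quick sketch, surfacing that value from the root module might look like this (the output name here is illustrative, not from the original module):

```hcl
# Root module - outputs.tf
# Re-exports the child module's output so it appears in `terraform output`.
output "web_public_ip" {
  value = module.web-tier.instance_public_ip
}
```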
Sounds convoluted, right? Yes, but notice that in that whole walkthrough you didn't hard-code a single value. Take this a few steps further and you reach a point where your main.tf is no longer a huge pile of spaghetti code: you pass in a handful of simple variables rather than configuring everything from scratch. Here's another quick example from my very early EC2-based module.
module "web-tier" {
  depends_on = [module.network]
  source     = "./modules/web-tier"

  deploy_region  = var.deploy_region
  architecture   = var.architecture
  ubuntu_version = var.ubuntu_version
  instance_count = var.instance_count
  instance_type  = var.instance_type
  vpc_id         = module.network.vpc_id
  subnet_id      = module.network.subnet_id
  default_tags   = var.default_tags
  volume_size    = var.volume_size
}
That looks a heck of a lot nicer than the underlying module code, which is itself still relatively DRY:
locals {
  env_name = "lh-lab"
  env_type = "prod"
}

resource "aws_instance" "web_tier_ec2" {
  depends_on                  = [aws_security_group.ec2]
  count                       = var.instance_count
  instance_type               = var.instance_type
  vpc_security_group_ids      = [aws_security_group.ec2.id]
  subnet_id                   = var.subnet_id[count.index]
  associate_public_ip_address = true
  ami                         = data.aws_ami.ubuntu.id

  tags = merge(
    var.default_tags,
    {
      Name = "lh-lab-web-${count.index}"
    }
  )

  root_block_device {
    delete_on_termination = true
    encrypted             = true
    kms_key_id            = data.aws_ebs_default_kms_key.current.key_arn
    volume_size           = var.volume_size
    volume_type           = "gp3"

    tags = merge(
      var.default_tags,
      {
        Name = "lh-lab-web-${count.index}-disk-${count.index}"
      }
    )
  }
}
There are a few more data sources in that module that you won't see referenced in that snippet, but the point stands: the more DRY your configuration is, the simpler it is to make changes going forward. A little pain now means less later, and vice versa. Additionally, modules help limit your blast radius.
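For reference, one of those data sources, the data.aws_ami.ubuntu lookup behind the ami argument, might look something like the sketch below. The filter pattern and the use of var.ubuntu_version and var.architecture are my assumptions, not the original code; the owner ID is Canonical's well-known AWS account.

```hcl
# Hypothetical sketch of the AMI lookup; the filter values are assumptions.
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/*${var.ubuntu_version}-${var.architecture}-server-*"]
  }
}
```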
Blast Radius #
Blast radius is an often under-discussed and misunderstood concept in Terraform, partly because you have to reach a certain scale before it really becomes a problem. Funnily enough, there's a tool of the same name that is great for visualising this.
In theory, any changes that you make in any given module could affect the functionality of resources in the whole module, but shouldn’t affect anything outside of that. In practice, people make modules that don’t really fit the concept of logical abstraction.
In the example Terraform code I've given, the requirement for logical abstraction isn't that strong; it's just a single VM on a single account with a single cloud provider.
But what if I wanted to use the same code, or build the same infrastructure, across multiple accounts with that provider? Well, you can! All you'd have to do is add a variable for the account number and away you go. That's so much better than copying and pasting more code into multiple files and inevitably losing track of which changes went to which environment, right? That is really where considering blast radius comes in.
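One hedged way to sketch that account variable is via the AWS provider's assume_role block; the variable name and role name below are illustrative, not from the original code:

```hcl
# Hypothetical multi-account setup: deploy into whichever account is passed in.
variable "account_id" {
  type        = string
  description = "AWS account to deploy into"
}

provider "aws" {
  region = var.deploy_region

  assume_role {
    # "terraform-deploy" is an assumed role name for illustration.
    role_arn = "arn:aws:iam::${var.account_id}:role/terraform-deploy"
  }
}
```

Swap the account_id in a .tfvars file and the same module tree targets a different account.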
Additionally, what if I had an environment in Azure and GCP as well? Naturally, it'd be different Terraform code, but having the Azure code in one module and the GCP code in another would not only make for much better readability, it would let you modify each cloud provider's IaC without having to worry about the others. If you're using something like Terraform Cloud or Enterprise, you can also lock down access, meaning that in theory only your database engineers can work on the DB modules, security engineers on the security module, and so on.
Re-usability #
This is half about DRY config and half about modules. It doesn't matter if you have a module for every possible situation: having to go back into each one to modify hard-coded values is plain inefficient and somewhat defeats the point. Likewise, if you have some great DRY config but many different customer requirements, you'll have to chop and change it too much.
Having a good combination of both is really handy. In the examples above, I only have to change a few values when required: the volume type and name for whichever deployment we're using. This is the real draw of both DRY config as a whole and the use of modules; being able to re-use code this quickly will save you ample time.
Before I knew IaC at a decent level, I remember one company I worked with was a SaaS provider that gave customers their own dedicated EC2 instance to log in to. While deployment of the instance itself was automatic for any standard configuration, many customers needed more or less CPU and RAM, or needed more than one instance. Since the company was using a proprietary script, they had to go into that script, find the variables, and change them directly. They didn't even have the variables defined in one convenient spot!
With the config above, all I'd have to do is change var.instance_type and var.instance_count. In theory, you could even create a self-service portal that passes the values straight into a .tfvars file. That's the beauty of it all: a customer setup goes from taking hours, to minutes, to effectively zero.
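As a minimal sketch, such a generated per-customer variables file could be as small as this (the filename and values are illustrative):

```hcl
# customer-a.tfvars - the only per-customer values that ever change.
instance_type  = "t3.large"
instance_count = 2
```

Applying it is then just `terraform apply -var-file="customer-a.tfvars"`.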