Create a Kubernetes cluster with Azure AKS using Terraform

Introduction

This Infrastructure as Code (IaC) workshop shows how to create an AKS cluster using HashiCorp Terraform.

Azure Kubernetes Service (AKS) is a highly available, secure, and fully managed Kubernetes service of Microsoft Azure.

The fully managed Azure Kubernetes Service (AKS) makes deploying and managing containerized applications easy. It offers serverless Kubernetes, an integrated continuous integration and continuous delivery (CI/CD) experience, and enterprise-grade security and governance. Unite your development and operations teams on a single platform to rapidly build, deliver, and scale applications with confidence.

Terraform is a tool for building, changing, and versioning infrastructure safely and efficiently. Terraform can manage existing and popular service providers as well as custom in-house solutions.

Admittedly, there are a number of cloud provisioning IaC tools, each with its own implementation. For the purposes of this lab, we will focus exclusively on deploying an AKS cluster on Azure with Terraform.

Objectives

This guide walks you through the following tasks:

  1. ✅ Learn HCL (HashiCorp Configuration Language) and Terraform best practices by doing
  2. ✅ Use Terraform and AKS to create a Kubernetes cluster
  3. ✅ Use the kubectl tool to test the availability of a Kubernetes cluster

Assumptions and Prerequisites

  • You have basic knowledge of Azure
  • You have basic knowledge of Kubernetes
  • You have Terraform installed on your local machine
  • You have an Azure subscription: sign up for an Azure account if you don’t own one already. You will receive USD 200 in free credits.

Software Dependencies

Why use Terraform (or any other IaC tool) to create an AKS cluster?

Using the Azure Portal, you can create a cluster with a few clicks.
However, it is usually a better idea to keep the configuration for your cluster under version control. If you accidentally delete your cluster or decide to provision a copy in another region, you can easily replicate the same configuration. And if you’re working as part of a team, source control gives you peace of mind: you know precisely why changes occurred and who made them.

⭐️ Infrastructure as Code, in simple terms, is a means by which we write declarative definitions of the infrastructure we want to exist and hand them to a provisioning tool that deals with the actual deployment.
This means that you can code what you want built, provide the necessary credentials for the given provider, kick off the provisioning process, pop the kettle on and come back to find all your services purring along nicely in the cloud… or a terminal screen full of ominous warnings about failed deployment and “unrecoverable state” and a deep sense of growing unease (but not often, don’t worry 👍).

Using Terraform to create AKS cluster

Define AKS cluster with Terraform

The first step is to obtain the source code from my GitHub repository.
The following commands clone the sample repository and make it the current directory:

git clone https://github.com/AymenSegni/azure-aks-k8s-tf.git
cd azure-aks-k8s-tf

The directory that holds the Terraform configuration files for this lab has a specific tree structure.
There are 2 main subfolders: deployment and modules. To see the source code structure, you can run the tree command.

[Figure: project structure, as printed by tree]

1- modules: the Terraform modules (reusable building blocks) of this layout. In this lab, we have 4 modules:
  • aks_cluster: the main unit, providing the AKS service
  • aks_identities: the cluster identity unit, which manages the cluster service principal
  • aks_network: creates the cluster virtual network and subnet on Azure
  • log_analytics: Log Analytics (formerly Azure Operational Insights), the unit that manages logs and cluster health checks

2- deployment: the root of this layout, responsible for deploying the AKS Kubernetes cluster on Azure.
In main.tf we call the Terraform modules already created in the /modules subfolder, with the appropriate inputs defined in variables.tf or in a terraform.tfvars file (which is not covered in this guide).
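Although terraform.tfvars is outside the scope of this guide, a minimal sketch may help show where input values can live. The variable names below are the ones the root deployment references later in this guide. The resource group name, cluster name, Kubernetes version, and node count mirror the command output shown at the end of the lab. The region, DNS prefix, and VM size are assumptions chosen purely for illustration.

```hcl
# terraform.tfvars (illustrative sketch, not part of the lab repository)
# Terraform loads this file automatically when it sits next to the root module.

resource_group_name = "runitoncloud" # as used with az aks get-credentials below
cluster_name        = "runItOnCloud" # as used with az aks get-credentials below
location            = "westeurope"   # assumption: any Azure region works
dns_prefix          = "runitoncloud" # assumption
kubernetes_version  = "1.16.10"      # matches the kubectl output shown later
node_count          = 2              # matches the two nodes shown later
vm_size             = "Standard_D2s_v3" # assumption: pick a size available in your region
```

Any root variable that has no default and is not set here still has to be supplied another way, for example through a TF_VAR_ environment variable. Secrets such as the service principal password are better kept out of this file.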


Stay tuned: in the next section, we’re going to talk about how to create reusable infrastructure with Terraform.

Terraform modules

With Terraform, you can put a bunch of code inside of a Terraform module and reuse that module in multiple places throughout your code. Instead of having the same code copy/pasted in the staging and production environments, you’ll be able to have both environments reuse code from the same module.
This is a big deal. Modules are the key ingredient to writing reusable, maintainable, and testable Terraform code.
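As a rough illustration of that reuse, the sketch below calls the lab's log_analytics module twice from one configuration, once per environment. The module path and input names come from the deployment code shown later in this guide. The module labels, resource group names, workspace names, region, and SKU are assumptions made up for the example, and both resource groups are assumed to exist already.

```hcl
# Hypothetical staging environment: same module, staging-specific inputs.
module "log_analytics_staging" {
  source                           = "../modules/log_analytics"
  resource_group_name              = "aks-staging-rg"   # assumed name
  log_analytics_workspace_location = "westeurope"       # assumed region
  log_analytics_workspace_name     = "aks-staging-logs" # assumed name
  log_analytics_workspace_sku      = "PerGB2018"        # assumed SKU
}

# Hypothetical production environment: no copy/paste of the module's
# resources, only a second call with different inputs.
module "log_analytics_production" {
  source                           = "../modules/log_analytics"
  resource_group_name              = "aks-production-rg"   # assumed name
  log_analytics_workspace_location = "westeurope"          # assumed region
  log_analytics_workspace_name     = "aks-production-logs" # assumed name
  log_analytics_workspace_sku      = "PerGB2018"           # assumed SKU
}
```

A fix made inside the module then reaches both environments on their next terraform apply.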

Every Terraform configuration has at least one module, known as its root module (the /deployment in this lab context), which consists of the resources defined in the .tf files in the main working directory.

Terraform module structure

In this lab we have a well defined structure of the TF Modules. Let’s go through the aks_identities module as an example:

[Figure: aks_identities module tree]

The module is a container for multiple resources that are used together.

  1. main.tf: the module's resources are packaged in the main.tf file
  2. variables.tf: in Terraform, modules can have input parameters. To define them, you use input variables.
  3. output.tf: in Terraform, a module can also return values. This is done using output values.
Now you know all the secrets of Terraform modules! 😊
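To make the three-file layout concrete, here is a compact sketch of what a small module such as aks_network might contain. The module's real source is not reproduced in this article, so treat it as an assumption-heavy illustration. Only the input names and the aks_subnet_id output are taken from the deployment code later in this guide. The resource labels and the string types are guesses. The address_prefixes argument assumes a recent 2.x release of the azurerm provider (older releases used the singular address_prefix).

```hcl
# --- variables.tf: the module's input parameters ---
variable "vnet_name" {
  type = string
}
variable "subnet_name" {
  type = string
}
variable "resource_group_name" {
  type = string
}
variable "location" {
  type = string
}
variable "address_space" {
  type = string # assumption: a single CIDR such as "10.1.0.0/16"
}
variable "subnet_cidr" {
  type = string # assumption: a CIDR inside address_space
}

# --- main.tf: the resources the module packages together ---
resource "azurerm_virtual_network" "aks_vnet" {
  name                = var.vnet_name
  address_space       = [var.address_space]
  location            = var.location
  resource_group_name = var.resource_group_name
}

resource "azurerm_subnet" "aks_subnet" {
  name                 = var.subnet_name
  resource_group_name  = var.resource_group_name
  virtual_network_name = azurerm_virtual_network.aks_vnet.name
  address_prefixes     = [var.subnet_cidr]
}

# --- output.tf: the value returned to the caller ---
output "aks_subnet_id" {
  value = azurerm_subnet.aks_subnet.id
}
```

The subnet refers to the virtual network by attribute, so Terraform creates the network first without any explicit ordering.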

At this stage, we will look together at a full example of the aks_cluster module.

  • main resources: defined in the main.tf configuration file
resource "azurerm_kubernetes_cluster" "cluster" {
  name                = var.cluster_name
  location            = var.location
  resource_group_name = var.resource_group_name
  dns_prefix          = var.dns_prefix
  kubernetes_version  = var.kubernetes_version

  default_node_pool {
    name            = var.default_pool_name
    node_count      = var.node_count
    vm_size         = var.vm_size
    os_disk_size_gb = var.os_disk_size_gb
    vnet_subnet_id  = var.vnet_subnet_id
    max_pods        = var.max_pods
    type            = var.default_pool_type

    enable_auto_scaling = true
    min_count           = var.min_count
    max_count           = var.max_count

    tags = merge(
      {
        "environment" = "runitoncloud"
      },
      {
        "aadssh" = "True"
      },
    )
  }


  network_profile {
    network_plugin     = var.network_plugin
    network_policy     = "calico"
    service_cidr       = var.service_cidr
    dns_service_ip     = "10.0.0.10"
    docker_bridge_cidr = "172.17.0.1/16"
  }

  service_principal {
    client_id     = var.client_id
    client_secret = var.client_secret
  }


  tags = {
    Environment = "Development"
  }

  lifecycle {
    prevent_destroy = true
  }
}

resource "azurerm_monitor_diagnostic_setting" "aks_cluster" {
  name                       = "${azurerm_kubernetes_cluster.cluster.name}-audit"
  target_resource_id         = azurerm_kubernetes_cluster.cluster.id
  log_analytics_workspace_id = var.diagnostics_workspace_id

  log {
    category = "kube-apiserver"
    enabled  = true

    retention_policy {
      enabled = false
    }
  }

  log {
    category = "kube-controller-manager"
    enabled  = true

    retention_policy {
      enabled = false
    }
  }

  log {
    category = "cluster-autoscaler"
    enabled  = true

    retention_policy {
      enabled = false
    }
  }

  log {
    category = "kube-scheduler"
    enabled  = true

    retention_policy {
      enabled = false
    }
  }

  log {
    category = "kube-audit"
    enabled  = true

    retention_policy {
      enabled = false
    }
  }

  metric {
    category = "AllMetrics"
    enabled  = false

    retention_policy {
      enabled = false
    }
  }
}

Terraform resources

resource "azurerm_kubernetes_cluster" "cluster" {}
This block is responsible for creating the AKS cluster

Resources are the most important element in the Terraform language. Each resource block describes one or more infrastructure objects, such as virtual networks, compute instances, or higher-level components such as DNS records.

  • Variables: define the resources inputs
variable "dns_prefix" {
  description = "DNS prefix"
}
variable "location" {
  description = "azure location to deploy resources"
}
variable "cluster_name" {
  description = "AKS cluster name"
}
variable "resource_group_name" {
  description = "name of the resource group to deploy AKS cluster in"
}
variable "kubernetes_version" {
  description = "version of the kubernetes cluster"
}
variable "api_server_authorized_ip_ranges" {
  description = "ip ranges to lock down access to kubernetes api server"
  default     = "0.0.0.0/0"
}
# Node Pool config
variable "agent_pool_name" {
  description = "name for the agent pool profile"
  default     = "default"
}
variable "agent_pool_type" {
  description = "type of the agent pool (AvailabilitySet and VirtualMachineScaleSets)"
  default     = "AvailabilitySet"
}
variable "node_count" {
  description = "number of nodes to deploy"
}
variable "vm_size" {
  description = "size/type of VM to use for nodes"
}
variable "os_disk_size_gb" {
  description = "size of the OS disk to attach to the nodes"
}
variable "vnet_subnet_id" {
  description = "subnet id where the nodes will be deployed"
}
variable "max_pods" {
  description = "maximum number of pods that can run on a single node"
}

#Network Profile config
variable "network_plugin" {
  description = "network plugin for kubernetes networking (azure or kubenet)"
  default     = "azure"
}
variable "service_cidr" {
  description = "kubernetes internal service cidr range"
  default     = "10.0.0.0/16"
}
variable "diagnostics_workspace_id" {
  description = "log analytics workspace id for cluster audit"
}
variable "min_count" {
  default     = 1
  description = "Minimum Node Count"
}
variable "max_count" {
  default     = 5
  description = "Maximum Node Count"
}
variable "default_pool_name" {
  description = "name for the agent pool profile"
  default     = "default"
}
variable "default_pool_type" {
  description = "type of the agent pool (AvailabilitySet and VirtualMachineScaleSets)"
  default     = "VirtualMachineScaleSets"
}
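variable "client_id" {
  description = "client (application) id of the cluster service principal"
}
variable "client_secret" {
  description = "client secret of the cluster service principal"
}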

Terraform variables

variable "cluster_name" {
  description = "AKS cluster name"
  default     = "run-it-on-cloud"
}

Input variables serve as parameters for a Terraform module, allowing aspects of the module to be customized without altering the module’s own source code, and allowing modules to be shared between different configurations.
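The variables in this lab are declared without types, which Terraform accepts. As a hedged sketch of a slightly stricter style, the declarations below add an explicit type so that a wrong kind of value is rejected at plan time. The variable names come from the module above; the default values are assumptions for illustration only.

```hcl
variable "node_count" {
  description = "number of nodes to deploy"
  type        = number
  default     = 2 # assumed default, not taken from the lab code
}

variable "vm_size" {
  description = "size/type of VM to use for nodes"
  type        = string
  default     = "Standard_D2s_v3" # assumed default, not taken from the lab code
}
```

A caller can still override either default, for example in the module block, in a .tfvars file, or through a TF_VAR_node_count environment variable at the root level.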

  • Outputs: define the resources outputs
output "azurerm_kubernetes_cluster_id" {
  value = azurerm_kubernetes_cluster.cluster.id
}

output "azurerm_kubernetes_cluster_fqdn" {
  value = azurerm_kubernetes_cluster.cluster.fqdn
}

output "azurerm_kubernetes_cluster_node_resource_group" {
  value = azurerm_kubernetes_cluster.cluster.node_resource_group
}

Terraform output

output "azurerm_kubernetes_cluster_id" {
  value = azurerm_kubernetes_cluster.cluster.id
}

Output values are like the return values of a Terraform module.
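Continuing the return-value analogy, a caller reads a module's output as module.<label>.<output name>. The sketch below assumes it lives in the root deployment, next to a module block labelled aks_cluster like the one used in this lab. The output name cluster_fqdn is hypothetical.

```hcl
# Re-export a value returned by the aks_cluster module from the root module.
output "cluster_fqdn" {
  description = "FQDN of the AKS API server, as returned by the aks_cluster module"
  value       = module.aks_cluster.azurerm_kubernetes_cluster_fqdn
}
```

After a successful apply, terraform output cluster_fqdn prints the value.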

Now that we have seen the main components of a Terraform module, it is a good time to call this module, much like a function, from the root deployment, along with all the other Terraform modules that define the AKS cluster:

# Cluster Resource Group
resource "azurerm_resource_group" "aks" {
  name     = var.resource_group_name
  location = var.location
}

# AKS Cluster Network
module "aks_network" {
  source              = "../modules/aks_network"
  subnet_name         = var.subnet_name
  vnet_name           = var.vnet_name
  resource_group_name = azurerm_resource_group.aks.name
  subnet_cidr         = var.subnet_cidr
  location            = var.location
  address_space       = var.address_space
}

# AKS IDs
module "aks_identities" {
  source       = "../modules/aks_identities"
  cluster_name = var.cluster_name
}

# AKS Log Analytics
module "log_analytics" {
  source                           = "../modules/log_analytics"
  resource_group_name              = azurerm_resource_group.aks.name
  log_analytics_workspace_location = var.log_analytics_workspace_location
  log_analytics_workspace_name     = var.log_analytics_workspace_name
  log_analytics_workspace_sku      = var.log_analytics_workspace_sku
}

# AKS Cluster
module "aks_cluster" {
  source                   = "../modules/aks-cluster"
  cluster_name             = var.cluster_name
  location                 = var.location
  dns_prefix               = var.dns_prefix
  resource_group_name      = azurerm_resource_group.aks.name
  kubernetes_version       = var.kubernetes_version
  node_count               = var.node_count
  min_count                = var.min_count
  max_count                = var.max_count
  os_disk_size_gb          = "1028"
  max_pods                 = "110"
  vm_size                  = var.vm_size
  vnet_subnet_id           = module.aks_network.aks_subnet_id
  client_id                = module.aks_identities.cluster_client_id
  client_secret            = module.aks_identities.cluster_sp_secret
  diagnostics_workspace_id = module.log_analytics.azurerm_log_analytics_workspace
}

Define the cluster deployment


– As you can see, we start building the cluster by defining a cluster resource group. You can learn more about resource groups in the Azure documentation.
– Next, we create the AKS cluster network (the cluster VNet and subnet) using the network module already defined in modules.
– The next block is for the AKS identities. You can learn more about service principals in the Azure documentation.
– The Log Analytics block then sets up the workspace that receives the cluster logs and health checks.
– Now everything is ready to create the AKS cluster, which consumes the outputs of the previous modules.

The root module (deployment) also contains two other Terraform files:

  • provider.tf: While resources are the primary construct in the Terraform language, the behaviors of resources rely on their associated resource types, and these types are defined by providers.
    You can learn more about providers in the Terraform documentation.
    The project's provider configuration is shown below.
terraform {
  backend "azurerm" {}
}

provider "azurerm" {
  version = "~> 2.4"
  features {}
}

A provider configuration is created using a provider block:

provider "azurerm" {
  version = "~> 2.4"
  features {}
}

The name given in the block header (azurerm, the Azure cloud provider) is the name of the provider to configure. Terraform associates each resource type with a provider by taking the first word of the resource type name: azurerm_kubernetes_cluster, for example, belongs to the azurerm provider.

Each time a new provider is added to configuration — either explicitly via a provider block or by adding a resource from that provider — Terraform must initialize the provider before it can be used. Initialization downloads and installs the provider’s plugin so that it can later be executed.

Initialization

Provider initialization is one of the actions of terraform init.
Running this command will download and initialize any providers that are not already initialized.


terraform init
[Figure: terraform init output]

The backend that stores the Terraform state is declared using a terraform block:

terraform {
  backend "azurerm" {}
}

In the second part of this lab, we will set up Azure storage to store the Terraform state.
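The empty backend block relies on what Terraform calls partial configuration: every backend setting is supplied on the terraform init command line. As an alternative sketch, the non-secret settings could be written into the block itself, leaving only the storage account name and access key for the command line. The container name and state key below are the ones used later in this lab. Moving them into the file is a suggestion, not what the repository does, and this block would replace the empty one rather than sit beside it.

```hcl
terraform {
  backend "azurerm" {
    # Non-secret settings can live in version control.
    container_name = "tfstate"
    key            = "codelab.microsoft.tfstate"

    # storage_account_name and access_key are deliberately left out here;
    # pass them at init time with -backend-config so the key is never committed.
  }
}
```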

  • version.tf: The required_version setting can be used to constrain which versions of the Terraform CLI can be used with your configuration.
    Note that all the lab code is written for Terraform 0.12.x.
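The contents of version.tf are not reproduced in this article. A minimal sketch that matches the note above, assuming the intent is to accept any 0.12.x release, could look like this:

```hcl
terraform {
  # "~> 0.12.0" accepts 0.12.0 and later patch releases, but not 0.13.
  required_version = "~> 0.12.0"
}
```

With this constraint in place, running the configuration with a Terraform CLI outside the allowed range stops with an error instead of attempting the deployment.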

Once you’ve declared all your resources, defined your necessary variables and provided your credentials, you’re all set to let Terraform do its magic! To check that everything will work and there are no errors, run terraform validate and terraform plan from within the directory.
If all is well and you’re happy with what it plans to build, kick off the process with terraform apply, give one final approval, then wait for your infrastructure to be deployed 🙂

Are you still excited to run your Kubernetes cluster on Cloud? 😬
So, let Terraform do its magic!

[Figure: terraform apply output for the AKS deployment]

Set up Azure storage to store Terraform state

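By default, Terraform tracks state locally in the terraform.tfstate file.
This pattern works well in a single-person environment.
In a multi-person environment, however, the state should live in a shared backend so that everyone works from the same copy. In this lab, Azure Storage is used to hold the Terraform state.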

In this section, we will see how to do the following tasks:

  • Retrieve storage account information (account name and account key)
  • Create a storage container into which Terraform state information will be stored.
  1. In the Azure portal, select All services in the left menu.
  2. Select Storage accounts.
  3. On the Storage accounts tab, select the name of the storage account into which Terraform is to store state. For example, you can use the storage account created when you opened Cloud Shell the first time. The storage account name created by Cloud Shell typically starts with cs followed by a random string of numbers and letters. Take note of the storage account you select. This value is needed later.
  4. On the storage account tab, select Access keys.
  5. Make note of the key1 key value. (Selecting the icon to the right of the key copies the value to the clipboard.)
[Figure: Azure storage account used for the Terraform state, and its access keys]

Using your terminal, create a container in your Azure storage account. Replace the placeholders with appropriate values for your environment.

az storage container create -n tfstate --account-name <YourAzureStorageAccountName> --account-key <YourAzureStorageAccountKey>

Create AKS using Terraform

In this section, you see how to use the terraform init, terraform plan, and terraform apply commands to create the resources defined in the configuration files from the previous sections.

1. In your local terminal, initialize Terraform. Replace the placeholders with appropriate values for your environment:

cd src/deployment
terraform init -backend-config="storage_account_name=<YourAzureStorageAccountName>" -backend-config="container_name=tfstate" -backend-config="access_key=<YourStorageAccountAccessKey>" -backend-config="key=codelab.microsoft.tfstate"

The terraform init command reports that the backend and the provider plug-in were initialized successfully.

2. Export your service principal credentials. Replace the placeholders with appropriate values from your service principal:


export TF_VAR_client_id=<service-principal-appid>

export TF_VAR_client_secret=<service-principal-password>

3. Run the terraform plan command to create the Terraform plan that defines the infrastructure elements.


terraform plan -out out.plan

The terraform plan command displays the resources that will be created when you run the terraform apply command.

[Figure: example terraform plan output for the AKS deployment]

4. Run the terraform apply command to apply the plan to create the Kubernetes cluster. The process to create a Kubernetes cluster can take several minutes.


terraform apply out.plan
[Figure: terraform apply output, and the AKS cluster as it appears in the Azure portal]

Test the Kubernetes AKS cluster

To manage a Kubernetes cluster, you use kubectl, the Kubernetes command-line client. If you use Azure Cloud Shell, kubectl is already installed. To install kubectl locally, use the az aks install-cli command:

az aks install-cli

To configure kubectl to connect to your Kubernetes cluster, use the az aks get-credentials command. This command downloads credentials and configures the Kubernetes CLI to use them. (Make sure to use your own resource group and AKS cluster names.)

az aks get-credentials --resource-group runitoncloud --name runItOnCloud

To verify the connection to your cluster, use the kubectl get nodes command to return a list of the cluster nodes.

You should see the details of your worker nodes, and they should all have the status Ready, as shown in the following example:

kubectl get nodes
NAME                              STATUS   ROLES   AGE   VERSION
aks-default-17028763-vmss000000   Ready    agent   16m   v1.16.10
aks-default-17028763-vmss000001   Ready    agent   16m   v1.16.10

Monitor health and logs

When the AKS cluster was created, monitoring was enabled to capture health metrics for both the cluster nodes and pods. These health metrics are available in the Azure portal. For more information on container health monitoring, see Monitor Azure Kubernetes Service health.

Next steps

If you make changes to your code, running the plan and apply commands again will let Terraform use its knowledge of the deployed resources (.tfstate) to calculate what changes need to be made, whether building or destroying. Finally, when you want to bring down your infrastructure, simply issue a terraform destroy command and down it comes.
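One caveat, based on the aks_cluster module shown earlier: the cluster resource sets prevent_destroy = true, and Terraform rejects any plan that would destroy a resource protected this way. Tearing the lab down therefore most likely requires relaxing that setting first. The fragment below is a sketch of that change only. It is not a complete resource, and the elided arguments stay exactly as they are in the module's main.tf.

```hcl
resource "azurerm_kubernetes_cluster" "cluster" {
  # ... all other arguments and blocks unchanged from the module's main.tf ...

  lifecycle {
    # Set to false (or remove this lifecycle block) before running
    # terraform destroy; restore it to true for clusters you want to protect.
    prevent_destroy = false
  }
}
```

Keeping the guard enabled for long-lived clusters is a reasonable default, since it protects against an accidental destroy or a change that forces the cluster to be replaced.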

The setup described is only the beginning. If you’re provisioning production-grade infrastructure, you should look into:

  • how to structure your Terraform in global and environment-specific layers
  • managing Terraform state and how to work with the rest of your team

That’s all folks!

That’s all for this lab, thanks for reading! Later posts may cover how to secure your AKS deployment, as well as a fully configured cluster with hundreds of microservices deployed in one click!
