
Creating a k8s cluster on GCP with Terraform. Part 1.

Using Terraform to create and maintain a GKE Kubernetes cluster.

Every cloud provider has an interest in locking you into its services and maximizing your adoption of its cloud-native offerings. Services billed per usage, or even per transaction, are a dream for cloud companies: your bill skyrockets as your user base grows, and your application becomes hard to move between providers. But there is a solution…

It is Kubernetes!

You can have your own universe inside Kubernetes, but how do you manage it?

Could the DevOps costs end up exceeding the whole gain of running it on your own?

Here we will try an amazing tool called Terraform, which lets us keep all our infrastructure in code and automate the creation of k8s clusters in any cloud provider and on-prem.

First of all, let’s create a GKE Kubernetes cluster in GCP.

There are two options to do this.

The first option is to go through the GCP web interface. You can select the cluster type, name, parameters and networks, and click Create. It has a lot of settings, and you may not know or remember which ones to use.

The second option is to use Terraform. Let’s go this route.

Requirements:

  • Google Cloud SDK already installed on your laptop/desktop. Use the installation manual.
  • A GCP account with admin/extended permissions.
  • Credits or billing enabled in the GCP account, since all these operations are billable.
  • A GCP project; let’s name it devops-example-project.

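Most of the commands, including terraform itself, need a local login to GCP:

gcloud auth login
gcloud config set project devops-example-project
# The Terraform Google provider reads Application Default Credentials
gcloud auth application-default login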

The project with billing enabled should already exist; this walkthrough assumes it was created in the GCP web interface. If you are a startup, do not forget to apply for free GCP credits and use them.

Project IDs are globally unique, so finding a good one is a challenge.
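Before trying candidate IDs in the console, a quick local format check can save a failed attempt. Project IDs are 6 to 30 characters long, use only lowercase letters, digits and hyphens, start with a letter and do not end with a hyphen. The sketch below checks only that format; the function name is_valid_project_id is hypothetical, and it cannot tell whether an ID is already taken by someone else.

```shell
# Hypothetical helper: checks only the format of a GCP project ID.
# It does NOT check whether the ID is already in use.
is_valid_project_id() {
  # letter, then 4..28 letters/digits/hyphens, then a letter or digit
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z][a-z0-9-]{4,28}[a-z0-9]$'
}

if is_valid_project_id "devops-example-project"; then
  echo "format ok"
else
  echo "invalid format"
fi
```

An ID that passes this check can still be rejected as already taken when the project is created.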

Source Code

The idea behind DevOps is to have source code describing the infrastructure, not readme/wiki files.

Let’s create the project structure that we will store in GitHub.

GitHub was selected because it has two major features:

  • Action Secrets
  • Workflows

We will use both of these features later. You are free to choose any other code repository that supports both, but the workflow syntax will be provided only for GitHub.

terraform
    cluster
       .terraform-version
       backend.tf
       main.tf
       outputs.tf
       variables.tf
       terraform.tfvars
    ingress
       .terraform-version
       backend.tf
       cert-manager.tf
       external-ip.tf
       ingress.tf
       main.tf
       outputs.tf
       variables.tf
       versions.tf
       terraform.tfvars
.gitignore
README.md

Here is an example project; you can structure yours in any other way. My decision was to separate the cluster creation from other cluster-ops.

Why do we need a cluster creation folder?

  • Portable: if tomorrow I would like to move it to another project, this config preserves all settings and parameters for the k8s cluster.
  • Placeholder: this example project is simple, but real-world configs are more complex.
  • Specific: the cluster folder is specific to the cloud provider, whereas the other folders are reusable between providers.

Service Account

The cluster that Terraform creates needs a service account for its nodes, and that account must have certain permissions.

Let’s create the service account and assign permissions:

PROJECT=devops-example-project
GKE_NODES_SA=devops-gke-nodes

gcloud iam service-accounts create $GKE_NODES_SA \
  --project=$PROJECT \
  --description="GKE node pool service account" \
  --display-name="GKE Nodes SA"

gcloud projects add-iam-policy-binding $PROJECT \
  --member="serviceAccount:$GKE_NODES_SA@$PROJECT.iam.gserviceaccount.com" \
  --role="roles/logging.logWriter"

gcloud projects add-iam-policy-binding $PROJECT \
  --member="serviceAccount:$GKE_NODES_SA@$PROJECT.iam.gserviceaccount.com" \
  --role="roles/monitoring.metricWriter"

gcloud projects add-iam-policy-binding $PROJECT \
  --member="serviceAccount:$GKE_NODES_SA@$PROJECT.iam.gserviceaccount.com" \
  --role="roles/monitoring.viewer"

gcloud projects add-iam-policy-binding $PROJECT \
  --member="serviceAccount:$GKE_NODES_SA@$PROJECT.iam.gserviceaccount.com" \
  --role="roles/container.nodeServiceAgent"

gcloud projects add-iam-policy-binding $PROJECT \
  --member="serviceAccount:$GKE_NODES_SA@$PROJECT.iam.gserviceaccount.com" \
  --role="roles/artifactregistry.reader"

gcloud projects add-iam-policy-binding $PROJECT \
  --member="serviceAccount:$GKE_NODES_SA@$PROJECT.iam.gserviceaccount.com" \
  --role="roles/storage.objectViewer"

gcloud projects add-iam-policy-binding $PROJECT \
  --member="serviceAccount:$GKE_NODES_SA@$PROJECT.iam.gserviceaccount.com" \
  --role="roles/servicemanagement.serviceController"

# Granted at project level: "gcloud secrets add-iam-policy-binding"
# expects a secret name as its argument, not a project ID.
gcloud projects add-iam-policy-binding $PROJECT \
  --member="serviceAccount:$GKE_NODES_SA@$PROJECT.iam.gserviceaccount.com" \
  --role="roles/secretmanager.secretAccessor"

This step is specific to GCP. For AWS, Azure and on-prem setups it could be different.
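The repeated bindings above could also be collapsed into a loop. The following is a minimal sketch, not the way the commands must be run: the bind_roles function and the DRY_RUN switch are hypothetical names introduced here. It covers the logging, monitoring, registry, storage and service-controller roles from the commands above and leaves the Secret Manager binding out. By default it only prints the commands so they can be reviewed first.

```shell
PROJECT=devops-example-project
GKE_NODES_SA=devops-gke-nodes
MEMBER="serviceAccount:$GKE_NODES_SA@$PROJECT.iam.gserviceaccount.com"

# Role names copied from the commands above (without the "roles/" prefix).
ROLES="logging.logWriter monitoring.metricWriter monitoring.viewer
container.nodeServiceAgent artifactregistry.reader storage.objectViewer
servicemanagement.serviceController"

# Hypothetical helper. DRY_RUN=1 (the default in this sketch) only prints
# the commands; DRY_RUN=0 actually calls gcloud.
bind_roles() {
  for role in $ROLES; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "gcloud projects add-iam-policy-binding $PROJECT --member=$MEMBER --role=roles/$role"
    else
      gcloud projects add-iam-policy-binding "$PROJECT" \
        --member="$MEMBER" --role="roles/$role"
    fi
  done
}

bind_roles
```

Running it with DRY_RUN=0 bind_roles would apply the bindings for real, assuming gcloud is installed and logged in.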

Cluster

Let’s build the cluster first. Its configuration lives in the ‘cluster’ folder of our source code.

For the terraform version I selected the latest one.

cd terraform/cluster
cat > .terraform-version <<'EOF'
1.12.1
EOF

For the backend we use a GCS bucket, but you can use any provider-specific backend.

cat > backend.tf <<'EOF'
terraform {
  backend "gcs" {
    bucket = "devops-terraform-state" # You must create this bucket first
    prefix = "devops-cluster/cluster"
  }
}
EOF

Let’s create the GCP bucket:

REGION=us-central1
gsutil mb -p $PROJECT -c STANDARD -l $REGION gs://devops-terraform-state

The Terraform state file for the cluster creation will be stored under:

gs://devops-terraform-state/devops-cluster/cluster

Make it unique for your project and cluster. A common mistake is to copy-paste backend.tf and not change the bucket/path. I selected devops-cluster as a unique name for the first k8s cluster in GKE, with a matching prefix in the bucket.
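Terraform does not allow variables inside a backend block, so one way to avoid the copy-paste mistake is to generate backend.tf from the cluster name. This is only a sketch; the render_backend helper and its argument order are hypothetical and not part of the example repository.

```shell
# Hypothetical helper: prints a backend.tf whose prefix is derived from
# the cluster name and the folder (component) it belongs to.
render_backend() {
  bucket=$1 cluster=$2 component=$3
  cat <<EOF
terraform {
  backend "gcs" {
    bucket = "$bucket"
    prefix = "$cluster/$component"
  }
}
EOF
}

# Print the backend for the 'cluster' folder of devops-cluster.
render_backend devops-terraform-state devops-cluster cluster
```

Redirecting the output to backend.tf in each folder keeps the prefix tied to the cluster and folder names. An alternative with the same goal is to pass the values at init time with terraform init -backend-config="prefix=...".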

To protect against concurrent operations, Terraform stores a lock in this bucket. If a stale lock is left behind, for example after an interrupted run, you can release it with this command:

terraform force-unlock LOCK_ID

Let’s define variables that we can configure for our cluster:

cat > variables.tf <<'EOF'
variable "project_id" {
  type    = string
  default = "devops-example-project"
}

variable "region" {
  type    = string
  default = "us-central1"
}

variable "region_with_zone" {
  type    = string
  default = "us-central1-a"
}

variable "cluster_name" {
  type    = string
  default = "devops-cluster"
}
EOF

You can also do this differently by splitting this file in two:

cat > variables.tf <<'EOF'
variable "project_id" {
  type = string
}

variable "region" {
  type = string
}

variable "region_with_zone" {
  type = string
}

variable "cluster_name" {
  type = string
}
EOF

cat > terraform.tfvars <<'EOF'
project_id       = "devops-example-project"
region           = "us-central1"
region_with_zone = "us-central1-a"
cluster_name     = "devops-cluster"
EOF

The second approach is the better practice because it keeps configurable values outside of the Terraform code. During deployment you can replace terraform.tfvars with a custom file specific to the environment: dev, staging or prod. Such a file can also carry secret variables like keys and passwords.
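As a sketch of the per-environment idea, the variable file can be chosen with Terraform’s -var-file option. The env/ folder, the file names and the plan_command helper below are assumptions made for illustration; they are not part of the project structure shown earlier.

```shell
# Hypothetical helper: prints the plan command for one environment,
# assuming one variable file per environment under env/.
plan_command() {
  env=$1
  case $env in
    dev|staging|prod) ;;
    *) echo "unknown environment: $env" >&2; return 1 ;;
  esac
  echo "terraform plan -var-file=env/$env.tfvars"
}

plan_command dev
```

The same option works for terraform apply, so a pipeline could pass the environment name as its only parameter.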

I deliberately separated region and region_with_zone because Artifact Registry works at the region level, while other deployment resources, like GKE, can be specified with a region and a zone.

Now we are ready for the main.tf file that creates the GKE cluster.

cat > main.tf <<'EOF'
provider "google" {
  project = var.project_id
  region  = var.region
}

resource "google_container_cluster" "primary" {
  name     = var.cluster_name
  location = var.region_with_zone

  remove_default_node_pool = true
  initial_node_count       = 1

  network    = "default"
  subnetwork = "default"

  ip_allocation_policy {}

  lifecycle {
    ignore_changes = [
      initial_node_count,
      enable_autopilot,
      enable_tpu,
      enable_intranode_visibility,
      datapath_provider,
    ]
  }
}

resource "google_container_node_pool" "primary_nodes" {
  name     = "primary-pool"
  location = var.region_with_zone
  cluster  = google_container_cluster.primary.name

  autoscaling {
    min_node_count = 1
    max_node_count = 10
  }

  node_config {
    machine_type    = "e2-standard-4"
    service_account = "devops-gke-nodes@${var.project_id}.iam.gserviceaccount.com"

    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]

    labels = {
      workload = "primary"
    }

    tags = ["primary"]
  }
}
EOF

We are creating the cluster in a specific zone with autoscaling enabled from 1 to 10 nodes, starting with 1 node, each with 4 vCPUs and 16 GB of memory. The lifecycle block tells Terraform to ignore later changes to a few cluster attributes, and we replace the default node pool with a custom one.
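To get a feeling for what the autoscaling range means for capacity (and the bill), here is a small arithmetic sketch. The variable names are made up for this example; the numbers are the raw e2-standard-4 figures from the paragraph above, and the capacity that pods can actually use is somewhat lower because GKE reserves part of each node for system components.

```shell
VCPUS_PER_NODE=4      # e2-standard-4
MEM_GB_PER_NODE=16    # e2-standard-4
MIN_NODES=1           # autoscaling min_node_count
MAX_NODES=10          # autoscaling max_node_count

# Raw capacity range of the node pool between its smallest and largest size.
echo "vCPUs:  $((MIN_NODES * VCPUS_PER_NODE))..$((MAX_NODES * VCPUS_PER_NODE))"
echo "memory: $((MIN_NODES * MEM_GB_PER_NODE))..$((MAX_NODES * MEM_GB_PER_NODE)) GB"
```

So the pool can grow from 4 vCPUs and 16 GB up to 40 vCPUs and 160 GB of raw capacity before the autoscaler stops adding nodes.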

And finally, we need outputs.tf to see the created resources.

output "kubernetes_cluster_name" {
  value = google_container_cluster.primary.name
}

output "kubernetes_cluster_endpoint" {
  value = google_container_cluster.primary.endpoint
}

After writing all those configuration files we can apply the changes through Terraform:

cd terraform/cluster
terraform init
terraform plan
terraform apply

It will take several minutes to allocate the nodes, install Kubernetes and complete all registrations.

Summary

We created a source code project in GitHub, on the main branch, that stores the Terraform configuration files for creating the k8s cluster in GCP.

Next is Part 2.
