Unraveling GitHub Actions Runner Controller: A Developer’s Journey with Terraform and Kubernetes

Picture this: A typical innovation day at our company, where developers gather to explore, tinker, and push the boundaries of what’s possible. Our group decided to delve into the GitHub Actions Runner Controller, blending it with Terraform magic and the wonders of Kubernetes.

Quick Summary: Streamline GitHub Actions Self-Hosted Runner Deployment with Terraform.

Effortlessly set up and dynamically scale self-hosted runners using Kubernetes and the GitHub Actions Runner Controller. If you already have a Kubernetes cluster, make use of our Terraform modules for seamless hosting:

Runner Controller Module: GitHub Actions Runner Controller

GitHub Actions Runner Module: GitHub Actions Runner

Revolutionize your CI/CD workflows with simplicity and scalability.

The GitHub Actions Runner Controller

Learn about the functionality of the GitHub Actions Runner Controller in the docs.

The core concept is to employ a Kubernetes cluster to execute jobs on self-hosted runners in containers, moving away from virtual machines.

ARC, a K8s controller, facilitates the creation of self-hosted runners on your K8s cluster. With minimal commands, you can establish self-hosted runners capable of scaling based on demand. These runners, being ephemeral and container-based, allow for rapid and clean instantiation. Find out more at ARC Documentation.
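For a sense of what demand-based scaling looks like, ARC exposes a HorizontalRunnerAutoscaler resource that targets a RunnerDeployment. The snippet below is a minimal sketch using the PercentageRunnersBusy metric; the first-runner deployment it refers to is created in the walkthrough that follows.

apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: first-runner-autoscaler
spec:
  scaleTargetRef:
    kind: RunnerDeployment
    name: first-runner
  minReplicas: 1
  maxReplicas: 5
  metrics:
    # scale up when more than 75% of runners are busy, scale down below 25%
    - type: PercentageRunnersBusy
      scaleUpThreshold: "0.75"
      scaleDownThreshold: "0.25"
      scaleUpFactor: "2"
      scaleDownFactor: "0.5"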

Setting Up A Kubernetes Hosted Action Runner

Now, let’s get hands-on with our GitHub Actions Runner Controller exploration. No embellishments, just an unfiltered account of our journey:

Step 1: Getting Started with GitHub Docs

Our GitHub Actions journey commenced by following the straightforward docs on Installing ARC. With a valid GitHub PAT, the process can be broken down into the following snippet:

# install cert manager
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.13.3 \
  --set installCRDs=true

# install actions runner controller
helm repo add actions-runner-controller https://actions-runner-controller.github.io/actions-runner-controller
helm install actions-runner-controller \
  actions-runner-controller/actions-runner-controller \
  --namespace actions-runner-system \
  --create-namespace \
  --set=authSecret.create=true \
  --set=authSecret.github_token="***" \
  --wait

A classic start: first the Helm chart of the GitHub Actions Runner Controller, then a minimal RunnerDeployment for the runner itself.

apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: first-runner
spec:
  replicas: 1
  template:
    spec:
      repository: m4s-b3n/playground

Simple, right? Well, the devil is in the details.

Step 2: Docker Socket Security Revelation

Our first “aha” moment struck when we realized the runner was casually mounting the host’s Docker socket, triggering security red flags everywhere.

apiVersion: v1
kind: Pod
metadata:
  # [...]
spec:
  containers:
  - env:
    # [...]
    volumeMounts:
    - mountPath: /runner
      name: runner
    - mountPath: /runner/_work
      name: work
    - mountPath: /run
      name: var-run
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-97rpn
      readOnly: true  
    # [...]
  - args:
    - dockerd
    - --host=unix:///run/docker.sock
    - --group=$(DOCKER_GROUP_GID)

So we took matters into our own hands and manually eliminated this security risk. Little did we know that setting the container mode to Kubernetes would have handled this for us automatically, a lesson learned the hard way. This mode also requires a workVolumeClaimTemplate to be configured, and for security reasons the runner should run in ephemeral mode.

apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: first-runner
spec:
  replicas: 1
  template:
    spec:
      repository: m4s-b3n/playground
      containerMode: kubernetes
      workVolumeClaimTemplate:
        storageClassName: default
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: "100Mi"
      ephemeral: true

Step 3: Tackling RBAC Quandaries

Next up, RBAC (Role-Based Access Control) complexities emerged. The RBAC for the controller was decent; no need for meddling. But the runner’s RBAC? Not so much. We wanted control, so we decided to use our own service accounts.
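To see exactly what we were dealing with, listing the objects the Helm chart had created was a good starting point (a quick sketch; the exact names depend on the Helm release name and namespace):

# namespaced RBAC objects created alongside the controller
kubectl get serviceaccounts,roles,rolebindings -n actions-runner-system
# cluster-scoped permissions granted to the controller
kubectl get clusterroles,clusterrolebindings | grep actions-runner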

Step 4: A Deep Dive into Custom CRDs

Our RBAC journey led us to custom CRDs (Custom Resource Definitions). The runner’s RBAC could be customized by assigning our self-created service account as part of the RunnerDeployment CRD. Simple enough.

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: runner-service-account
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: runner-role
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "create", "delete"]
- apiGroups: [""]
  resources: ["pods/exec"]
  verbs: ["get", "create"]
- apiGroups: [""]
  resources: ["pods/log"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["batch"]
  resources: ["jobs"]
  verbs: ["get", "list", "create", "delete"]
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get", "list", "create", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: runner-rolebinding
subjects:
- kind: ServiceAccount
  name: runner-service-account
roleRef:
  kind: Role
  name: runner-role
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: first-runner
spec:
  replicas: 1
  template:
    spec:
      repository: m4s-b3n/playground
      containerMode: kubernetes
      workVolumeClaimTemplate:
        storageClassName: default
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: "100Mi"
      ephemeral: true
      serviceAccountName: runner-service-account

However, the runner’s pod creation process was a different beast.

Step 5: The Missing Puzzle Piece

Here’s where the documentation left us hanging: setting RBAC for the pod created to execute the Actions job. The project’s issues became our virtual campfire, and after some crawling, we struck gold: GitHub Actions Runner Controller Issue #2992.
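One way to see the gap is to check which service account the pod spawned for a job actually receives (a hedged sketch; the pod and namespace names are placeholders):

# inspect the service account of the pod that ran the workflow job
kubectl get pod <job-pod-name> -n <runner-namespace> -o jsonpath='{.spec.serviceAccountName}'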

Step 6: PodTemplate to the Rescue

Enter the “PodTemplate.” We found a way to mount a Kubernetes PodTemplate into the runner, utilizing a service account we had created. This ingenious workaround allowed the runner to create pods using our service account, granting us the golden ticket to customize permissions as needed.

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: runner-service-account
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: runner-role
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "create", "delete"]
- apiGroups: [""]
  resources: ["pods/exec"]
  verbs: ["get", "create"]
- apiGroups: [""]
  resources: ["pods/log"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["batch"]
  resources: ["jobs"]
  verbs: ["get", "list", "create", "delete"]
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get", "list", "create", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: runner-rolebinding
subjects:
- kind: ServiceAccount
  name: runner-service-account
roleRef:
  kind: Role
  name: runner-role
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: job-service-account
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: pod-templates
data:
  default.yaml: |
    ---
    apiVersion: v1
    kind: PodTemplate
    metadata:
      name: runner-pod-template
      labels:
        app: runner-pod-template
    spec:
      serviceAccountName: job-service-account
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: first-runner
spec:
  replicas: 1
  template:
    spec:
      repository: m4s-b3n/playground
      containerMode: kubernetes
      workVolumeClaimTemplate:
        storageClassName: default
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: "100Mi"
      ephemeral: true
      serviceAccountName: runner-service-account
      volumes:
        - name: pod-templates
          configMap:
            name: pod-templates
      volumeMounts:
        - name: pod-templates
          mountPath: /home/runner/pod-templates
          readOnly: true

In the end, our adventure wasn’t just about deployment; it was a journey through security revelations, RBAC complexities, and creative problem-solving.

Automation With Terraform

In the realm of modern software development, orchestrating seamless and reproducible workflows is paramount. While manually deploying applications through Kubernetes manifests and interacting with kubectl commands can be a quick way to get things up and running, professionals strive for more. The pursuit of a sustainable, scalable, and consistent deployment process has led us to harness the power of Terraform.

We required four providers…

terraform {
  required_version = ">= 0.13"
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "3.83.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "2.23.0"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "2.11.0"
    }
    kubectl = {
      source  = "gavinbunney/kubectl"
      version = "1.14.0"
    }
  }
}

… and some variables …

variable "location" {
  type        = string
  description = "Azure location for resources"
  default     = "westeurope"
}

variable "prefix" {
  type        = string
  description = "Prefix for all resources"
  default     = "demo-features-arc"
}

variable "cluster_version" {
  type        = string
  description = "The Kubernetes version for our clusters"
  default     = "1.27"
}

variable "cluster_node_size" {
  type        = string
  description = "The Kubernetes node size for our clusters"
  default     = "standard_d2s_v5"
}

variable "k8s_namespace_cert_manager" {
  type        = string
  description = "The Kubernetes namespace for deploying cert-manager"
  default     = "cert-manager"
}

variable "k8s_namespace_controller" {
  type        = string
  description = "The Kubernetes namespace for deploying the actions runner contrtoller"
  default     = "demo-features-arc-controller"
}

variable "k8s_auth_secret_name" {
  type        = string
  description = "The Kubernetes secret name for our GitHub App"
  default     = "github-auth-secret"
}

variable "github_app_id" {
  type        = string
  description = "GitHub App ID"
  sensitive   = true
}

variable "github_app_install_id" {
  type        = string
  description = "GitHub App Install ID"
  sensitive   = true
}

variable "github_app_private_key" {
  type        = string
  description = "GitHub App Private Key"
  sensitive   = true
}

variable "arc_replicas" {
  type        = number
  description = "Number of replicas for the Arc Controller"
  default     = 3
}

variable "runners" {
  type = map(object({
    repo_owner = string
    repo_name  = string
    image      = string
    labels     = list(string)
    min_count  = number
    max_count  = number
    metrics    = list(map(string))
  }))
  description = "List of GitHub repositories to add to the Arc Controller"
  default = {
    demo-features-arc-runner = {
      repo_owner = "xpirit-training"
      repo_name  = "demo-features-arc"
      labels     = ["aks", "k8s"]
      image      = "summerwind/actions-runner:latest"
      min_count  = 2
      max_count  = 5
      metrics = [
        {
          type               = "PercentageRunnersBusy"
          scaleUpThreshold   = "0.75"
          scaleDownThreshold = "0.25"
          scaleUpFactor      = "2"
          scaleDownFactor    = "0.5"
        },
        {
          "type"       = "TotalNumberOfQueuedAndInProgressWorkflowRuns"
          "repository" = "xpirit-training/demo-features-arc"
          "name"       = "total"
        }
      ]
    }
  }
}

variable "default_tags" {
  type        = map(string)
  description = "Value for default tags"
  default = {
    owner = "xpirit-germany"
  }
}

Note that we now authenticate with a GitHub App instead of a personal access token; the app ID, installation ID, and private key are passed in as sensitive variables.
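Since these values are sensitive, we keep them out of the code. One option (a sketch with placeholder values) is to supply them through Terraform’s TF_VAR_ environment variables:

# provide the GitHub App credentials without committing them (placeholder values)
export TF_VAR_github_app_id="123456"
export TF_VAR_github_app_install_id="12345678"
export TF_VAR_github_app_private_key="$(cat arc-github-app.private-key.pem)"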

First, we created a resource group containing a virtual network and an AKS instance.

resource "azurerm_resource_group" "this" {
  name     = "rg-${var.prefix}"
  location = var.location
  tags     = var.default_tags
}

module "network" {
  source  = "Azure/network/azurerm"
  version = "5.3.0"

  resource_group_name = azurerm_resource_group.this.name
  use_for_each        = true

  address_space   = "11.0.0.0/16"
  subnet_prefixes = ["11.0.1.0/24"]
  subnet_names    = ["${var.prefix}-sn"]

  depends_on = [
    azurerm_resource_group.this
  ]

  tags = var.default_tags
}

module "aks" {
  source  = "Azure/aks/azurerm"
  version = "7.4.0"

  prefix              = var.prefix
  cluster_name        = "aks"
  resource_group_name = azurerm_resource_group.this.name

  kubernetes_version        = var.cluster_version
  orchestrator_version      = var.cluster_version
  automatic_channel_upgrade = "patch"

  agents_size      = var.cluster_node_size
  agents_min_count = 1
  agents_max_count = 3
  agents_count     = 1
  os_disk_size_gb  = 100

  vnet_subnet_id = module.network.vnet_subnets[0]

  rbac_aad                          = true
  rbac_aad_managed                  = true
  role_based_access_control_enabled = true

  tags = var.default_tags

  depends_on = [
    module.network
  ]
}

To deploy into the cluster, we needed the helm, kubernetes, and kubectl providers to be set up correctly.

provider "kubernetes" {
  host                   = module.aks.admin_host
  username               = module.aks.admin_username
  password               = module.aks.admin_password
  client_certificate     = base64decode(module.aks.admin_client_certificate)
  client_key             = base64decode(module.aks.admin_client_key)
  cluster_ca_certificate = base64decode(module.aks.admin_cluster_ca_certificate)
}

provider "helm" {
  burst_limit = 300
  kubernetes {
    host                   = module.aks.admin_host
    username               = module.aks.admin_username
    password               = module.aks.admin_password
    client_certificate     = base64decode(module.aks.admin_client_certificate)
    client_key             = base64decode(module.aks.admin_client_key)
    cluster_ca_certificate = base64decode(module.aks.admin_cluster_ca_certificate)
  }
}

provider "kubectl" {
  host                   = module.aks.admin_host
  username               = module.aks.admin_username
  password               = module.aks.admin_password
  client_certificate     = base64decode(module.aks.admin_client_certificate)
  client_key             = base64decode(module.aks.admin_client_key)
  cluster_ca_certificate = base64decode(module.aks.admin_cluster_ca_certificate)
  load_config_file       = false
  apply_retry_count      = 10
}

With the help of the Kubernetes and Helm providers, we were able to deploy cert-manager into the cluster:

resource "kubernetes_namespace" "namespace_cert_manager" {
  metadata {
    name = var.k8s_namespace_cert_manager
  }
  depends_on = [module.aks]
}

resource "helm_release" "cert_manager" {
  name       = "cert-manager"
  repository = "https://charts.jetstack.io"
  chart      = "cert-manager"
  version    = "v1.13.1"
  namespace  = var.k8s_namespace_cert_manager

  #atomic  = true
  timeout = 600

  set {
    name  = "installCRDs"
    value = "true"
  }

  depends_on = [
    kubernetes_namespace.namespace_cert_manager
  ]
}

Now, we had everything set up to deploy the controller and the runner using our new modules.

module "runner_controller" {
  source  = "infinite-automations/github-actions-runner-controller/helm"
  version = "1.0.0"

  namespace        = var.k8s_namespace_controller
  create_namespace = true

  github_app_id          = var.github_app_id
  github_app_install_id  = var.github_app_install_id
  github_app_private_key = var.github_app_private_key

  depends_on = [
    module.aks,
    helm_release.cert_manager
  ]
}

module "runner" {
  source  = "infinite-automations/github-actions-runner/kubectl"
  version = "1.0.1"

  for_each = var.runners

  namespace        = each.key
  create_namespace = true

  repo_owner = each.value.repo_owner
  repo_name  = each.value.repo_name
  labels     = each.value.labels

  runner_image = each.value.image

  min_count = each.value.min_count
  max_count = each.value.max_count
  metrics   = each.value.metrics

  depends_on = [
    module.aks,
    module.runner_controller
  ]
}

Assuming you already have your infrastructure up and running, these last few lines are all you need to set up a working runner on Kubernetes.
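From there, the usual Terraform workflow rolls everything out:

terraform init               # download providers and modules
terraform plan -out=tfplan   # preview the changes
terraform apply tfplan       # create the cluster, controller, and runners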

Highlighting our commitment to community collaboration, we proudly share the Terraform modules we developed during this endeavor. These modules, now available on the Terraform Registry, empower others to replicate and extend our setup effortlessly.

Open-sourcing our work is not just a commitment but an invitation for the community to build, iterate, and innovate collectively.

Conclusion

As we wrap up this exploration, we reflect on the synergy between GitHub Actions Runner Controller, Kubernetes, and Terraform. The Innovation Day project not only unveiled the power of these tools but also exemplified how collaborative efforts can lead to innovative solutions. Dive into the world of GitHub Actions, Kubernetes, and Terraform with our comprehensive guide, and unlock a new realm of possibilities for your CI/CD workflows.

Links

GitHub Actions Runner Controller: https://registry.terraform.io/modules/infinite-automations/github-actions-runner-controller/helm/latest

GitHub Actions Runner: https://registry.terraform.io/modules/infinite-automations/github-actions-runner/kubectl/latest
