Introduction

GitHub Actions is the go-to CI/CD tool for many teams. But when your organization runs thousands of pipelines daily, the default setup breaks down. You hit limits on scale, security, and governance — plus skyrocketing costs.

GitHub-hosted runners are easy to adopt but expensive, and they don’t meet strict compliance needs. Existing self-hosted solutions like Actions Runner Controller (ARC) or Terraform EC2 modules don’t fully solve multi-tenant isolation, automation, or centralized control.

ForgeMT, built inside Cisco’s Security Business Group, fills that gap. It’s an open-source AWS-native platform that manages ephemeral runners with strong tenant isolation, full automation, and enterprise-grade governance.

This article explains why ForgeMT matters and how it works, with a practical look at building scalable, secure GitHub Actions runner platforms.


Why Enterprise CI/CD Runners Fail at Scale

At large organizations, scaling GitHub Actions runners encounters four key bottlenecks:

In short, enterprises fail to scale GitHub Actions runners without a platform that:

But beware—over-centralization can kill flexibility and introduce new challenges.


Why GitHub Actions — And Why It’s Not Enough at Enterprise Scale

GitHub Actions is popular because it offers:

However, GitHub Actions alone can’t meet enterprise-scale demands. Enterprises require:

Cloud providers like AWS supply the identity, networking, and automation building blocks (IAM/OIDC, VPC segmentation, EC2, EKS) needed to build secure, scalable, multi-tenant CI/CD platforms.


Existing Solutions and Why They Fall Short

Actions Runner Controller (ARC) runs ephemeral Kubernetes pods as GitHub runners, scaling dynamically with declarative config and Kubernetes-native integration. But:

Terraform AWS GitHub Runner Module provisions EC2 self-hosted runners with customizable AMIs, integrating well with IaC pipelines. However:

Commercial Runner-as-a-Service options offer simple UX, automatic scaling, and vendor-managed maintenance with SLAs, but:


Where ForgeMT Fits In

ForgeMT combines the best of these approaches to deliver an enterprise-ready platform:

ForgeMT doesn’t reinvent ARC or EC2 modules but extends them with:


Architecture Overview

At its core, ForgeMT is a centralized control plane that orchestrates ephemeral runner provisioning and lifecycle management across multiple tenants running on both EC2 and Kubernetes.

Key Components


ForgeMT Control Plane

The control plane is the platform’s brain — managing runner provisioning, lifecycle, security, scaling, and observability.

  1. Centralized Orchestration: Decides when and where to spin up ephemeral runners (EC2 or Kubernetes pods).
  2. Multi-Tenant Isolation: Isolates each tenant via dedicated AWS accounts or Kubernetes namespaces, IAM roles, and network policies.
  3. Security Enforcement: Applies hardened runner configurations, automates ephemeral credential rotation, and enforces least privilege.
  4. Scaling & Optimization: Integrates with Karpenter and EC2 autoscaling to scale runners up/down with demand and cost awareness.
  5. Observability & Governance: Streams logs and metrics to Splunk; provides audit trails and compliance dashboards.
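
To make the "when and where" decision in item 1 concrete, here is a minimal sketch of label-based routing between EC2 and Kubernetes runners. It is not ForgeMT's actual code: the `RunnerSpec` type, the `pick_runner_spec` function, and the label sets are hypothetical names chosen for illustration, and the real control plane may match jobs differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunnerSpec:
    """Hypothetical description of one runner flavour a tenant exposes."""
    name: str          # e.g. "small" or "k8s"
    backend: str       # "ec2" or "arc" (Kubernetes pods)
    labels: frozenset  # labels this spec is able to serve


def pick_runner_spec(job_labels, specs):
    """Return the first spec whose labels cover every label the job asks for."""
    wanted = frozenset(label.lower() for label in job_labels)
    for spec in specs:
        if wanted <= spec.labels:  # subset test: all requested labels are offered
            return spec
    return None  # no match: the job is left for another runner group


# Assumed example specs; the label names are illustrative only.
SPECS = [
    RunnerSpec("small", "ec2", frozenset({"self-hosted", "linux", "small"})),
    RunnerSpec("k8s", "arc", frozenset({"self-hosted", "linux", "k8s"})),
]

if __name__ == "__main__":
    spec = pick_runner_spec(["self-hosted", "k8s"], SPECS)
    print(spec.backend if spec else "no matching runner")
```

Keeping the match a plain subset test means a job never lands on a runner type it did not ask for, which is one simple way to keep routing predictable.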

Runner Types and Usage

Tenant Isolation

Each ForgeMT tenant deployment is dedicated to one tenant in one region. IAM roles, policies, VPCs, and services are scoped exclusively to that tenant-region pair. This hard boundary prevents cross-tenant access, simplifies compliance, and minimizes blast radius.
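
One way to picture a tenant-region boundary is an IAM policy that only allows actions on resources carrying the tenant's tag inside the tenant's region. The sketch below is an assumption-laden illustration, not ForgeMT's real policy: the `tenant_scope` helper, the `forge-` name prefix, and the `ForgeTenant` tag key are hypothetical. The condition keys `aws:RequestedRegion` and `aws:ResourceTag/<key>` are standard AWS global condition keys.

```python
import json


def tenant_scope(tenant: str, region: str) -> dict:
    """Build a name prefix and an IAM policy limited to one tenant-region pair."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "TenantRegionOnly",
                "Effect": "Allow",
                "Action": [
                    "ec2:StartInstances",
                    "ec2:StopInstances",
                    "ec2:TerminateInstances",
                ],
                "Resource": "*",
                "Condition": {
                    "StringEquals": {
                        # Requests outside the tenant's region are not allowed.
                        "aws:RequestedRegion": region,
                        # Hypothetical tag key marking which tenant owns a resource.
                        "aws:ResourceTag/ForgeTenant": tenant,
                    }
                },
            }
        ],
    }
    return {"name_prefix": f"forge-{tenant}-{region}", "policy": policy}


if __name__ == "__main__":
    scope = tenant_scope("team-a", "eu-west-1")
    print(scope["name_prefix"])
    print(json.dumps(scope["policy"], indent=2))
```

Because both conditions must hold, a role built this way cannot touch another tenant's instances or the same tenant's instances in a different region.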

EC2 Runners

EKS Runners

Warm Pools and Limits

ForgeMT supports warm pools of pre-initialized runners to minimize cold start latency—especially beneficial for EC2 runners with slower boot times.

Per-tenant limits enforce:

These controls prevent resource abuse and keep costs predictable.
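
The interaction between a warm pool and a per-tenant cap can be sketched as a small calculation: launch enough runners to cover queued jobs plus the desired idle pool, but never exceed the tenant's ceiling. This is a simplified illustration under stated assumptions, not ForgeMT's scaling logic. `max_instances` mirrors the field of the same name in the tenant config; `pool_size` and the `runners_to_launch` function are hypothetical.

```python
def runners_to_launch(queued_jobs: int, idle: int, busy: int,
                      pool_size: int, max_instances: int) -> int:
    """How many runners to start now, bounded by the tenant's instance cap."""
    # Idle runners we would like: one per queued job plus the warm pool.
    wanted_idle = queued_jobs + pool_size
    shortfall = max(0, wanted_idle - idle)
    # Room left under the per-tenant cap (idle and busy runners both count).
    headroom = max(0, max_instances - (idle + busy))
    return min(shortfall, headroom)


if __name__ == "__main__":
    # Empty tenant with a warm pool of 2 and a cap of 10 instances.
    print(runners_to_launch(queued_jobs=0, idle=0, busy=0,
                            pool_size=2, max_instances=10))
```

When the cap is already reached, the function returns 0 regardless of queue depth, so queued jobs wait instead of pushing the tenant over its limit.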


Tenant Onboarding

Deploying a new tenant is straightforward and fully automated via a single declarative config file, for example:

gh_config:
  ghes_url: ''
  ghes_org: cisco-open
tenant:
  iam_roles_to_assume:
    - arn:aws:iam::123456789012:role/role_for_forge_runners
  ecr_registries:
    - 123456789012.dkr.ecr.eu-west-1.amazonaws.com
ec2_runner_specs:
  small:
    ami_name: forge-gh-runner-v*
    ami_owner: '123456789012'
    ami_kms_key_arn: ''
    max_instances: 1
    instance_types:
      - t2.small
      - t2.medium
      - t2.large
      - t3.small
      - t3.medium
      - t3.large
    pool_config: []
    volume:
      size: 200
      iops: 3000
      throughput: 125
      type: gp3
  large:
    ami_name: forge-gh-runner-v*
    ami_owner: '123456789012'
    ami_kms_key_arn: ''
    max_instances: 1
    instance_types:
      - c6i.8xlarge
      - c5.9xlarge
      - c5.12xlarge
      - c6i.12xlarge
      - c6i.16xlarge
    pool_config: []
    volume:
      size: 200
      iops: 3000
      throughput: 125
      type: gp3
arc_runner_specs:
  dind:
    runner_size:
      max_runners: 100
      min_runners: 1
    scale_set_name: dependabot
    scale_set_type: dind
    container_actions_runner: 123456789012.dkr.ecr.eu-west-1.amazonaws.com/actions-runner:latest
    container_requests_cpu: 500m
    container_requests_memory: 1Gi
    container_limits_cpu: '1'
    container_limits_memory: 2Gi
    volume_requests_storage_type: gp2
    volume_requests_storage_size: 10Gi
  k8s:
    runner_size:
      max_runners: 100
      min_runners: 1
    scale_set_name: k8s
    scale_set_type: k8s
    container_actions_runner: 123456789012.dkr.ecr.eu-west-1.amazonaws.com/actions-runner:latest
    container_requests_cpu: 500m
    container_requests_memory: 1Gi
    container_limits_cpu: '1'
    container_limits_memory: 2Gi
    volume_requests_storage_type: gp2
    volume_requests_storage_size: 10Gi
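
Before a config like this is applied, a platform would typically run some sanity checks on it. The sketch below shows what such a check might look like. The field names come from the sample config, but the `validate_tenant_config` function and the specific rules are assumptions for illustration, not ForgeMT's actual validation. The config is written as a Python dict (a trimmed copy of the sample) to keep the example free of third-party YAML parsers.

```python
def validate_tenant_config(cfg: dict) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []

    # Assumed rule: an organization must be named.
    if not cfg.get("gh_config", {}).get("ghes_org"):
        errors.append("gh_config.ghes_org is required")

    # Assumed rules for EC2 specs: a positive cap and at least one instance type.
    for name, spec in cfg.get("ec2_runner_specs", {}).items():
        if spec.get("max_instances", 0) < 1:
            errors.append(f"ec2_runner_specs.{name}.max_instances must be >= 1")
        if not spec.get("instance_types"):
            errors.append(f"ec2_runner_specs.{name}.instance_types must not be empty")

    # Assumed rule for ARC specs: 0 <= min_runners <= max_runners.
    for name, spec in cfg.get("arc_runner_specs", {}).items():
        size = spec.get("runner_size", {})
        low = size.get("min_runners", 0)
        high = size.get("max_runners", 0)
        if low < 0 or high < low:
            errors.append(
                f"arc_runner_specs.{name}.runner_size needs "
                "0 <= min_runners <= max_runners"
            )
    return errors


# Trimmed copy of the sample tenant config, expressed as a dict.
SAMPLE = {
    "gh_config": {"ghes_url": "", "ghes_org": "cisco-open"},
    "ec2_runner_specs": {
        "small": {"max_instances": 1, "instance_types": ["t3.small", "t3.medium"]},
    },
    "arc_runner_specs": {
        "k8s": {"runner_size": {"max_runners": 100, "min_runners": 1}},
    },
}

if __name__ == "__main__":
    problems = validate_tenant_config(SAMPLE)
    print("config OK" if not problems else "\n".join(problems))
```

Returning every problem at once, instead of stopping at the first, lets a tenant fix the whole file in one pass.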


The ForgeMT platform uses this config to:

This automation enables zero-touch onboarding with no manual AWS or GitHub setup required by the tenant.


Extensibility

ForgeMT lets tenants customize their environments and control runner access:

This lets each team tune cost, security, and performance independently without affecting core platform stability.


Security Model

ForgeMT’s foundation is strong isolation and ephemeral execution to reduce risk:


Debugging in a Secure, Ephemeral World

Ephemeral runners mean persistent debugging isn’t possible by design, but ForgeMT offers:


Conclusion

ForgeMT is likely overkill for small teams. Start simple with ephemeral runners (EC2 or ARC), GitHub Actions, and Terraform automation. Only scale up when you hit real pain points. ForgeMT shines in multi-team environments where tenant isolation, governance, and platform automation are mission-critical. For solo teams, it just adds unnecessary complexity.

ForgeMT addresses the major enterprise challenges of running GitHub Actions runners at scale by delivering:

For organizations struggling to scale self-hosted runners securely and efficiently on AWS, ForgeMT provides a battle-tested, transparent platform that combines AWS best practices with developer-friendly automation.


Dive Into the ForgeMT Project

Ideas are cheap — execution is what counts. ForgeMT’s source code is public — check it out:

👉 https://github.com/cisco-open/forge/

⭐️ If you find it useful, don’t forget to drop a star!


🤝 Connect

Let’s connect on LinkedIn and GitHub.