An Elegant, Automated Infrastructure Factory

Every Platform Engineer has a dream: to build a perfectly working machine where developers can simply create the infrastructure they need without causing any trouble or urgent work for the platform team. This infrastructure should comply with security policies, be consistent, and be observable.

My initial goal was to build this dream: create a Proof of Concept (PoC) of such a platform, document my findings in an article, and publish it. I also wanted to learn Backstage, a modern open-source project for building developer portals, because it seemed well documented and straightforward to try.

The premise was an elegant architecture to solve the “WET” (Write Everything Twice) problem. As teams grow, many organizations watch Infrastructure as Code (IaC) degrade into a mess of duplicated Terraform modules, where even a simple change means copying and pasting by hand, with a high chance of error.

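My solution was a marriage of Developer Experience and Platform Engineering, combining two key parts: a Backstage developer portal as the self-service frontend, and Terragrunt as the DRY orchestration layer behind it.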

Here is what it was supposed to look like:

Then, I tried to build it. And that's when the article I intended to write died, and this one was born.

Backstage is a Framework, Not a Product

My plan was scoped with three main goals:

But this "simple" plan turned out to be incredibly complicated. Every small step had a dozen other tricky steps hidden inside it. Before writing a single line of a Scaffolder template, I had to debug the Node.js backend, wrestle with authentication providers, build a Docker container, and get Backstage itself running.

I stopped. The article I wanted to write was about the synergy between a developer portal and Terragrunt. Instead, I found myself on a long and frustrating detour toward becoming a full-time Backstage administrator.

Backstage has a lot of promise. It can be a single, central hub for all of a company's engineering tools, and you can customize it to do anything. But the implementation is a full software project that requires a dedicated team of engineers with skills in React, TypeScript, and Node.js. Its Total Cost of Ownership (TCO) is not the license fee (it's free), but the significant and ongoing investment in dedicated engineering talent.

Why Terragrunt Still Wins

Even though the portal part didn't work out, the automated backend design is still the right approach. The complexity of the portal does not invalidate the need for a DRY orchestration layer.

Regardless of whether requests come from a homegrown Backstage instance, a managed IDP, or even a CI/CD job, Terragrunt is the key to solving the WET problem and ensuring long-term maintainability.

The Key Terragrunt Patterns:

Here’s a glimpse of the elegance. Instead of a 100-line .tf file, the output of the automation is this small, declarative terragrunt.hcl file:

# terragrunt.hcl for a specific service's RDS database

include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::ssh://git@github.com/your-org/terraform-modules.git//aws/rds?ref=v1.2.1"
}

inputs = {
  instance_class    = "db.t3.micro"
  allocated_storage = 20
  db_name           = "my-app-database"
  environment       = "staging"
}
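
The include block above pulls in a shared parent configuration, which is where the settings common to every service live. As a rough sketch of what such a root file might contain (the bucket, lock table, and region are placeholders, not values from this project):

```hcl
# Root terragrunt.hcl (hypothetical), located by find_in_parent_folders()
# from every child configuration below it in the directory tree.

# Generate backend.tf for each child so state settings are written once.
remote_state {
  backend = "s3"
  generate = {
    path      = "backend.tf"
    if_exists = "overwrite_terragrunt"
  }
  config = {
    bucket         = "your-org-terraform-state" # placeholder bucket name
    key            = "${path_relative_to_include()}/terraform.tfstate"
    region         = "eu-west-1"                # placeholder region
    encrypt        = true
    dynamodb_table = "terraform-locks"          # placeholder lock table
  }
}

# Generate provider.tf for each child so provider settings are shared too.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
provider "aws" {
  region = "eu-west-1"
}
EOF
}
```

Because the state key is derived from path_relative_to_include(), each service gets its own state file without repeating any backend configuration. Recent Terragrunt releases recommend naming this file root.hcl and calling find_in_parent_folders("root.hcl") explicitly, so the exact file name may depend on the version in use.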

This is the fundamental trade-off: a template that generates raw .tf files is faster on Day 1 but creates a nightmare of maintenance debt on Day 2. The Terragrunt approach creates a contract, not a copy. Moving the central rds module to v1.2.2 means bumping the ref in each service's terragrunt.hcl, a one-line change that is easy to automate, and then rolling it out across hundreds of services in a controlled fashion using terragrunt run --all. That is the power of a true factory.
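
One way to make that version bump even cheaper is to keep the module version in a single shared file that every service reads. The sketch below is an assumption about how this could be arranged, not the setup used in the PoC: the versions.hcl file, the rds_module_version local, the directory layout, and the example.com host are all hypothetical.

```hcl
# --- versions.hcl (hypothetical file at the repository root) ---
locals {
  # Single place to pin the shared RDS module version.
  rds_module_version = "v1.2.2"
}

# --- services/my-app/rds/terragrunt.hcl (hypothetical child) ---
locals {
  # Load the shared version pins from the nearest parent versions.hcl.
  versions = read_terragrunt_config(find_in_parent_folders("versions.hcl"))
}

terraform {
  # The ref is interpolated, so no service file changes on upgrade.
  source = "git::ssh://git@example.com/your-org/terraform-modules.git//aws/rds?ref=${local.versions.locals.rds_module_version}"
}
```

With this layout, an upgrade is a one-line change in versions.hcl followed by a run across all services. The trade-off is that every service moves together. Teams that need staged rollouts may prefer per-environment version files or explicit refs in each service.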

Conclusion

I set out to build a factory and instead discovered the high cost of building my own tooling. The experience wasn't a failure, but a refinement of the blueprint. The core principles of DRY IaC and central orchestration with Terragrunt are more critical than ever. However, the path to a developer-friendly frontend is not one-size-fits-all.

Before you commit to building your developer portal from the ground up with a powerful framework like Backstage, take a hard look at the total cost of ownership. The most elegant architecture is useless if it never gets built. Sometimes, the most pragmatic solution is to buy the storefront and focus your energy on perfecting the factory floor.

