Backstage Portal: Progressively Complex Application Deployments on AWS

READER BEWARE: THE FOLLOWING WAS WRITTEN ENTIRELY BY AI WITHOUT HUMAN EDITING.

Introduction

Every application starts with an idea. In those early days, speed and simplicity matter most — you need to validate your concept without drowning in infrastructure complexity. But as the application gains traction, as users multiply and requirements deepen, the infrastructure must evolve alongside it. The challenge is bridging those two worlds: the lean startup mindset of “ship it fast” and the enterprise reality of “it must never go down.”

Backstage, the open-source developer portal framework originally built by Spotify and now a CNCF incubating project, offers a compelling answer. By centralizing software templates, service catalogs, and self-service workflows, Backstage enables teams to codify their infrastructure progression — allowing developers to deploy a simple AWS Lambda function on day one, then graduate to AWS Fargate, and eventually to a fully managed AWS EKS cluster, all through the same portal interface.

This post explores how Backstage can be used to manage progressively more complex application deployments on AWS, touching on compute, storage, messaging, caching, and the cost and reliability trade-offs at each stage.

Official Backstage Documentation: https://backstage.io/docs/


The Progressive Deployment Philosophy

Modern applications rarely start complex. A product idea begins as a proof of concept, grows into a minimum viable product, then evolves into a production-grade service with demanding reliability, scalability, and security requirements. Infrastructure should follow the same arc.

The problem is that jumping straight to Kubernetes for a new application is wasteful, slow, and operationally expensive. Conversely, staying on Lambda forever means hitting concurrency limits, cold start latency problems, and architectural constraints as the application grows.

Backstage’s Software Templates (also called Scaffolder templates) allow platform teams to encode this progression as a series of self-service actions that developers can invoke directly from the portal. Rather than filing JIRA tickets or waiting for a DevOps engineer to manually provision infrastructure, developers choose the right tier for their current needs and Backstage handles the provisioning.

The result is a continuous spectrum:

Stage                | Compute     | Complexity  | Best For
1 — Prototype        | AWS Lambda  | Very Low    | Idea validation, event-driven scripts
2 — Early Production | AWS Fargate | Low–Medium  | Containerized services, small teams
3 — Scale            | AWS EKS     | Medium–High | High traffic, multi-service, platform teams

Stage 1: AWS Lambda — Start Small, Ship Fast

Why Lambda First?

AWS Lambda is the natural starting point for a new application. There is no infrastructure to provision, no containers to build, and no scaling configuration to tune. You write code, deploy it, and pay only for what you use.

Lambda shines for:

  • API backends with intermittent or unpredictable traffic
  • Event-driven workflows triggered by S3 uploads, DynamoDB streams, or SQS messages
  • Scheduled jobs using Amazon EventBridge (formerly CloudWatch Events)
  • Prototype validation before committing to a heavier compute layer
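A Stage 1 service can be as small as a single handler. Here is a minimal sketch, assuming an API Gateway proxy integration; the function name and response shape follow the standard proxy contract, while the greeting logic is purely illustrative:

```python
import json

def handler(event, context):
    """Minimal API Gateway proxy handler: greet the caller by name.

    `event` follows the API Gateway proxy format; `context` is unused here.
    """
    body = json.loads(event.get("body") or "{}")
    name = body.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```

This is the entire deployable unit at Stage 1: no Dockerfile, no load balancer, no cluster.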

Backstage Template: Lambda Deployment

A Backstage Software Template for Lambda might look like this:

# templates/lambda-service/template.yaml
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: aws-lambda-service
  title: AWS Lambda Service
  description: Deploy a new serverless service on AWS Lambda
spec:
  owner: platform-team
  type: service
  parameters:
    - title: Service Configuration
      required: [name, runtime, region]
      properties:
        name:
          title: Service Name
          type: string
          description: Unique name for the Lambda function
        runtime:
          title: Runtime
          type: string
          enum: [python3.12, nodejs20.x, java21]
          default: python3.12
        region:
          title: AWS Region
          type: string
          default: us-east-1
        memorySize:
          title: Memory (MB)
          type: integer
          default: 256
          enum: [128, 256, 512, 1024, 2048]
        timeout:
          title: Timeout (seconds)
          type: integer
          default: 30
  steps:
    - id: fetch-template
      name: Fetch Lambda Template
      action: fetch:template
      input:
        url: ./skeleton
        values:
          name: ${{ parameters.name }}
          runtime: ${{ parameters.runtime }}
          region: ${{ parameters.region }}
          memorySize: ${{ parameters.memorySize }}
          timeout: ${{ parameters.timeout }}
    - id: create-repo
      name: Create GitHub Repository
      action: publish:github
      input:
        repoUrl: github.com?owner=my-org&repo=${{ parameters.name }}
    - id: register
      name: Register in Backstage Catalog
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps['create-repo'].output.repoContentsUrl }}
        catalogInfoPath: /catalog-info.yaml

This template creates a GitHub repository with a pre-configured Lambda function, an IaC definition (Terraform or AWS CDK), a GitHub Actions pipeline for deployment, and a catalog entry in Backstage so the service is immediately discoverable.
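The `${{ parameters.name }}` placeholders above are substituted by the scaffolder at render time (Backstage uses Nunjucks templating under the hood). A toy regex-based sketch of the idea, not the real scaffolder implementation:

```python
import re

def render(text: str, parameters: dict) -> str:
    """Substitute ${{ parameters.key }} placeholders with supplied values.

    Unknown keys are left untouched, mimicking a lenient template pass.
    """
    def sub(match):
        return str(parameters.get(match.group(1), match.group(0)))
    return re.sub(r"\$\{\{\s*parameters\.(\w+)\s*\}\}", sub, text)

out = render("function ${{ parameters.name }} in ${{ parameters.region }}",
             {"name": "checkout", "region": "us-east-1"})
```

Every value a developer enters in the form flows through this substitution into the skeleton files before the repository is created.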

Supporting Services at Lambda Stage

At this stage, keep dependencies minimal:

Storage: Use Amazon S3 for file storage and Amazon DynamoDB for structured data. Both are serverless and require zero management. DynamoDB on-demand pricing means you only pay for actual reads and writes.

Messaging: Use Amazon SQS for decoupled messaging between Lambda functions. A dead-letter queue (DLQ) is a must-have for reliability.
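The DLQ wiring is a small JSON attribute on the source queue. A sketch of building the SQS RedrivePolicy (the queue ARN below is a placeholder; with boto3 the result would be passed to `sqs.set_queue_attributes`):

```python
import json

def redrive_policy(dlq_arn: str, max_receives: int = 5) -> dict:
    """Build the SQS RedrivePolicy attribute: after `max_receives` failed
    receives, a message is moved to the dead-letter queue."""
    return {
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": dlq_arn,
            "maxReceiveCount": max_receives,
        })
    }

# Placeholder ARN for illustration only:
attrs = redrive_policy("arn:aws:sqs:us-east-1:123456789012:orders-dlq")
```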

Email: Use Amazon SES in sandbox mode initially. SES integrates directly with Lambda via the AWS SDK and requires no additional infrastructure.

Cost Profile at Stage 1

AWS Lambda pricing is among the most cost-effective options for low-to-moderate workloads:

  • Compute: $0.0000166667 per GB-second. A function with 256 MB RAM running for 100ms costs roughly $0.00000042 per invocation.
  • Requests: $0.20 per 1 million requests (first 1 million free each month).
  • DynamoDB on-demand: $1.25 per million write request units, $0.25 per million read request units.

For a prototype handling tens of thousands of requests per month, the total AWS bill is often under $5/month.
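The arithmetic behind that estimate is easy to check. A sketch using the list prices above (free tier deliberately ignored, so real bills come in even lower):

```python
GB_SECOND_PRICE = 0.0000166667    # Lambda compute, per GB-second
REQUEST_PRICE = 0.20 / 1_000_000  # per request

def lambda_monthly_cost(invocations: int, memory_mb: int, avg_ms: int) -> float:
    """Estimate monthly Lambda cost, ignoring the free tier."""
    gb_seconds = invocations * (memory_mb / 1024) * (avg_ms / 1000)
    return gb_seconds * GB_SECOND_PRICE + invocations * REQUEST_PRICE

# 50,000 requests/month at 256 MB and 100 ms average duration:
cost = lambda_monthly_cost(50_000, 256, 100)
```

At this volume the result is a few cents per month, which is why prototypes routinely stay under $5 even with DynamoDB and S3 added.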

Limitations of Lambda

Lambda works until it doesn’t:

  • Cold starts introduce latency spikes (100ms–3s for JVM runtimes) that hurt user experience under light load.
  • 15-minute execution limit rules out long-running processes.
  • Concurrency limits (default 1,000 per region) can cause throttling under sudden traffic spikes.
  • Stateless architecture requires externalizing all state, which adds complexity.
  • Container image size limits (10 GB) and ephemeral storage (512 MB default, up to 10 GB) constrain large ML or data-processing workloads.

When you start hitting these limits, it’s time to consider Stage 2.


Stage 2: AWS Fargate — Containers Without the Cluster

Why Fargate?

AWS Fargate provides serverless compute for containers. You package your application into a Docker container, define the CPU and memory you need, and Fargate runs it — without you managing EC2 instances, AMI updates, or node groups.

Fargate fills the gap between Lambda’s strict execution model and the full control (and complexity) of Kubernetes. It works well for:

  • Long-running HTTP services that need consistent latency without cold starts
  • Background workers that process queues or run scheduled batch jobs
  • Stateful-ish services that benefit from container persistence (sidecars, shared memory within a task)
  • Teams without Kubernetes expertise who need more than Lambda offers

Backstage Template: Fargate Service

# templates/fargate-service/template.yaml
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: aws-fargate-service
  title: AWS Fargate Service
  description: Deploy a containerized service on AWS Fargate (ECS)
spec:
  owner: platform-team
  type: service
  parameters:
    - title: Service Configuration
      required: [name, region, cpu, memory]
      properties:
        name:
          title: Service Name
          type: string
        region:
          title: AWS Region
          type: string
          default: us-east-1
        cpu:
          title: vCPU Units
          type: integer
          enum: [256, 512, 1024, 2048, 4096]
          default: 512
        memory:
          title: Memory (MB)
          type: integer
          enum: [512, 1024, 2048, 4096, 8192]
          default: 1024
        desiredCount:
          title: Desired Task Count
          type: integer
          default: 2
        enableAutoScaling:
          title: Enable Auto Scaling
          type: boolean
          default: true
        minCapacity:
          title: Minimum Task Count
          type: integer
          default: 1
        maxCapacity:
          title: Maximum Task Count
          type: integer
          default: 10
    - title: Networking
      properties:
        enableAlb:
          title: Enable Application Load Balancer
          type: boolean
          default: true
        enableHttps:
          title: Enable HTTPS (requires ACM certificate)
          type: boolean
          default: true
  steps:
    - id: fetch-template
      name: Fetch Fargate Template
      action: fetch:template
      input:
        url: ./skeleton
        values:
          name: ${{ parameters.name }}
          cpu: ${{ parameters.cpu }}
          memory: ${{ parameters.memory }}
    - id: create-repo
      name: Create GitHub Repository
      action: publish:github
      input:
        repoUrl: github.com?owner=my-org&repo=${{ parameters.name }}
    - id: register
      name: Register in Backstage Catalog
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps['create-repo'].output.repoContentsUrl }}
        catalogInfoPath: /catalog-info.yaml

The generated repository includes a Dockerfile, ECS task definition, Application Load Balancer configuration, and a GitHub Actions pipeline for building and deploying container images to Amazon ECR.

Supporting Services at Fargate Stage

At this stage, the application has real users and real traffic. Infrastructure choices become more consequential.

Storage:

  • Amazon RDS (PostgreSQL/MySQL) replaces DynamoDB for relational workloads. Use Multi-AZ deployments for production. Aurora Serverless v2 offers a good middle ground — it scales automatically and you don’t pay for idle capacity.
  • Amazon S3 continues to serve as object storage.
  • Amazon EFS can provide shared file storage across Fargate tasks when needed.

Caching:

  • Amazon ElastiCache for Redis dramatically reduces database load and improves response times for read-heavy workloads. A single-node Redis instance can serve as a session store and query cache.
  • For even simpler caching, DynamoDB Accelerator (DAX) adds a microsecond-latency cache in front of DynamoDB tables.
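Redis usage at this stage typically follows the cache-aside pattern: read from the cache, fall back to the database on a miss, then populate the cache with a TTL. A sketch with a plain dict standing in for the Redis client (a real implementation would issue the equivalent `GET`/`SETEX` calls via `redis.Redis`):

```python
import time

class CacheAside:
    """Cache-aside: check cache, fall back to the loader on a miss,
    then populate the cache with a TTL. `store` stands in for Redis."""

    def __init__(self, loader, ttl_seconds=60):
        self.loader = loader
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, expires_at)

    def get(self, key):
        hit = self.store.get(key)
        if hit and hit[1] > time.time():
            return hit[0]                       # cache hit
        value = self.loader(key)                # cache miss: hit the database
        self.store[key] = (value, time.time() + self.ttl)
        return value

# Illustrative loader that records how often the "database" is queried:
calls = []
cache = CacheAside(loader=lambda k: calls.append(k) or f"row-{k}")
```

The second read of the same key never reaches the loader, which is exactly the database-load reduction described above.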

Messaging:

  • Amazon SQS scales naturally to Fargate workers consuming messages.
  • Amazon SNS provides pub/sub fanout — one SNS topic can fan out to multiple SQS queues, Lambda functions, or email endpoints via SES.
  • Amazon EventBridge replaces polling patterns with event-driven routing between services.
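On the consuming side, a Fargate worker typically polls SQS in a loop. A sketch with the receive/delete calls injected as plain functions so the loop logic is testable without AWS (with boto3 these would wrap `sqs.receive_message` and `sqs.delete_message`):

```python
def drain_queue(receive, delete, handle, max_batches=10):
    """Worker loop: receive a batch, handle each message, and delete only
    on success so failures are redelivered (and eventually land in the DLQ)."""
    processed = 0
    for _ in range(max_batches):
        messages = receive()
        if not messages:
            break                  # queue drained
        for msg in messages:
            try:
                handle(msg["Body"])
                delete(msg["ReceiptHandle"])
                processed += 1
            except Exception:
                pass               # leave the message for redelivery
    return processed
```

Deleting only after a successful handle is the key reliability detail: a crashed worker never silently drops a message.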

Email:

  • Amazon SES moves out of sandbox mode once the application has a real domain and verified sending addresses. SES supports transactional email, bulk campaigns, and inbound email processing.

Cost Profile at Stage 2

Fargate pricing is based on vCPU and memory consumed per second:

  • vCPU: $0.04048 per vCPU-hour
  • Memory: $0.004445 per GB-hour
  • A task with 0.5 vCPU and 1 GB memory running continuously costs ~$18/month.
  • Two tasks behind an ALB (for high availability) cost ~$36/month for compute plus ~$16/month for the ALB itself.
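A quick estimator for always-on Fargate cost, assuming 730 hours/month, the us-east-1 list prices above, and the ALB base charge only (LCU usage excluded):

```python
VCPU_HOUR = 0.04048        # Fargate vCPU price, per hour
GB_HOUR = 0.004445         # Fargate memory price, per GB-hour
HOURS_PER_MONTH = 730

def fargate_monthly_cost(vcpu: float, memory_gb: float, tasks: int = 1,
                         alb: bool = False) -> float:
    """Estimate the always-on monthly cost of a Fargate service."""
    per_task = (vcpu * VCPU_HOUR + memory_gb * GB_HOUR) * HOURS_PER_MONTH
    return per_task * tasks + (16.0 if alb else 0.0)  # ~$16/month ALB base

# Two 0.5 vCPU / 1 GB tasks behind an ALB:
cost = fargate_monthly_cost(0.5, 1.0, tasks=2, alb=True)
```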

Add Aurora Serverless v2 ($0.12/ACU-hour minimum) and ElastiCache Redis (t4g.small at ~$25/month) and a typical Stage 2 stack runs $100–$300/month depending on traffic.

This is significantly higher than Lambda but provides consistent performance, no cold starts, and no execution time limits.

Limitations of Fargate

Fargate removes the operational burden of EC2 but introduces its own constraints:

  • Multi-service orchestration becomes complex — services run in their own ECS clusters or share one, and service discovery requires AWS Cloud Map or a service mesh.
  • No shared memory between tasks. Each task is isolated.
  • Limited networking primitives compared to Kubernetes — no native pod networking, no admission webhooks, no custom resource definitions.
  • Resource density is lower than Kubernetes — Fargate tasks cannot bin-pack as efficiently as Kubernetes pods on shared nodes.

When the application has multiple teams contributing services, when traffic patterns require sophisticated autoscaling, or when the engineering org needs a common platform for dozens of services, it’s time for Stage 3.


Stage 3: AWS EKS — Enterprise-Grade Kubernetes

Why EKS?

Amazon Elastic Kubernetes Service (EKS) is the AWS-managed Kubernetes control plane. It’s the most operationally complex option but also the most powerful — providing fine-grained control over networking, scheduling, resource isolation, security policies, and observability.

EKS is the right choice when:

  • The engineering organization has multiple teams deploying dozens of services
  • You need advanced autoscaling (Karpenter for nodes, KEDA for event-driven pod scaling)
  • Service mesh capabilities (Istio, Linkerd, or AWS App Mesh) are required for traffic management and mTLS
  • Custom operators and Kubernetes-native tooling are part of the platform strategy
  • Compliance requirements mandate strict network policies and pod security standards
  • The application runs stateful workloads (databases, message brokers, ML model servers) requiring persistent volumes

Backstage Template: EKS Application

# templates/eks-service/template.yaml
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: eks-application
  title: EKS Application
  description: Deploy a production-grade application on AWS EKS
spec:
  owner: platform-team
  type: service
  parameters:
    - title: Application Configuration
      required: [name, namespace, region]
      properties:
        name:
          title: Application Name
          type: string
        namespace:
          title: Kubernetes Namespace
          type: string
          default: default
        region:
          title: AWS Region
          type: string
          default: us-east-1
        replicas:
          title: Initial Replica Count
          type: integer
          default: 3
    - title: Resource Limits
      properties:
        cpuRequest:
          title: CPU Request
          type: string
          default: "250m"
        cpuLimit:
          title: CPU Limit
          type: string
          default: "1000m"
        memoryRequest:
          title: Memory Request
          type: string
          default: "256Mi"
        memoryLimit:
          title: Memory Limit
          type: string
          default: "1Gi"
    - title: Autoscaling
      properties:
        enableHpa:
          title: Enable Horizontal Pod Autoscaler
          type: boolean
          default: true
        enableKeda:
          title: Enable KEDA (event-driven scaling)
          type: boolean
          default: false
        enableKarpenter:
          title: Enable Karpenter Node Provisioning
          type: boolean
          default: true
        minReplicas:
          title: Minimum Replicas
          type: integer
          default: 2
        maxReplicas:
          title: Maximum Replicas
          type: integer
          default: 20
    - title: Networking
      properties:
        ingressClass:
          title: Ingress Class
          type: string
          enum: [aws-alb, nginx, traefik]
          default: aws-alb
        enableServiceMesh:
          title: Enable Istio Service Mesh
          type: boolean
          default: false
  steps:
    - id: fetch-template
      name: Fetch EKS Application Template
      action: fetch:template
      input:
        url: ./skeleton
        values:
          name: ${{ parameters.name }}
          namespace: ${{ parameters.namespace }}
          replicas: ${{ parameters.replicas }}
    - id: create-repo
      name: Create GitHub Repository
      action: publish:github
      input:
        repoUrl: github.com?owner=my-org&repo=${{ parameters.name }}
    - id: create-argocd-app
      name: Register ArgoCD Application
      action: argocd:create-resources
      input:
        appName: ${{ parameters.name }}
        argoInstance: prod-argocd
        namespace: ${{ parameters.namespace }}
        repoUrl: ${{ steps['create-repo'].output.remoteUrl }}
    - id: register
      name: Register in Backstage Catalog
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps['create-repo'].output.repoContentsUrl }}
        catalogInfoPath: /catalog-info.yaml

The generated Kubernetes manifests include a Deployment, Service, HorizontalPodAutoscaler, PodDisruptionBudget, NetworkPolicy, resource quotas, and an ArgoCD Application for GitOps-driven continuous delivery.

Supporting Services at EKS Stage

At this stage, the infrastructure mirrors what large enterprises run in production.

Storage:

  • Amazon Aurora PostgreSQL (Multi-AZ, Global Database) for transactional data with cross-region read replicas.
  • Amazon S3 with intelligent tiering for cost-optimized object storage.
  • Amazon EBS persistent volumes for stateful Kubernetes workloads.
  • Amazon EFS for shared ReadWriteMany storage across pods.

Caching:

  • Amazon ElastiCache for Redis (Cluster Mode) provides horizontal sharding across multiple Redis nodes, handling millions of operations per second. Combined with read replicas, it provides both high throughput and high availability.
  • Amazon ElastiCache for Memcached for simple, high-speed caching when Redis data structures are not needed.

Messaging & Eventing:

  • Amazon SQS FIFO queues for ordered, exactly-once message processing.
  • Amazon SNS with SQS fanout for event-driven microservice communication.
  • Amazon MSK (Managed Streaming for Apache Kafka) for high-throughput event streaming when SQS is insufficient.
  • Amazon EventBridge as a fully managed event bus for routing events across services and AWS accounts.

Email & Notifications:

  • Amazon SES with dedicated IP addresses for high-volume transactional and marketing email.
  • Amazon SNS for SMS notifications and mobile push notifications.
  • Amazon Pinpoint for sophisticated multi-channel customer engagement campaigns.

Security & Compliance:

  • AWS IAM Roles for Service Accounts (IRSA) to grant Kubernetes pods fine-grained AWS permissions without storing credentials.
  • AWS Secrets Manager or External Secrets Operator for injecting secrets into pods at runtime.
  • Kyverno or OPA Gatekeeper for Kubernetes admission control and policy enforcement.

Cost Profile at Stage 3

EKS has a fixed control plane cost plus the underlying EC2 or Fargate compute:

  • EKS control plane: $0.10/hour ($72/month) per cluster.
  • Managed node groups (m5.xlarge, on-demand): ~$140/month per node. A production cluster with 5 nodes costs ~$700/month.
  • Karpenter can reduce this by right-sizing nodes and using Spot instances, potentially cutting compute costs by 60–70%.
  • Aurora PostgreSQL Multi-AZ (db.r6g.large): ~$370/month.
  • ElastiCache Redis Cluster (cache.r6g.large × 3): ~$555/month.
  • MSK (kafka.m5.large × 3): ~$430/month.

A full production EKS stack runs $2,000–$10,000+/month depending on traffic, replication factor, and data volumes. This is justified when the application serves significant business value and reliability requirements cannot be met by simpler alternatives.


Backstage as the Control Plane

The power of Backstage isn’t just the scaffolding — it’s the unified control plane that ties all three stages together. Once services are registered in the Backstage Software Catalog, teams get:

Software Catalog: Unified Visibility

Every Lambda function, Fargate service, and EKS application appears in a single, searchable catalog. Metadata like ownership, deployment stage, linked AWS resources, API definitions, and runbooks are co-located in catalog-info.yaml:

# catalog-info.yaml
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payment-service
  title: Payment Service
  description: Processes customer payments via Stripe
  annotations:
    backstage.io/techdocs-ref: dir:.
    github.com/project-slug: my-org/payment-service
    aws.amazon.com/lambda-function-name: payment-service-prod
    aws.amazon.com/region: us-east-1
  tags:
    - payments
    - critical
    - stage-2
  links:
    - url: https://console.aws.amazon.com/lambda
      title: AWS Lambda Console
      icon: dashboard
    - url: https://grafana.example.com/d/payment-service
      title: Grafana Dashboard
      icon: dashboard
spec:
  type: service
  lifecycle: production
  owner: payments-team
  system: e-commerce-platform
  dependsOn:
    - resource:default/payments-db
    - resource:default/payments-sqs-queue
    - resource:default/ses-transactional

TechDocs: Living Documentation

Backstage’s TechDocs feature renders Markdown documentation directly in the portal, sourced from the same repository as the code. Architecture diagrams, runbooks, API references, and onboarding guides stay in sync with the service.

Service Upgrade Workflows

One of Backstage’s most powerful patterns is using Software Templates as upgrade workflows. When a team is ready to move from Lambda to Fargate, they invoke an upgrade template that:

  1. Reads the existing catalog-info.yaml for the service
  2. Provisions the Fargate task definition and ECS service
  3. Configures an ALB with gradual traffic shifting
  4. Updates the catalog entry to reflect the new deployment stage
  5. Archives the Lambda function after traffic is fully migrated

This makes infrastructure evolution a deliberate, tracked, self-service action rather than an ad-hoc operational task.
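Step 3, gradual traffic shifting, usually reduces to a sequence of ALB weighted-target-group updates. A sketch of the weight schedule; the percentages and the Lambda-to-Fargate direction are illustrative:

```python
def shift_schedule(steps=(10, 25, 50, 100)):
    """Yield (old_weight, new_weight) pairs for an ALB weighted-target-group
    migration, e.g. from a Lambda target to a Fargate target."""
    for pct in steps:
        yield (100 - pct, pct)

weights = list(shift_schedule())
```

Each pair would be applied to the listener rule, with a bake period and rollback check between steps, before the old target is finally retired.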


Cost, Reliability, and Scalability Trade-offs

Cost

Stage   | Monthly Cost (Typical) | Cost Driver
Lambda  | $5–$50                 | Pay per invocation, near-zero at low volume
Fargate | $100–$500              | Always-on tasks, ALB, managed DB
EKS     | $2,000–$10,000+        | Control plane, nodes, stateful services

The key insight: don’t over-provision. Running EKS on day one for a prototype is a $2,000/month mistake. Running Lambda for a service handling 10 million daily active users is an architectural mistake. Backstage helps teams stay at the right tier by making transitions explicit and low-friction.

Reliability

Stage   | Availability Target    | HA Mechanism
Lambda  | 99.95% (AWS SLA)       | Automatic multi-AZ, managed by AWS
Fargate | 99.99% (with Multi-AZ) | Multiple tasks across AZs behind ALB
EKS     | 99.99%+                | PodDisruptionBudgets, Karpenter, multi-AZ node groups

Each tier offers a higher reliability ceiling but requires more configuration to reach it. Backstage templates encode the right reliability defaults for each tier — developers don’t need to know the details of PodDisruptionBudgets or ALB health check tuning.

Scalability

Stage           | Scale Trigger                                   | Scale Speed    | Limits
Lambda          | Automatic (invocation-based)                    | Milliseconds   | 1,000 concurrent (soft limit)
Fargate         | CPU/memory metrics via ECS Service Auto Scaling | 1–3 minutes    | Task quotas per region
EKS + Karpenter | Pod pending state → node provisioning           | 30–90 seconds  | EC2 capacity limits
EKS + KEDA      | External events (SQS depth, Kafka lag)          | Seconds (pods) | Cluster node capacity
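KEDA's queue-depth scaling reduces to simple arithmetic: desired replicas equal the queue depth divided by the per-replica target, clamped to the configured bounds. A sketch of that calculation; the specific numbers are illustrative, matching the minReplicas/maxReplicas defaults from the EKS template above:

```python
import math

def desired_replicas(queue_depth: int, target_per_replica: int,
                     min_replicas: int = 2, max_replicas: int = 20) -> int:
    """KEDA-style scaling: grow workers with queue depth, within bounds."""
    wanted = math.ceil(queue_depth / target_per_replica)
    return max(min_replicas, min(max_replicas, wanted))

# 450 queued messages, each replica targets 50 messages in flight:
replicas = desired_replicas(450, 50)
```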

Developer Experience: The Backstage Advantage

Without Backstage, moving from Lambda to Fargate to EKS requires navigating the AWS console, writing Terraform modules, configuring GitHub Actions pipelines, and updating scattered documentation. The cognitive load falls entirely on the developer or DevOps engineer.

With Backstage:

  1. A developer opens the portal and selects “New Service” or “Upgrade Service Tier”
  2. They fill in a form with service-specific parameters (name, runtime, scaling preferences)
  3. Backstage provisions the infrastructure, creates the repository, configures the CI/CD pipeline, and registers the service in the catalog — all automatically
  4. The developer starts writing business logic within minutes

This self-service model has measurable impact:

  • Reduced time-to-production from days or weeks to hours
  • Consistent infrastructure — no snowflake services built from tribal knowledge
  • Lower cognitive load — developers focus on product, not platform
  • Audit trail — every template invocation is logged, providing a history of infrastructure changes

Conclusion

Application infrastructure is not static. The right deployment strategy depends on the application’s lifecycle stage, traffic patterns, team size, and reliability requirements. Starting on AWS Lambda, graduating to Fargate, and eventually operating on EKS is a natural progression that mirrors the growth of the application itself.

Backstage makes this progression explicit, self-service, and safe. By encoding deployment tiers as Software Templates, registering services in the catalog, and linking TechDocs for every component, platform teams give developers a unified portal that grows with their applications — from a weekend prototype costing $5/month to a multi-region production platform costing thousands.

The investment in a Backstage-based internal developer platform pays dividends not just in infrastructure consistency, but in developer velocity, operational visibility, and the organizational confidence to evolve infrastructure without fear.

Start simple. Grow deliberately. Let Backstage guide the journey.

