DevOps Tutorial

Platform Engineering with Backstage: Building Internal Developer Portals in 2026

March 12, 2026 · 24 min read

Infrastructure teams are transforming into platform teams. The Internal Developer Portal (IDP) is your organization's operating system—a unified interface where developers discover services, provision resources, and follow paved roads to production. This guide covers everything from Platform as a Product philosophy to production Backstage implementations.

What Is Platform Engineering?

Platform engineering is the discipline of building and maintaining internal platforms that enable product teams to deliver value faster. Unlike traditional DevOps, which often embeds operations expertise within each team, platform engineering centralizes platform capabilities while distributing self-service access.

The core premise is simple: developers shouldn't need to be infrastructure experts to deploy applications. The platform team builds paved roads—golden paths—that encode organizational best practices, security policies, and operational standards into reusable templates and APIs.

In 2026, this has evolved beyond simple CI/CD pipelines. Modern platform engineering encompasses:

Self-service infrastructure provisioning via APIs and UIs
Developer experience optimization—reducing cognitive load
Standardized golden paths to production
Unified observability across services and teams
Cost visibility and governance for multi-tenant environments
AI-assisted operations and intelligent automation

📊 Industry Adoption in 2026

According to the 2026 State of Platform Engineering report, 68% of organizations with 500+ engineers have implemented an Internal Developer Portal, up from 42% in 2023. The average reduction in developer onboarding time is 73%, and incident MTTR has improved by 41% in platform-mature organizations.

The Platform as a Product Mindset

Successful platform teams treat their platform as a product, not a project. This means:

Product management discipline: Roadmaps, user research, and feedback loops
Developer personas: Understanding different user needs (frontend, ML, mobile)
Service Level Objectives (SLOs): The platform has defined reliability targets
Documentation as a feature: Clear, maintained, discoverable
Gradual adoption: Voluntary migration rather than forced mandates

Platform teams that operate this way see adoption rates 3x higher than those imposing top-down mandates. The key is making the golden path significantly easier than the alternative—not by policy, but by design.

IDP vs Traditional DevOps: The Shift

Understanding the difference between traditional DevOps and platform engineering helps clarify why IDPs matter. Let's compare the two approaches:

Aspect	Traditional DevOps	Platform Engineering
Team Structure	Embedded SREs per product team	Centralized platform team, self-service for developers
Infrastructure Access	Direct cloud console access	Abstracted via APIs and portals
Tooling Decisions	Team-by-team choices	Curated, enterprise-wide standards
Security Enforcement	Manual reviews, late in cycle	Shift-left via templates and policies
Developer Onboarding	Weeks of tribal knowledge transfer	Hours via documented golden paths
Visibility	Fragmented across tools	Unified in developer portal
Cost Attribution	Often unknown until bill arrives	Real-time visibility per team/service

The shift isn't about eliminating DevOps—it's about industrializing it. DevOps principles (shared ownership, automation, feedback loops) remain essential. Platform engineering provides the infrastructure to scale those principles across dozens or hundreds of teams without linearly scaling operations headcount.

Platform Tools Landscape 2026

The IDP ecosystem has matured significantly. Here's the current landscape:

Internal Developer Portals

Tool	Type	Best For	Hosting
Backstage	Open Source	Large orgs, customization needs	Self-hosted
Port	SaaS/Managed	Fastest time-to-value, smaller teams	Cloud
Kratix	Open Source	Kubernetes-native, GitOps workflows	Self-hosted
OpsLevel	SaaS	Service catalog, maturity scoring	Cloud
Compass (Atlassian)	SaaS	Teams already using Jira/Confluence	Cloud
Cortex	SaaS	Scorecards, service maturity	Cloud

Supporting Infrastructure

Modern platforms integrate with a broader ecosystem:

Terraform/OpenTofu: Infrastructure as Code provisioning
Crossplane: Kubernetes-native infrastructure control plane
GitOps tools: ArgoCD, Flux for continuous delivery
Policy engines: OPA/Gatekeeper, Kyverno for guardrails
Service mesh: Istio, Cilium, Linkerd for connectivity
Observability: OpenTelemetry, Prometheus, Grafana, Tempo

Backstage Deep Dive: Build Your Portal

Backstage, originally built by Spotify and donated to the CNCF, is the most widely adopted open-source IDP framework. Let's build a production-ready instance.

Architecture Overview

Backstage consists of:

Core: Plugin system, routing, authentication
Software Catalog: Central registry of services, components, resources
Scaffolder: Templates for creating new services and resources
TechDocs: Documentation as code, integrated with the catalog
Plugins: Extensible ecosystem (100+ community plugins)

Installation and Setup

Let's deploy Backstage with a PostgreSQL backend:

# Install Node.js 20+ and Yarn
node --version  # v20.11.0
npm install -g yarn

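# Create Backstage app
npx @backstage/create-app@latest
# ? Enter a name for the app: platform-portal
# Older create-app versions also prompt for a database; newer ones only ask
# for the name, and PostgreSQL is configured in app-config.yaml instead.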

cd platform-portal

# Install dependencies
yarn install

Configure PostgreSQL Backend

# app-config.yaml
backend:
  database:
    client: pg
    connection:
      host: ${POSTGRES_HOST}
      port: ${POSTGRES_PORT}
      user: ${POSTGRES_USER}
      password: ${POSTGRES_PASSWORD}
      database: ${POSTGRES_DB}
  # Enable caching for performance
  cache:
    store: redis
    connection: ${REDIS_URL}
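
Because every connection setting above comes from an environment variable, a missing variable tends to surface only when the backend starts. The following is a minimal preflight sketch in Python, not part of Backstage: the function name find_missing_env and the inline sample config are illustrative assumptions. It lists the ${VAR} placeholders that a config references but the environment does not define.

```python
import re

# Matches plain ${VAR} placeholders as used in app-config.yaml.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def find_missing_env(config_text: str, env: dict) -> list:
    """Return the sorted names of placeholders with no non-empty value in env."""
    referenced = set(PLACEHOLDER.findall(config_text))
    return sorted(name for name in referenced if not env.get(name))


# Trimmed copy of the config above, used only for illustration.
SAMPLE_CONFIG = """
backend:
  database:
    client: pg
    connection:
      host: ${POSTGRES_HOST}
      port: ${POSTGRES_PORT}
  cache:
    store: redis
    connection: ${REDIS_URL}
"""

if __name__ == "__main__":
    # Hypothetical environment with only one variable set.
    env = {"POSTGRES_HOST": "db.internal"}
    print(find_missing_env(SAMPLE_CONFIG, env))
```

In a real pipeline you would pass os.environ and the contents of your actual config file, and fail the deploy if the returned list is not empty.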

Docker Deployment

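# Dockerfile for production (multi-stage sketch; assumes Yarn classic and the
# default create-app layout, so adjust flags and paths for your repo)
FROM node:20-bookworm-slim AS build
WORKDIR /app

# Toolchain for native modules compiled during install
RUN apt-get update && apt-get install -y --no-install-recommends \
    python3 g++ build-essential && rm -rf /var/lib/apt/lists/*

# Copy the whole monorepo: the build needs every workspace and devDependencies
COPY . .
RUN yarn install --frozen-lockfile

# Type-check, then bundle the backend into packages/backend/dist/*.tar.gz
RUN yarn tsc && yarn build:backend

# Production stage
FROM node:20-bookworm-slim
RUN apt-get update && apt-get install -y --no-install-recommends \
    python3 g++ build-essential && rm -rf /var/lib/apt/lists/*
WORKDIR /app
ENV NODE_ENV=production
ENV PORT=7007

# Unpack the bundled backend and install runtime dependencies only
COPY --from=build /app/package.json /app/yarn.lock ./
COPY --from=build /app/packages/backend/dist/skeleton.tar.gz /app/packages/backend/dist/bundle.tar.gz ./
RUN tar xzf skeleton.tar.gz && tar xzf bundle.tar.gz && rm skeleton.tar.gz bundle.tar.gz
RUN yarn install --frozen-lockfile --production
COPY app-config.yaml ./

USER node
EXPOSE 7007
CMD ["node", "packages/backend", "--config", "app-config.yaml"]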

Software Catalog: Modeling Your Estate

The Software Catalog is Backstage's heart. Each entity is described declaratively in a YAML file, conventionally named catalog-info.yaml and stored alongside the code:

# catalog-info.yaml for a microservice
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payment-service
  description: Payment processing microservice
  annotations:
    github.com/project-slug: myorg/payment-service
    backstage.io/techdocs-ref: dir:.
    prometheus.io/rule: payment_service_slo
    sentry.io/project-slug: payment-service
  tags:
    - payments
    - critical
    - java
spec:
  type: service
  lifecycle: production
  owner: payments-team
  system: commerce
  providesApis:
    - payment-api
  dependsOn:
    - resource:payment-db
    - component:fraud-detection

---
apiVersion: backstage.io/v1alpha1
kind: Resource
metadata:
  name: payment-db
  description: PostgreSQL database for payments
spec:
  type: database
  owner: dba-team
  system: commerce

Entities in Backstage follow a well-defined model:

Component: Software pieces (services, libraries, apps)
API: Interface definitions (OpenAPI, AsyncAPI, GraphQL)
Resource: Infrastructure (databases, storage, VMs)
System: Grouping of related components
Domain: Business domains containing systems
User & Group: Organizational structure from identity provider
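
Relations such as dependsOn point at other entities using string references of the form [kind:][namespace/]name, for example resource:payment-db. As a rough illustration of how such a reference breaks down, here is a small Python sketch. The EntityRef type and parse_entity_ref function are hypothetical helpers written for this article, not Backstage APIs, and the defaults shown (component kind, default namespace) are assumptions you would adapt to the relation being parsed.

```python
from typing import NamedTuple


class EntityRef(NamedTuple):
    kind: str
    namespace: str
    name: str


def parse_entity_ref(ref: str, default_kind: str = "component",
                     default_namespace: str = "default") -> EntityRef:
    """Split '[kind:][namespace/]name' into kind, namespace, and name."""
    kind, sep, rest = ref.partition(":")
    if not sep:
        # No "kind:" prefix, so the whole string is "[namespace/]name".
        kind, rest = default_kind, ref
    namespace, sep, name = rest.partition("/")
    if not sep:
        # No "namespace/" prefix, so the remainder is just the name.
        namespace, name = default_namespace, rest
    if not kind or not namespace or not name:
        raise ValueError(f"invalid entity reference: {ref!r}")
    # Kind and namespace are lowercased here so refs compare consistently.
    return EntityRef(kind.lower(), namespace.lower(), name)


if __name__ == "__main__":
    for ref in ("resource:payment-db", "component:fraud-detection", "payments-team"):
        print(ref, "->", parse_entity_ref(ref))
```

A helper like this is handy when building a dependency graph from exported catalog data, since every edge can be normalized to the same three-part key.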

Integrating with Identity Providers

Backstage integrates with your existing identity infrastructure:

# app-config.yaml - Microsoft Entra ID (Azure AD)
auth:
  environment: production
  providers:
    microsoft:
      production:
        clientId: ${AUTH_MICROSOFT_CLIENT_ID}
        clientSecret: ${AUTH_MICROSOFT_CLIENT_SECRET}
        tenantId: ${AUTH_MICROSOFT_TENANT_ID}
        
catalog:
  providers:
    microsoftGraphOrg:
      default:
        tenantId: ${AUTH_MICROSOFT_TENANT_ID}
        clientId: ${AUTH_MICROSOFT_CLIENT_ID}
        clientSecret: ${AUTH_MICROSOFT_CLIENT_SECRET}
        # Sync users and groups automatically
        user:
          filter: accountEnabled eq true
        group:
          filter: securityEnabled eq true
        schedule:
          frequency: PT1H
          timeout: PT50M
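
The frequency and timeout values are ISO 8601 durations: PT1H is one hour and PT50M is fifty minutes. Keeping the timeout below the frequency, as the config above does, means a slow sync is cut off before the next one is due. The sketch below shows one way to check that relationship in Python. It handles only the hours, minutes, and seconds form used here, and the function names are illustrative assumptions.

```python
import re
from datetime import timedelta

# Supports the time-only subset of ISO 8601 durations, e.g. PT1H, PT50M, PT1H30M.
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def parse_duration(value: str) -> timedelta:
    """Convert a PT...H...M...S duration string into a timedelta."""
    match = _DURATION.match(value)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def schedule_is_sane(frequency: str, timeout: str) -> bool:
    """True when a run is guaranteed to time out before the next one starts."""
    return parse_duration(timeout) < parse_duration(frequency)


if __name__ == "__main__":
    print(parse_duration("PT1H"), parse_duration("PT50M"))
    print(schedule_is_sane("PT1H", "PT50M"))
```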

The Scaffolder: Self-Service Templates

The Scaffolder enables developers to create new services via UI forms. Here's a production-ready template for a Java Spring Boot microservice:

# template.yaml
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: java-microservice
  title: Java Microservice
  description: Create a new Java Spring Boot microservice with CI/CD pipeline
  tags:
    - java
    - spring
    - recommended
spec:
  owner: platform-team
  type: service
  
  parameters:
    - title: Service Details
      required:
        - name
        - owner
        - system
      properties:
        name:
          title: Service Name
          type: string
          pattern: '^[a-z][a-z0-9-]+$'
          description: Lowercase, hyphens allowed
        owner:
          title: Owner
          type: string
          ui:field: OwnerPicker
          ui:options:
            catalogFilter:
              kind: Group
        system:
          title: System
          type: string
          ui:field: EntityPicker
          ui:options:
            catalogFilter:
              kind: System
        description:
          title: Description
          type: string
          ui:widget: textarea
        
    - title: Infrastructure
      properties:
        database:
          title: Database
          type: string
          enum:
            - postgresql
            - mysql
            - none
          default: postgresql
        enableObservability:
          title: Enable Observability
          type: boolean
          default: true
          description: Includes OpenTelemetry instrumentation
  
  steps:
    - id: fetch-base
      name: Fetch Base Template
      action: fetch:template
      input:
        url: ./skeleton
        values:
          name: ${{ parameters.name }}
          owner: ${{ parameters.owner }}
          system: ${{ parameters.system }}
          description: ${{ parameters.description }}
          database: ${{ parameters.database }}
          
    - id: publish
      name: Publish to GitHub
      action: publish:github
      input:
        description: "${{ parameters.name }} microservice"
        # repoUrl takes the form host?owner=<org>&repo=<name>; "myorg" is a placeholder
        repoUrl: "github.com?owner=myorg&repo=${{ parameters.name }}"
        defaultBranch: main
        
    - id: register
      name: Register in Catalog
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps.publish.output.repoContentsUrl }}
        catalogInfoPath: /catalog-info.yaml
        
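    - id: create-resources
      name: Create Infrastructure
      # terraform:run is not one of Backstage's built-in actions; this step
      # assumes a custom scaffolder action registered in your backend
      action: terraform:run
      input:
        workspace: ${{ parameters.name }}
        variables:
          service_name: ${{ parameters.name }}
          database_type: ${{ parameters.database }}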
        
  output:
    links:
      - title: Repository
        url: ${{ steps.publish.output.remoteUrl }}
      - title: Open in Catalog
        icon: catalog
        entityRef: ${{ steps.register.output.entityRef }}
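
It is worth testing the name pattern from the template before shipping it, because regular expressions often accept or reject more than intended. The Python sketch below copies the pattern verbatim and probes a few inputs. The helper name is_valid_service_name is an illustrative assumption.

```python
import re

# Pattern copied from the template's "name" parameter.
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9-]+$")


def is_valid_service_name(name: str) -> bool:
    """Return True when the whole string satisfies the template's pattern."""
    return NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("payment-service", "Payment", "1service", "p", "svc-"):
        print(f"{candidate!r}: {is_valid_service_name(candidate)}")
```

Two behaviors stand out: single-character names are rejected because the pattern requires at least two characters, and a trailing hyphen such as svc- is accepted. If either is undesirable for your repository or DNS naming rules, tighten the pattern in the template.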

💡 Template Best Practices

Keep templates opinionated but flexible. Every template should include a README with onboarding steps, observability instrumentation, security scanning in CI, resource limits, and health checks. The goal is roughly 80% standardization with 20% flexibility for edge cases.

Key Backstage Plugins for 2026

The plugin ecosystem has expanded significantly. Essential plugins include:

Plugin	Purpose	Configuration
GitHub	Repo integration, PR visibility	GitHub App or PAT
Kubernetes	Cluster resource visibility	Service account with limited RBAC
ArgoCD	GitOps deployment status	API token
Prometheus	Metrics and alerting	Prometheus URL
TechRadar	Technology adoption tracking	YAML configuration
Cost Insights	Cloud cost visibility	Billing API integration
API Docs	OpenAPI/AsyncAPI documentation	Spec URLs in annotations
Soundcheck	Service maturity scoring	Custom rules

Designing Golden Paths: The Art of Paved Roads

Golden paths are the key abstraction in platform engineering. They're not prescriptive mandates—they're well-documented, supported paths that make the right thing easy.

The Golden Path Philosophy

A well-designed golden path:

Removes decision fatigue: Pre-configured, secure defaults
Encodes expertise: Best practices built-in
Provides escape hatches: Deviations possible with justification
Is actively maintained: Regular updates, security patches
Includes observability: Monitoring and alerting out-of-box
Has clear ownership: Platform team maintains the template

Example: Complete Golden Path for Node.js Services

Here's what a comprehensive golden path includes:

# Project Structure
delivery-service/
├── src/
│   ├── index.js
│   ├── routes/
│   ├── middleware/
│   └── utils/
├── tests/
│   ├── unit/
│   └── integration/
├── infra/
│   ├── kubernetes/
│   │   ├── deployment.yaml
│   │   ├── service.yaml
│   │   ├── hpa.yaml
│   │   └── network-policy.yaml
│   └── terraform/
│       └── main.tf
├── .github/
│   └── workflows/
│       ├── ci.yaml          # Test, lint, security scan
│       ├── cd-staging.yaml   # Deploy to staging
│       └── cd-prod.yaml      # Deploy to prod with approval
├── catalog-info.yaml
├── mkdocs.yml               # TechDocs configuration
├── README.md
├── CONTRIBUTING.md
├── Dockerfile
├── docker-compose.yml       # Local development
├── .eslintrc.json
├── .prettierrc
├── jest.config.js
└── package.json

# Key Files

# Dockerfile - Production hardened
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
# Runtime dependencies only (--omit=dev replaces the deprecated --only=production)
RUN npm ci --omit=dev

FROM node:20-alpine AS production
RUN apk add --no-cache dumb-init
ENV NODE_ENV=production
ENV PORT=3000
WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
# Copy only what the service needs at runtime
COPY package*.json ./
COPY src ./src
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nodejs -u 1001 -G nodejs
# Numeric UID so Kubernetes can verify runAsNonRoot
USER 1001
EXPOSE 3000
CMD ["dumb-init", "node", "src/index.js"]

# infra/kubernetes/deployment.yaml (image values use Helm-style placeholders)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: delivery-service
  labels:
    app: delivery-service
    version: "{{ .Values.image.tag }}"
spec:
  replicas: 3
  selector:
    matchLabels:
      app: delivery-service
  template:
    metadata:
      labels:
        app: delivery-service
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "3000"
    spec:
      serviceAccountName: delivery-service
      securityContext:
        runAsNonRoot: true
        seccompProfile:
          type: RuntimeDefault
      containers:
      - name: delivery-service
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        ports:
        - containerPort: 3000
          name: http
        env:
        - name: NODE_ENV
          value: "production"
        - name: OTEL_EXPORTER_OTLP_ENDPOINT
          value: "http://otel-collector:4317"
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /health/live
            port: 3000
          initialDelaySeconds: 10
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health/ready
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 5
        securityContext:
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities:
            drop:
            - ALL
        volumeMounts:
        - name: tmp
          mountPath: /tmp
      volumes:
      - name: tmp
        emptyDir: {}
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - delivery-service
              topologyKey: kubernetes.io/hostname

This manifest covers hardened security settings, an OpenTelemetry exporter endpoint, resource requests and limits, liveness and readiness probes, and soft anti-affinity rules for high availability. Horizontal pod autoscaling and network policy live in the separate hpa.yaml and network-policy.yaml files shown in the project tree.
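
A golden path stays golden only if services keep meeting its baseline after they are scaffolded. One lightweight way to verify that is a lint step over the rendered manifest. The Python sketch below checks a Deployment, represented as a plain dictionary, for a few of the settings used above. The function name check_baseline and the specific rule set are assumptions made for illustration. In practice a policy engine such as Kyverno or OPA/Gatekeeper would usually enforce these rules.

```python
def check_baseline(deployment: dict) -> list:
    """Return human-readable findings; an empty list means the baseline is met."""
    findings = []
    pod_spec = deployment["spec"]["template"]["spec"]

    # Pod-level check: containers must not run as root.
    if not pod_spec.get("securityContext", {}).get("runAsNonRoot"):
        findings.append("pod: runAsNonRoot is not set")

    for container in pod_spec.get("containers", []):
        name = container.get("name", "<unnamed>")
        ctx = container.get("securityContext", {})
        if ctx.get("allowPrivilegeEscalation") is not False:
            findings.append(f"{name}: allowPrivilegeEscalation must be false")
        if not ctx.get("readOnlyRootFilesystem"):
            findings.append(f"{name}: readOnlyRootFilesystem is not set")
        if "limits" not in container.get("resources", {}):
            findings.append(f"{name}: resource limits are missing")
        for probe in ("livenessProbe", "readinessProbe"):
            if probe not in container:
                findings.append(f"{name}: {probe} is missing")
    return findings


if __name__ == "__main__":
    # Deliberately bare manifest to show what the findings look like.
    bare = {"spec": {"template": {"spec": {"containers": [{"name": "app"}]}}}}
    for finding in check_baseline(bare):
        print(finding)
```

The share of services for which this list comes back empty is one way to put a number on a security baseline.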

Port and Kratix: Alternatives to Backstage

While Backstage is powerful, it's not the only option. Let's explore alternatives:

Port: Managed IDP with Faster Time-to-Value

Port is a commercial SaaS IDP that accelerates platform adoption without infrastructure overhead:

# Port Blueprint Example (similar to Backstage's catalog)
{
  "identifier": "microservice",
  "title": "Microservice",
  "description": "A microservice component",
  "icon": "Service",
  "schema": {
    "properties": {
      "language": {
        "type": "string",
        "title": "Language",
        "enum": ["Java", "Python", "Go", "Node.js"]
      },
      "tier": {
        "type": "string",
        "title": "Service Tier",
        "enum": ["Critical", "Standard", "Experimental"]
      },
      "owner": {
        "type": "string",
        "title": "Owner",
        "format": "user"
      },
      "oncall": {
        "type": "string",
        "title": "On-call Rotation",
        "format": "url"
      }
    },
    "required": ["owner", "language"]
  },
  "relations": {
    "runsOn": {
      "title": "Runs On",
      "target": "cluster",
      "many": true
    },
    "dependsOn": {
      "title": "Depends On",
      "target": "microservice",
      "many": true
    }
  }
}
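
The schema section of a blueprint acts as a contract for every entity created from it: required properties must be present and enum properties must use one of the listed values. To make that concrete, here is a local Python sketch that applies those two rules to a candidate entity. It is not Port's validation logic or API. The validate_properties function is a hypothetical helper, and the schema is a trimmed copy of the blueprint above.

```python
# Trimmed copy of the blueprint's "schema" section, for illustration only.
BLUEPRINT_SCHEMA = {
    "properties": {
        "language": {"type": "string", "enum": ["Java", "Python", "Go", "Node.js"]},
        "tier": {"type": "string", "enum": ["Critical", "Standard", "Experimental"]},
        "owner": {"type": "string"},
        "oncall": {"type": "string"},
    },
    "required": ["owner", "language"],
}


def validate_properties(props: dict, schema: dict) -> list:
    """Check required keys and enum membership; return a list of error strings."""
    errors = []
    for key in schema.get("required", []):
        if key not in props:
            errors.append(f"missing required property: {key}")
    for key, value in props.items():
        spec = schema["properties"].get(key)
        if spec is None:
            errors.append(f"unknown property: {key}")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"{key}: {value!r} is not one of {spec['enum']}")
    return errors


if __name__ == "__main__":
    print(validate_properties({"owner": "payments-team", "language": "Java"}, BLUEPRINT_SCHEMA))
    print(validate_properties({"language": "Rust"}, BLUEPRINT_SCHEMA))
```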

Port vs Backstage:

Factor	Port	Backstage
Setup Time	Hours	Weeks
Infrastructure	Fully managed	Self-hosted
Customization	Moderate (JSON/YAML)	Unlimited (React/TypeScript)
Cost Model	Per-developer SaaS	Infrastructure + maintenance
Best For	Teams wanting fast ROI	Large orgs needing customization

Kratix: Kubernetes-Native Platform Building

Kratix takes a different approach—instead of a portal UI, it leverages Kubernetes APIs and GitOps:

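# Kratix Promise (Platform API)
apiVersion: platform.kratix.io/v1alpha1
kind: Promise
metadata:
  name: postgresql-database
spec:
  # What users request: the API is declared as a standard Kubernetes CRD
  api:
    apiVersion: apiextensions.k8s.io/v1
    kind: CustomResourceDefinition
    metadata:
      name: databases.marketplace.example.com
    spec:
      group: marketplace.example.com
      scope: Namespaced
      names:
        kind: Database
        plural: databases
        singular: database
      versions:
        - name: v1
          served: true
          storage: true
          schema:
            openAPIV3Schema:
              type: object
              properties:
                spec:
                  type: object
                  properties:
                    size:
                      type: string
                      enum:
                        - small    # 10GB storage, 1 CPU
                        - medium   # 100GB storage, 2 CPU
                        - large    # 1TB storage, 4 CPU
                      default: small
                    backupEnabled:
                      type: boolean
                      default: true

  # What gets created: pipelines that run when a resource is requested
  workflows:
    resource:
      configure:
        - apiVersion: platform.kratix.io/v1alpha1
          kind: Pipeline
          metadata:
            name: database-provision
          spec:
            containers:
              - image: kratix-platform-io/postgresql-pipeline:v1
                name: provisioner

  # Where to deploy
  destinationSelectors:
    - matchLabels:
        environment: production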

Kratix is ideal for teams already invested in Kubernetes and GitOps. It treats platform capabilities as Kubernetes resources, enabling powerful composition and automation patterns.
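
The pipeline container is where a user's request gets translated into concrete resources. As a sketch of the kind of logic such a container might hold, the Python below maps the size tiers from the Promise comments to storage and CPU values. The resolve_request function and the output field names are hypothetical. A real pipeline would read the request from its input and write rendered manifests to its output, which is omitted here.

```python
# Size tiers taken from the comments in the Promise above.
SIZES = {
    "small": {"storage": "10GB", "cpu": 1},
    "medium": {"storage": "100GB", "cpu": 2},
    "large": {"storage": "1TB", "cpu": 4},
}


def resolve_request(spec: dict) -> dict:
    """Turn a Database request spec into concrete sizing, applying defaults."""
    size = spec.get("size", "small")
    if size not in SIZES:
        raise ValueError(f"unknown size: {size!r}")
    return {**SIZES[size], "backup_enabled": spec.get("backupEnabled", True)}


if __name__ == "__main__":
    print(resolve_request({}))
    print(resolve_request({"size": "large", "backupEnabled": False}))
```

Keeping the tier table in one place means the platform team can change what "medium" means without touching any consumer's request.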

Measuring Platform Success

Platform teams need metrics that demonstrate value. A practical scorecard combines three groups:

Developer Productivity Metrics (DORA)

Deployment Frequency: How often teams deploy to production
Lead Time for Changes: Time from commit to running in production
Change Failure Rate: Percentage of deployments causing incidents
Time to Restore Service: Mean time to recovery (MTTR)
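
These four metrics can be computed from deployment records that most CI/CD systems already hold. The Python sketch below shows one way to do it. The Deployment record shape and the sample data are assumptions made for illustration, and time to restore is approximated as the gap between the failing deployment and the restore, which will understate incidents detected late.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import List, Optional


@dataclass
class Deployment:
    committed_at: datetime
    deployed_at: datetime
    caused_incident: bool = False
    restored_at: Optional[datetime] = None


def dora_summary(deployments: List[Deployment], window_days: int) -> dict:
    """Compute the four DORA metrics over a reporting window."""
    if not deployments or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.caused_incident]
    restore_times = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    mean_restore = (sum(restore_times, timedelta()) / len(restore_times)
                    if restore_times else None)
    return {
        "deployments_per_day": len(deployments) / window_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "mean_time_to_restore": mean_restore,
    }


# Hypothetical week of deployments for one service.
SAMPLE = [
    Deployment(datetime(2026, 3, 1, 9), datetime(2026, 3, 1, 13)),
    Deployment(datetime(2026, 3, 2, 10), datetime(2026, 3, 2, 12)),
    Deployment(datetime(2026, 3, 3, 8), datetime(2026, 3, 3, 16),
               caused_incident=True, restored_at=datetime(2026, 3, 3, 17, 30)),
    Deployment(datetime(2026, 3, 5, 9), datetime(2026, 3, 5, 15)),
]

if __name__ == "__main__":
    for metric, value in dora_summary(SAMPLE, window_days=7).items():
        print(f"{metric}: {value}")
```

The median is used for lead time because a single long-running change would otherwise dominate the average.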

Platform Adoption Metrics

Template Usage: Services created via golden paths vs ad-hoc
Catalog Completeness: Percentage of services documented
Time to Provision: Minutes from request to running resource
Developer Satisfaction: Periodic surveys (NPS-style)
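
Most adoption metrics are simple ratios, which makes them easy to automate from catalog exports and survey results. The Python sketch below covers catalog completeness and an NPS-style satisfaction score. The function names, the choice of owner and description as the completeness fields, and the sample inputs are all illustrative assumptions.

```python
def percentage(part: int, whole: int) -> float:
    """Percentage rounded to one decimal; 0.0 when there is nothing to measure."""
    return 0.0 if whole == 0 else round(100 * part / whole, 1)


def catalog_completeness(entities: list, required=("owner", "description")) -> float:
    """Share of entities that have a non-empty value for every required field."""
    complete = sum(1 for e in entities if all(e.get(f) for f in required))
    return percentage(complete, len(entities))


def nps(scores: list) -> int:
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    if not scores:
        raise ValueError("no survey responses")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return round(100 * (promoters - detractors) / len(scores))


if __name__ == "__main__":
    # Hypothetical inputs: 42 of 60 new services came from templates.
    print("template usage %:", percentage(42, 60))
    entities = [
        {"owner": "payments-team", "description": "Payment processing"},
        {"owner": "dba-team"},
    ]
    print("catalog completeness %:", catalog_completeness(entities))
    print("NPS:", nps([10, 9, 9, 8, 7, 6, 3]))
```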

Operational Excellence Metrics

Platform Uptime: SLO for IDP availability
Template Success Rate: Percentage of scaffolded services building/deploying successfully
Security Posture: Percentage of services meeting security baselines
Cost Attribution Accuracy: Tagged vs untagged resources

⚠️ The Platform Team Anti-Pattern

Avoid becoming a ticket-driven operations team. If developers are submitting tickets for routine operations ("create a database", "add a secret"), your platform isn't self-service enough. The goal is tickets for exceptions, not routine work.

Production Deployment Checklist

Before launching your IDP to the entire organization:

☐ Authentication & Authorization
  ☐ SSO integration tested
  ☐ RBAC policies defined per team
  ☐ Catalog permissions verified
  
☐ Scalability & Performance
  ☐ Load tested with expected concurrent users
  ☐ Database connection pooling configured
  ☐ Redis cache enabled
  ☐ Horizontal scaling documented
  
☐ Security & Compliance
  ☐ Secrets in external vault (not config files)
  ☐ Container images scanned
  ☐ Network policies configured
  ☐ Audit logging enabled
  ☐ GDPR/privacy compliance verified
  
☐ Reliability
  ☐ Database backups scheduled
  ☐ Health check endpoints defined
  ☐ Monitoring and alerting configured
  ☐ Runbooks for common failures
  ☐ Disaster recovery tested
  
☐ Developer Experience
  ☐ At least 3 golden paths ready
  ☐ Documentation portal populated
  ☐ Onboarding guide for new teams
  ☐ Slack/Teams support channel established
  ☐ Feedback mechanism in place
  
☐ Migration & Adoption
  ☐ Pilot team identified and onboarded
  ☐ Migration path from legacy catalog
  ☐ Training sessions scheduled
  ☐ Success metrics baseline established

Conclusion

Platform engineering with Backstage (or Port, or Kratix) represents a fundamental shift in how organizations approach developer experience. By treating internal platforms as products—with user research, roadmaps, and clear value propositions—you create an environment where developers can focus on business logic rather than infrastructure plumbing.

The investment is significant: expect 3-6 months to reach production-grade platform status with Backstage. But the returns are substantial: reduced onboarding time, improved security posture, faster delivery, and happier developers.

Start small. Pick one golden path—perhaps the most common service type in your organization—and make it genuinely delightful to use. Iterate based on feedback. Expand incrementally. Platform engineering is a journey, not a destination, and the best platforms evolve continuously with their users' needs.

The future belongs to organizations that enable their developers to ship safely and quickly. The Internal Developer Portal is the foundation of that future.