The first time you need to deploy a containerised service in production, you run into the same wall. The container crashes overnight and nobody restarts it. You deploy a new version and there’s 20 seconds of downtime. Traffic doubles and you can’t scale fast enough. Two services can’t find each other reliably without hardcoding IPs.
Docker solves the packaging problem — one container image that runs the same everywhere. But it doesn’t solve the operations problem: keeping services alive, updated, discoverable, and scaled. That’s what Kubernetes solves.
## The one mental model that explains everything
Kubernetes is a reconciliation loop. You write YAML that declares the desired state of your system — “I want 3 replicas of this container, always running, exposed on port 80”. The control plane reads this declaration and continuously moves actual state toward desired state.
Container crashed? Start a new one. Node died? Reschedule the pod elsewhere. New deployment pushed? Roll it out without taking down the old version first.
You don’t write imperative commands (“start this container now”). You write a declaration (“this is how the system should look”) and Kubernetes enforces it continuously.
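The reconciliation loop can be sketched in a few lines of Python. This is purely illustrative (the function and field names are hypothetical; real controllers watch the API server rather than diffing dicts), but it captures the shape of the idea: compare desired state to observed state and emit the actions that close the gap.

```python
# Minimal sketch of a reconciliation loop (illustrative, not real controller code).
# Desired state is declared once; the loop continuously converges actual state toward it.

def reconcile(desired: dict, actual: dict) -> list[str]:
    """Return the actions needed to move `actual` toward `desired`."""
    actions = []
    want = desired["replicas"]
    have = len(actual["running_pods"])
    if have < want:
        # Too few pods (e.g. one crashed overnight): start replacements.
        actions += [f"start pod {i}" for i in range(have, want)]
    elif have > want:
        # Too many pods (e.g. after scaling down): stop the surplus.
        actions += [f"stop pod {i}" for i in range(want, have)]
    return actions

# Declared: 3 replicas. Observed: one pod crashed, only two remain.
desired = {"replicas": 3}
actual = {"running_pods": ["pod-0", "pod-1"]}
print(reconcile(desired, actual))  # one "start pod" action restores the count
```

Kubernetes runs this kind of loop for every object type, forever, which is why a crashed container reappears without anyone paging you.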
## The core objects
### Pod — the unit of scheduling
The smallest deployable unit. One or more containers sharing a network namespace, storage volumes, and lifecycle. Pods are ephemeral — treat them as cattle, not pets. They crash, they restart, they move to different nodes.
You almost never create Pods directly. You create a Deployment, which creates a ReplicaSet, which manages Pods.
```yaml
# You rarely write this directly — it's here to show the structure
apiVersion: v1
kind: Pod
metadata:
  name: api-pod
  labels:
    app: api
spec:
  containers:
    - name: api
      image: myrepo/api:1.2.3
      ports:
        - containerPort: 8000
      env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: api-secrets
              key: DATABASE_URL
```

### Deployment — desired state for your service
The object you actually use day-to-day. Declare how many replicas you want, which container image, resource limits, and the rollout strategy. Kubernetes handles the rest.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # zero downtime: no pods removed until new ones are ready
      maxSurge: 1         # at most 1 extra pod during rollout
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: myrepo/api:1.2.3
          ports:
            - containerPort: 8000
          resources:
            requests:
              cpu: "100m"       # 0.1 CPU cores guaranteed
              memory: "128Mi"
            limits:
              cpu: "500m"       # hard cap at 0.5 cores
              memory: "512Mi"
          readinessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 5
            periodSeconds: 10
```

The `readinessProbe` matters: Kubernetes only routes traffic to a Pod once it passes the health check. No traffic to a container that’s still booting.
`RollingUpdate` means zero-downtime deploys: new pods come up before old ones go down, limited by `maxUnavailable` and `maxSurge`.
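Those two fields define hard bounds on pod counts during a rollout. A simplified numeric sketch (ignoring readiness timing, with absolute counts rather than percentages):

```python
# Simplified model of the bounds maxUnavailable / maxSurge impose during a rollout.
# With replicas=3, maxUnavailable=0, maxSurge=1: old + new pods may total 4,
# but ready pods never drop below 3, so full capacity keeps serving traffic.

def rollout_bounds(replicas: int, max_unavailable: int, max_surge: int):
    min_available = replicas - max_unavailable  # pods that must stay ready
    max_total = replicas + max_surge            # pods that may exist at once
    return min_available, max_total

print(rollout_bounds(3, 0, 1))  # (3, 4): never below 3 ready, never above 4 total
```

The trade-off: a higher `maxSurge` speeds up rollouts at the cost of temporary extra resource usage; a nonzero `maxUnavailable` saves resources but lets capacity dip mid-deploy.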
### Service — stable network identity
Pods have dynamic IPs that change on every restart. A Service provides a stable DNS name and load-balances across all matching pods.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: api
spec:
  selector:
    app: api          # routes traffic to all pods with this label
  ports:
    - port: 80
      targetPort: 8000
  type: ClusterIP     # internal-only; use LoadBalancer for external traffic
```

Inside the cluster, any pod can now reach your API at `http://api:80`. DNS resolution is handled by CoreDNS, which is built into every Kubernetes cluster. No service discovery configuration needed.
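The label-based routing behind a Service is just set matching. A sketch of what endpoint selection does (the pod list and IPs are made up for illustration):

```python
# Sketch of Service endpoint selection: a Service forwards to every pod whose
# labels are a superset of the Service's selector (hypothetical pod data).

def select_endpoints(selector: dict, pods: list[dict]) -> list[str]:
    return [
        pod["ip"]
        for pod in pods
        if all(pod["labels"].get(k) == v for k, v in selector.items())
    ]

pods = [
    {"ip": "10.1.0.4", "labels": {"app": "api"}},
    {"ip": "10.1.0.5", "labels": {"app": "api"}},
    {"ip": "10.1.0.9", "labels": {"app": "worker"}},
]
print(select_endpoints({"app": "api"}, pods))  # ['10.1.0.4', '10.1.0.5']
```

Because membership is recomputed continuously, a restarted pod with a new IP rejoins the Service automatically; nothing else needs to know its address.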
For external traffic, either set `type: LoadBalancer` (the cloud provider provisions a load balancer) or use an Ingress resource for HTTP routing with path rules and TLS termination:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
    - host: api.mycompany.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api
                port:
                  number: 80
```

### ConfigMap and Secret — config outside the image
Never bake configuration into container images. That makes images environment-specific and forces a rebuild for every config change.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: api-config
data:
  LOG_LEVEL: "info"
  FEATURE_FLAGS: "new_dashboard=true"
---
apiVersion: v1
kind: Secret
metadata:
  name: api-secrets
type: Opaque
stringData:
  DATABASE_URL: "postgresql://user:password@db:5432/mydb"
  JWT_SECRET: "change-in-production"
```

Reference them in the Deployment spec:
```yaml
envFrom:
  - configMapRef:
      name: api-config
  - secretRef:
      name: api-secrets
```

In production, Secrets should be backed by a secrets manager (AWS Secrets Manager, HashiCorp Vault, Azure Key Vault). Kubernetes Secrets are base64-encoded in their manifest form and, without additional configuration, stored unencrypted in etcd. Managed K8s services (GKE, EKS, AKS) typically enable encryption at rest, but anyone with RBAC permission to read Secrets still sees the plaintext values.
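The "base64 is encoding, not encryption" point is easy to demonstrate: anyone who can read the manifest can recover the plaintext with a single standard-library call.

```python
# Kubernetes Secrets in `data:` form are base64-encoded, not encrypted:
# anyone who can read the manifest (or etcd, without encryption at rest)
# can recover the plaintext trivially.
import base64

plaintext = "postgresql://user:password@db:5432/mydb"
encoded = base64.b64encode(plaintext.encode()).decode()  # what ends up in `data:`
decoded = base64.b64decode(encoded).decode()             # one call to reverse it

print(encoded)
print(decoded)  # the original connection string, password included
```

This is why the secrets-manager backing matters: base64 only protects against accidental shoulder-surfing, not against anyone with read access.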
### Horizontal Pod Autoscaler
Scale automatically based on CPU, memory, or custom metrics:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

When average CPU across pods crosses 70%, Kubernetes adds replicas. When load drops, it scales back down to the minimum. This is the difference between paying for peak capacity 24/7 and paying for actual usage.
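The core of the HPA algorithm is one formula: `desired = ceil(currentReplicas * currentMetric / targetMetric)`, clamped to the min/max bounds. A simplified sketch (the real controller adds a tolerance band and a stabilization window to avoid flapping):

```python
# Simplified HPA scaling rule:
#   desired = ceil(current_replicas * current_metric / target_metric)
# clamped to [minReplicas, maxReplicas]. Omits the real controller's
# tolerance band and stabilization window.
from math import ceil

def desired_replicas(current: int, cpu_now: float, cpu_target: float,
                     lo: int, hi: int) -> int:
    desired = ceil(current * cpu_now / cpu_target)
    return max(lo, min(hi, desired))  # clamp to minReplicas / maxReplicas

print(desired_replicas(3, 90, 70, 2, 20))  # load spike: 3 pods at 90% -> 4 pods
print(desired_replicas(3, 20, 70, 2, 20))  # load drop: floor at minReplicas -> 2
```

Note the proportionality: the further utilization is from target, the bigger the scaling step, so a sudden traffic spike doesn't get chased one replica at a time.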
## A complete deploy workflow
```bash
# 1. Build and push the image
docker build -t myrepo/api:v1.1.0 .
docker push myrepo/api:v1.1.0

# 2. Update the image tag in the Deployment manifest, then apply
kubectl apply -f k8s/
# (or imperatively: kubectl set image deployment/api api=myrepo/api:v1.1.0)

# 3. Watch the rollout
kubectl rollout status deployment/api

# 4. If something goes wrong
kubectl rollout undo deployment/api

# 5. Check logs
kubectl logs -l app=api --tail=100 -f

# 6. Get a shell in a running pod
kubectl exec -it deploy/api -- /bin/sh
```

`kubectl rollout undo` is the most useful of these in production: it rolls back to the previous ReplicaSet instantly, no rebuild or re-deploy needed.
## Namespace isolation
Namespaces are virtual clusters within a cluster. Use them to separate environments or teams:
```bash
kubectl create namespace staging
kubectl apply -f k8s/ -n staging

# Production gets its own namespace
kubectl apply -f k8s/ -n production
```

Resource quotas can be applied per namespace to prevent one team from consuming all cluster resources.
## When Kubernetes is overkill
Kubernetes is not the right answer for every service. The operational complexity is real — you need to understand networking, RBAC, storage classes, cluster upgrades, and monitoring. For a small team or a simple service:
| Situation | Better option |
|---|---|
| Single service, predictable load | Railway, Render, Fly.io |
| Serverless / event-driven | AWS Lambda, Cloud Run |
| Simple containers, no auto-scaling | AWS ECS, Azure Container Apps |
| Early-stage startup, small team | Managed K8s with minimal config (GKE Autopilot, EKS Fargate) |
Kubernetes pays back its complexity when you have multiple services that need to discover each other, variable traffic where auto-scaling saves real money, or independent teams deploying at different cadences. At the ECB and in production DeFi infrastructure, those conditions were all met.
## Why this matters in engineering interviews
The question “how do you deploy to production?” is a signal question in senior backend interviews. “We use Kubernetes” is expected. What interviewers actually want to hear is:
- Why not just Docker Compose? Compose has `restart: always` for basic self-healing, but lacks rolling updates, horizontal auto-scaling, cross-node rescheduling, and service discovery across hosts.
- How do services find each other? CoreDNS resolves Service names to stable ClusterIP addresses. No hardcoded IPs, no service registry to maintain.
- How do you handle secrets? Kubernetes Secrets backed by a secrets manager. Never in environment variables checked into source control.
- How do you deploy without downtime? A `RollingUpdate` strategy with `maxUnavailable: 0` and health checks — new pods receive traffic only after passing readiness probes.
Understanding the reconciliation loop, the object hierarchy (Deployment → ReplicaSet → Pod), and how Services abstract Pod discovery explains 90% of what Kubernetes does day-to-day.