
Kubernetes for Backend Engineers — Pods, Deployments, and Services Without the Jargon

Posted on: September 10, 2025 at 09:00 AM

The first time you need to deploy a containerised service in production, you run into the same walls. The container crashes overnight and nobody restarts it. You deploy a new version and there are 20 seconds of downtime. Traffic doubles and you can’t scale fast enough. Two services can’t find each other reliably without hardcoding IPs.

Docker solves the packaging problem — one container image that runs the same everywhere. But it doesn’t solve the operations problem: keeping services alive, updated, discoverable, and scaled. That’s what Kubernetes solves.

The one mental model that explains everything

Kubernetes is a reconciliation loop. You write YAML that declares the desired state of your system — “I want 3 replicas of this container, always running, exposed on port 80”. The control plane reads this declaration and continuously moves actual state toward desired state.

Container crashed? Start a new one. Node died? Reschedule the pod elsewhere. New deployment pushed? Roll it out without taking down the old version first.

You don’t write imperative commands (“start this container now”). You write a declaration (“this is how the system should look”) and Kubernetes enforces it continuously.


The core objects

Pod — the unit of scheduling

The smallest deployable unit. One or more containers sharing a network namespace, storage volumes, and lifecycle. Pods are ephemeral — treat them as cattle, not pets. They crash, they restart, they move to different nodes.

You almost never create Pods directly. You create a Deployment, which creates a ReplicaSet, which manages Pods.

# You rarely write this directly — it's here to show the structure
apiVersion: v1
kind: Pod
metadata:
  name: api-pod
  labels:
    app: api
spec:
  containers:
    - name: api
      image: myrepo/api:1.2.3
      ports:
        - containerPort: 8000
      env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: api-secrets
              key: DATABASE_URL

Deployment — desired state for your service

The object you actually use day-to-day. Declare how many replicas you want, which container image, resource limits, and the rollout strategy. Kubernetes handles the rest.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # zero downtime: no pods removed until new ones are ready
      maxSurge: 1         # at most 1 extra pod during rollout
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: myrepo/api:1.2.3
          ports:
            - containerPort: 8000
          resources:
            requests:
              cpu: "100m"       # 0.1 CPU cores guaranteed
              memory: "128Mi"
            limits:
              cpu: "500m"       # hard cap at 0.5 cores
              memory: "512Mi"
          readinessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 5
            periodSeconds: 10

The readinessProbe matters: Kubernetes only routes traffic to a Pod once it passes the health check. No traffic to a container that’s still booting.

RollingUpdate means zero-downtime deploys: new pods come up before old ones go down, limited by maxUnavailable and maxSurge.
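A readinessProbe gates traffic, but it never restarts a stuck container; that is the job of a livenessProbe. A sketch of one you might add next to the readiness check in the container spec above, reusing the same /health endpoint (the exact timings are illustrative):

```yaml
# Sketch: sits alongside readinessProbe in the container spec.
# Restarts the container if it stops answering its health endpoint.
livenessProbe:
  httpGet:
    path: /health           # same endpoint as the readiness check above
    port: 8000
  initialDelaySeconds: 15   # give the app time to boot before the first check
  periodSeconds: 20
  failureThreshold: 3       # restart only after 3 consecutive failures
```

The distinction: a failed readiness check removes the pod from the Service’s endpoints, while a failed liveness check kills and restarts the container. Keeping liveness thresholds looser than readiness avoids restart loops under transient load.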

Service — stable network identity

Pods have dynamic IPs that change on every restart. A Service provides a stable DNS name and load-balances across all matching pods.

apiVersion: v1
kind: Service
metadata:
  name: api
spec:
  selector:
    app: api        # routes traffic to all pods with this label
  ports:
    - port: 80
      targetPort: 8000
  type: ClusterIP   # internal-only; use LoadBalancer for external traffic

Inside the cluster, any pod can now reach your API at http://api:80 (or, from another namespace, at the fully qualified http://api.&lt;namespace&gt;.svc.cluster.local). DNS resolution is handled by CoreDNS, which runs in every standard Kubernetes cluster. No service discovery configuration needed.

For external traffic, either set type: LoadBalancer (cloud provider creates a load balancer) or use an Ingress resource for HTTP routing with path rules and TLS termination:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
    - host: api.mycompany.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api
                port:
                  number: 80
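The Ingress above handles routing but not the TLS termination mentioned earlier. A sketch of the tls section, assuming a certificate has already been stored in a Secret named api-tls (that name is hypothetical; the Secret would typically be created by cert-manager or kubectl create secret tls):

```yaml
# Sketch: add under spec: in the Ingress above.
# Assumes a Secret named api-tls holding tls.crt / tls.key.
tls:
  - hosts:
      - api.mycompany.com
    secretName: api-tls   # certificate and key for this host
```

With this in place the ingress controller terminates HTTPS at the edge and forwards plain HTTP to the Service.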

ConfigMap and Secret — config outside the image

Never bake configuration into container images. That makes images environment-specific and forces a rebuild for every config change.

apiVersion: v1
kind: ConfigMap
metadata:
  name: api-config
data:
  LOG_LEVEL: "info"
  FEATURE_FLAGS: "new_dashboard=true"
---
apiVersion: v1
kind: Secret
metadata:
  name: api-secrets
type: Opaque
stringData:
  DATABASE_URL: "postgresql://user:password@db:5432/mydb"
  JWT_SECRET: "change-in-production"

Reference them in the Deployment spec:

envFrom:
  - configMapRef:
      name: api-config
  - secretRef:
      name: api-secrets
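envFrom injects values once, at container start, so a changed ConfigMap only takes effect after a pod restart. An alternative sketch: mounting the ConfigMap as files, which the kubelet refreshes in place over time (your app still has to re-read them, and subPath mounts are not refreshed):

```yaml
# Sketch: goes in the pod template of the Deployment above.
# Each ConfigMap key becomes a file, e.g. /etc/api-config/LOG_LEVEL.
spec:
  containers:
    - name: api
      image: myrepo/api:1.2.3
      volumeMounts:
        - name: config
          mountPath: /etc/api-config
          readOnly: true
  volumes:
    - name: config
      configMap:
        name: api-config
```

Environment variables are simpler; file mounts buy you live-ish reloads and work better for larger config files.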

In production, Secrets should be backed by a secrets manager (AWS Secrets Manager, HashiCorp Vault, Azure Key Vault). Kubernetes Secrets are base64-encoded in their manifest form and, without additional configuration, stored unencrypted in etcd. Managed K8s services (GKE, EKS, AKS) typically enable encryption at rest, but RBAC access still exposes plaintext values.


Horizontal Pod Autoscaler

Scale automatically based on CPU, memory, or custom metrics:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

When average CPU across pods crosses 70%, Kubernetes adds replicas. When load drops, it scales back down to the minimum. This is the difference between paying for peak capacity 24/7 and paying for actual usage.
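You usually want fast scale-up and slow scale-down, so a brief dip in traffic doesn’t immediately evict capacity. The autoscaling/v2 API exposes this through the optional behavior field; a sketch with illustrative windows:

```yaml
# Sketch: add under spec: in the HPA above.
behavior:
  scaleUp:
    stabilizationWindowSeconds: 0     # react to load spikes immediately
  scaleDown:
    stabilizationWindowSeconds: 300   # require 5 min of low load before removing pods
```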


A complete deploy workflow

# 1. Build and push the image
docker build -t myrepo/api:v1.1.0 .
docker push myrepo/api:v1.1.0
# 2. Update the image tag in the Deployment manifest, then apply
kubectl apply -f k8s/
# (or imperatively: kubectl set image deployment/api api=myrepo/api:v1.1.0)
# 3. Watch the rollout
kubectl rollout status deployment/api
# 4. If something goes wrong
kubectl rollout undo deployment/api
# 5. Check logs
kubectl logs -l app=api --tail=100 -f
# 6. Get a shell in a running pod
kubectl exec -it deploy/api -- /bin/sh

kubectl rollout undo is the most useful command in production. It rolls back to the previous ReplicaSet instantly — no rebuild or re-deploy needed.
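Rollback works because the Deployment keeps its old ReplicaSets around, scaled to zero — 10 of them by default. How far back undo can reach is tunable; a sketch added to the Deployment spec:

```yaml
# Sketch: add under spec: in the Deployment above (default is 10).
revisionHistoryLimit: 5   # keep the last 5 ReplicaSets available for rollback
```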


Namespace isolation

Namespaces are virtual clusters within a cluster. Use them to separate environments or teams:

kubectl create namespace staging
kubectl apply -f k8s/ -n staging
# Production stays in the default namespace (or its own namespace)
kubectl apply -f k8s/ -n production

Resource quotas can be applied per namespace to prevent one team from consuming all cluster resources.
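A minimal sketch of such a quota for the staging namespace created above; the name and values are illustrative:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: staging-quota     # hypothetical name
  namespace: staging
spec:
  hard:
    requests.cpu: "4"     # total CPU requests across the namespace
    requests.memory: "8Gi"
    limits.cpu: "8"
    limits.memory: "16Gi"
    pods: "20"            # cap on pod count
```

Once a compute quota is active, pods in that namespace must declare resource requests and limits for the quota’d resources, or the API server rejects them.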


When Kubernetes is overkill

Kubernetes is not the right answer for every service. The operational complexity is real — you need to understand networking, RBAC, storage classes, cluster upgrades, and monitoring. For a small team or a simple service:

Situation                           | Better option
Single service, predictable load    | Railway, Render, Fly.io
Serverless / event-driven           | AWS Lambda, Cloud Run
Simple containers, no auto-scaling  | AWS ECS, Azure Container Apps
Early-stage startup, small team     | Managed K8s with minimal config (GKE Autopilot, EKS Fargate)

Kubernetes pays back its complexity when you have multiple services that need to discover each other, variable traffic where auto-scaling saves real money, or independent teams deploying at different cadences. At the ECB and in production DeFi infrastructure, those conditions were all met.


Why this matters in engineering interviews

The question “how do you deploy to production?” is a signal question in senior backend interviews. “We use Kubernetes” is the expected answer. What interviewers actually want to hear is that you understand the model: the reconciliation loop, the object hierarchy (Deployment → ReplicaSet → Pod), and how Services abstract Pod discovery. Those three ideas explain 90% of what Kubernetes does day-to-day.