The first time you need to deploy a containerised service in production, you run into the same wall. The container crashes overnight and nobody restarts it. You deploy a new version and there’s 20 seconds of downtime. Traffic doubles and you can’t scale fast enough. Two services can’t find each other reliably without hardcoding IPs.
Docker solves the packaging problem — one container image that runs the same everywhere. But it doesn’t solve the operations problem: keeping services alive, updated, discoverable, and scaled. That’s what Kubernetes solves.
## The one mental model that explains everything
Kubernetes is a reconciliation loop. You write YAML that declares the desired state of your system — “I want 3 replicas of this container, always running, exposed on port 80”. The control plane reads this declaration and continuously moves actual state toward desired state.
Container crashed? Start a new one. Node died? Reschedule the pod elsewhere. New deployment pushed? Roll it out without taking down the old version first.
You don’t write imperative commands (“start this container now”). You write a declaration (“this is how the system should look”) and Kubernetes enforces it continuously.
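The reconciliation loop can be sketched in a few lines of Python. This is purely illustrative (the function and field names are hypothetical; real controllers watch the API server rather than diffing dicts), but it captures the shape of the idea: compare desired state to observed state and emit the actions that close the gap.

```python
# Minimal sketch of a reconciliation loop (illustrative, not real controller code).
# Desired state is declared once; the loop continuously converges actual state toward it.

def reconcile(desired: dict, actual: dict) -> list[str]:
    """Return the actions needed to move `actual` toward `desired`."""
    actions = []
    want = desired["replicas"]
    have = len(actual["running_pods"])
    if have < want:
        # Too few pods (e.g. one crashed overnight): start replacements.
        actions += [f"start pod {i}" for i in range(have, want)]
    elif have > want:
        # Too many pods (e.g. after scaling down): stop the surplus.
        actions += [f"stop pod {i}" for i in range(want, have)]
    return actions

# Declared: 3 replicas. Observed: one pod crashed, only two remain.
desired = {"replicas": 3}
actual = {"running_pods": ["pod-0", "pod-1"]}
print(reconcile(desired, actual))  # one "start pod" action restores the count
```

Kubernetes runs this kind of loop for every object type, forever, which is why a crashed container reappears without anyone paging you.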
## The core objects
### Pod — the unit of scheduling
The smallest deployable unit. One or more containers sharing a network namespace, storage volumes, and lifecycle. Pods are ephemeral — treat them as cattle, not pets. They crash, they restart, they move to different nodes.
You almost never create Pods directly. You create a Deployment, which creates a ReplicaSet, which manages Pods.
```yaml
# You rarely write this directly — it's here to show the structure
apiVersion: v1
kind: Pod
metadata:
  name: api-pod
  labels:
    app: api
spec:
  containers:
    - name: api
      image: myrepo/api:1.2.3
      ports:
        - containerPort: 8000
      env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: api-secrets
              key: DATABASE_URL
```

### Deployment — desired state for your service
The object you actually use day-to-day. Declare how many replicas you want, which container image, resource limits, and the rollout strategy. Kubernetes handles the rest.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # zero downtime: no pods removed until new ones are ready
      maxSurge: 1         # at most 1 extra pod during rollout
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: myrepo/api:1.2.3
          ports:
            - containerPort: 8000
          resources:
            requests:
              cpu: "100m"       # 0.1 CPU cores guaranteed
              memory: "128Mi"
            limits:
              cpu: "500m"       # hard cap at 0.5 cores
              memory: "512Mi"
          readinessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 5
            periodSeconds: 10
```

The `readinessProbe` matters: Kubernetes only routes traffic to a Pod once it passes the health check. No traffic to a container that’s still booting.
`RollingUpdate` means zero-downtime deploys: new pods come up before old ones go down, limited by `maxUnavailable` and `maxSurge`.
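Those two fields define hard bounds on pod counts during a rollout. A simplified numeric sketch (ignoring readiness timing, with absolute counts rather than percentages):

```python
# Simplified model of the bounds maxUnavailable / maxSurge impose during a rollout.
# With replicas=3, maxUnavailable=0, maxSurge=1: old + new pods may total 4,
# but ready pods never drop below 3, so full capacity keeps serving traffic.

def rollout_bounds(replicas: int, max_unavailable: int, max_surge: int):
    min_available = replicas - max_unavailable  # pods that must stay ready
    max_total = replicas + max_surge            # pods that may exist at once
    return min_available, max_total

print(rollout_bounds(3, 0, 1))  # (3, 4): never below 3 ready, never above 4 total
```

The trade-off: a higher `maxSurge` speeds up rollouts at the cost of temporary extra resource usage; a nonzero `maxUnavailable` saves resources but lets capacity dip mid-deploy.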
### Service — stable network identity
Pods have dynamic IPs that change on every restart. A Service provides a stable DNS name and load-balances across all matching pods.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: api
spec:
  selector:
    app: api          # routes traffic to all pods with this label
  ports:
    - port: 80
      targetPort: 8000
  type: ClusterIP     # internal-only; use LoadBalancer for external traffic
```

Inside the cluster, any pod can now reach your API at `http://api:80`. DNS resolution is handled by CoreDNS, which is built into every Kubernetes cluster. No service discovery configuration needed.
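The label-based routing behind a Service is just set matching. A sketch of what endpoint selection does (the pod list and IPs are made up for illustration):

```python
# Sketch of Service endpoint selection: a Service forwards to every pod whose
# labels are a superset of the Service's selector (hypothetical pod data).

def select_endpoints(selector: dict, pods: list[dict]) -> list[str]:
    return [
        pod["ip"]
        for pod in pods
        if all(pod["labels"].get(k) == v for k, v in selector.items())
    ]

pods = [
    {"ip": "10.1.0.4", "labels": {"app": "api"}},
    {"ip": "10.1.0.5", "labels": {"app": "api"}},
    {"ip": "10.1.0.9", "labels": {"app": "worker"}},
]
print(select_endpoints({"app": "api"}, pods))  # ['10.1.0.4', '10.1.0.5']
```

Because membership is recomputed continuously, a restarted pod with a new IP rejoins the Service automatically; nothing else needs to know its address.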
For external traffic, either set `type: LoadBalancer` (the cloud provider provisions a load balancer) or use an Ingress resource for HTTP routing with path rules and TLS termination:
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
    - host: api.mycompany.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api
                port:
                  number: 80
```

### ConfigMap and Secret — config outside the image
Never bake configuration into container images. That makes images environment-specific and forces a rebuild for every config change.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: api-config
data:
  LOG_LEVEL: "info"
  FEATURE_FLAGS: "new_dashboard=true"
---
apiVersion: v1
kind: Secret
metadata:
  name: api-secrets
type: Opaque
stringData:
  DATABASE_URL: "postgresql://user:password@db:5432/mydb"
  JWT_SECRET: "change-in-production"
```

Reference them in the Deployment spec:
```yaml
envFrom:
  - configMapRef:
      name: api-config
  - secretRef:
      name: api-secrets
```

In production, Secrets should be backed by a secrets manager (AWS Secrets Manager, HashiCorp Vault, Azure Key Vault). Kubernetes Secrets are base64-encoded in their manifest form and, without additional configuration, stored unencrypted in etcd. Managed K8s services (GKE, EKS, AKS) typically enable encryption at rest, but anyone with RBAC permission to read Secrets still sees the plaintext values.
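The "base64 is encoding, not encryption" point is easy to demonstrate: anyone who can read the manifest can recover the plaintext with a single standard-library call.

```python
# Kubernetes Secrets in `data:` form are base64-encoded, not encrypted:
# anyone who can read the manifest (or etcd, without encryption at rest)
# can recover the plaintext trivially.
import base64

plaintext = "postgresql://user:password@db:5432/mydb"
encoded = base64.b64encode(plaintext.encode()).decode()  # what ends up in `data:`
decoded = base64.b64decode(encoded).decode()             # one call to reverse it

print(encoded)
print(decoded)  # the original connection string, password included
```

This is why the secrets-manager backing matters: base64 only protects against accidental shoulder-surfing, not against anyone with read access.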
### Horizontal Pod Autoscaler
Scale automatically based on CPU, memory, or custom metrics:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

When average CPU across pods crosses 70%, Kubernetes adds replicas. When load drops, it scales back down to the minimum. This is the difference between paying for peak capacity 24/7 and paying for actual usage.
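The core of the HPA algorithm is one formula: `desired = ceil(currentReplicas * currentMetric / targetMetric)`, clamped to the min/max bounds. A simplified sketch (the real controller adds a tolerance band and a stabilization window to avoid flapping):

```python
# Simplified HPA scaling rule:
#   desired = ceil(current_replicas * current_metric / target_metric)
# clamped to [minReplicas, maxReplicas]. Omits the real controller's
# tolerance band and stabilization window.
from math import ceil

def desired_replicas(current: int, cpu_now: float, cpu_target: float,
                     lo: int, hi: int) -> int:
    desired = ceil(current * cpu_now / cpu_target)
    return max(lo, min(hi, desired))  # clamp to minReplicas / maxReplicas

print(desired_replicas(3, 90, 70, 2, 20))  # load spike: 3 pods at 90% -> 4 pods
print(desired_replicas(3, 20, 70, 2, 20))  # load drop: floor at minReplicas -> 2
```

Note the proportionality: the further utilization is from target, the bigger the scaling step, so a sudden traffic spike doesn't get chased one replica at a time.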
## A complete deploy workflow
```bash
# 1. Build and push the image
docker build -t myrepo/api:v1.1.0 .
docker push myrepo/api:v1.1.0

# 2. Update the image tag in the Deployment manifest, then apply
kubectl apply -f k8s/
# (or imperatively: kubectl set image deployment/api api=myrepo/api:v1.1.0)

# 3. Watch the rollout
kubectl rollout status deployment/api

# 4. If something goes wrong
kubectl rollout undo deployment/api

# 5. Check logs
kubectl logs -l app=api --tail=100 -f

# 6. Get a shell in a running pod
kubectl exec -it deploy/api -- /bin/sh
```

`kubectl rollout undo` is the most useful of these in production: it rolls back to the previous ReplicaSet instantly, no rebuild or re-deploy needed.
## Namespace isolation
Namespaces are virtual clusters within a cluster. Use them to separate environments or teams:
```bash
kubectl create namespace staging
kubectl apply -f k8s/ -n staging

# Production gets its own namespace
kubectl apply -f k8s/ -n production
```

Resource quotas can be applied per namespace to prevent one team from consuming all cluster resources.
## When Kubernetes is overkill
Kubernetes is not the right answer for every service. The operational complexity is real — you need to understand networking, RBAC, storage classes, cluster upgrades, and monitoring. For a small team or a simple service:
| Situation | Better option |
|---|---|
| Single service, predictable load | Railway, Render, Fly.io |
| Serverless / event-driven | AWS Lambda, Cloud Run |
| Simple containers, no auto-scaling | AWS ECS, Azure Container Apps |
| Early-stage startup, small team | Managed K8s with minimal config (GKE Autopilot, EKS Fargate) |
Kubernetes pays back its complexity when you have multiple services that need to discover each other, variable traffic where auto-scaling saves real money, or independent teams deploying at different cadences. At the ECB and in production DeFi infrastructure, those conditions were all met.
## Why this matters in engineering interviews
The question “how do you deploy to production?” is a signal question in senior backend interviews. “We use Kubernetes” is expected. What interviewers actually want to hear is:
- Why not just Docker Compose? Compose has `restart: always` for basic self-healing, but lacks rolling updates, horizontal auto-scaling, cross-node rescheduling, and service discovery across hosts.
- How do services find each other? CoreDNS resolves Service names to stable ClusterIP addresses. No hardcoded IPs, no service registry to maintain.
- How do you handle secrets? Kubernetes Secrets backed by a secrets manager. Never in environment variables checked into source control.
- How do you deploy without downtime? A `RollingUpdate` strategy with `maxUnavailable: 0` and health checks — new pods receive traffic only after passing readiness probes.
Understanding the reconciliation loop, the object hierarchy (Deployment → ReplicaSet → Pod), and how Services abstract Pod discovery explains 90% of what Kubernetes does day-to-day.