Infra: Python K8s

Architecture Visual

graph TD
    classDef actor fill:#fef3c7,stroke:#d97706,stroke-width:2px,color:#000
    classDef gateway fill:#e0f2fe,stroke:#0284c7,stroke-width:2px,color:#000
    classDef network fill:#f0f9ff,stroke:#0ea5e9,stroke-width:1px,stroke-dasharray: 5 5,color:#000
    classDef service fill:#dbeafe,stroke:#2563eb,stroke-width:2px,color:#000
    classDef database fill:#dcfce7,stroke:#16a34a,stroke-width:2px,color:#000
    classDef function fill:#f3e8ff,stroke:#9333ea,stroke-width:2px,color:#000

    subgraph cluster ["Kubernetes Cluster"]
        direction TB
        ingress["K8s Ingress (NGINX)"]:::gateway
        app_service["Python Service (K8s Service)"]:::network
        app_pods["Python App (FastAPI/Gunicorn)"]:::service
    end

    user["User / Client"]:::actor
    db["Managed PostgreSQL"]:::database
    storage["S3 Compatible Storage"]:::database

    ingress --> app_service
    app_service --> app_pods
    app_pods --> db
    app_pods --> storage

Python is the dominant language for AI and data work, but the Global Interpreter Lock (GIL) limits CPU-bound parallelism within a single process, and interpreter startup is slow. Running Python in Kubernetes therefore requires specific tuning to ensure high throughput and fast scaling.

This blueprint outlines a reference architecture for deploying FastAPI or Django services that handle thousands of requests per second.

Architecture

  • Application Server (Gunicorn/Uvicorn): Do not run python app.py in production. Use a process manager that supervises several worker processes, since each process is limited by its own GIL.
  • Horizontal Pod Autoscaler (HPA): Scale pods on CPU utilization or on custom metrics such as request queue depth.
  • Async Drivers: Use Motor (MongoDB) or asyncpg (PostgreSQL) to avoid blocking the event loop.
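
To illustrate why blocking calls matter on an event loop, here is a minimal sketch using only the standard library. It assumes time.sleep as a stand-in for a synchronous database driver call and asyncio.sleep as a stand-in for an async driver call; the handler names are hypothetical and no real driver is involved.

```python
import asyncio
import time


async def blocking_handler(delay: float) -> None:
    # time.sleep stands in for a synchronous driver call; it holds the event loop.
    time.sleep(delay)


async def async_handler(delay: float) -> None:
    # asyncio.sleep stands in for an async driver call; it yields to other tasks.
    await asyncio.sleep(delay)


async def timed(handler, requests: int = 5, delay: float = 0.1) -> float:
    """Run `requests` concurrent handlers and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(handler(delay) for _ in range(requests)))
    return time.perf_counter() - start


def compare() -> tuple[float, float]:
    """Return (blocking_seconds, non_blocking_seconds) for the same workload."""
    blocking = asyncio.run(timed(blocking_handler))
    non_blocking = asyncio.run(timed(async_handler))
    return blocking, non_blocking


if __name__ == "__main__":
    blocking, non_blocking = compare()
    print(f"blocking: {blocking:.2f}s, async: {non_blocking:.2f}s")
```

With the blocking handler the five simulated requests run one after another, so the total time is roughly the sum of the delays. With the async handler they overlap, so the total stays close to a single delay.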

Use Cases

  • Machine Learning Inference: Wrapping PyTorch models in an API.
  • Real-time APIs: High-concurrency WebSockets using FastAPI.
  • Data Processing: Workers consuming from Kafka/RabbitMQ.

Implementation Guide

We will containerize a FastAPI app with a production-grade Dockerfile.

Prerequisites

  • Python 3.10+
  • Docker
  • Kubernetes Cluster

Step 1: The Code (FastAPI)

# app/main.py
from fastapi import FastAPI
import uvicorn

app = FastAPI()

@app.get("/")
async def root():
    return {"message": "Hello World"}

@app.get("/health")
async def health():
    return {"status": "ok"}

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)

Step 2: Optimized Dockerfile

Python images can be huge. We utilize multi-stage builds and pre-compiled wheels.

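# Stage 1: Build wheels
FROM python:3.11-slim AS builder

WORKDIR /app

# gcc is only needed to compile packages that ship no pre-built wheel
RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc && \
    rm -rf /var/lib/apt/lists/*

COPY requirements.txt .
# --no-deps assumes requirements.txt pins every transitive dependency
RUN pip wheel --no-cache-dir --no-deps --wheel-dir /app/wheels -r requirements.txt

# Stage 2: Runtime
FROM python:3.11-slim

WORKDIR /app

# Set here because ENV values do not carry over from the builder stage
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1

COPY --from=builder /app/wheels /wheels
COPY --from=builder /app/requirements.txt .

RUN pip install --no-cache-dir /wheels/*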

COPY . .

# Run as non-root user
RUN adduser --disabled-password --gecos '' myuser
USER myuser

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4"]
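
The CMD above hard-codes four workers. A common rule of thumb (from the Gunicorn documentation) is (2 x cores) + 1, but inside a container os.cpu_count() reports the host's cores rather than the pod's CPU limit. The sketch below shows one way to derive a worker count from the cgroup v2 CPU limit. It assumes the limit is exposed at /sys/fs/cgroup/cpu.max, and the function names are hypothetical helpers, not part of Gunicorn or Uvicorn.

```python
import os
from typing import Optional


def parse_cpu_max(text: str) -> Optional[float]:
    """Parse cgroup v2 cpu.max content: "<quota> <period>" or "max <period>"."""
    quota, period = text.split()
    if quota == "max":
        return None  # no CPU limit configured
    return int(quota) / int(period)


def effective_cores(cpu_max_path: str = "/sys/fs/cgroup/cpu.max") -> float:
    """Return the container CPU limit if readable, else the host core count."""
    try:
        with open(cpu_max_path) as fh:
            limit = parse_cpu_max(fh.read())
    except (OSError, ValueError):
        limit = None  # file missing (e.g. cgroup v1 or non-Linux) or malformed
    return limit if limit is not None else float(os.cpu_count() or 1)


def worker_count(cores: float) -> int:
    """Rule of thumb: (2 x cores) + 1, never fewer than one worker."""
    return max(1, int(2 * cores) + 1)


if __name__ == "__main__":
    print(worker_count(effective_cores()))
```

The result could be passed to the server through an environment variable or an entrypoint script. Treat the formula as a starting point: CPU-bound workloads such as model inference often do better with fewer workers per pod and more pods.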

Step 3: Kubernetes Deployment

# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fastapi-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: fastapi-app
  template:
    metadata:
      labels:
        app: fastapi-app
    spec:
      containers:
      - name: app
        image: myregistry/fastapi-app:v1
        ports:
        - containerPort: 8000
        resources:
          limits:
            cpu: "1000m"
            memory: "512Mi"
          requests:
            cpu: "500m"
            memory: "256Mi"
        livenessProbe:
          httpGet:
            path: /health
            port: 8000
          initialDelaySeconds: 10
          periodSeconds: 15
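
The Deployment starts with two replicas, and a Horizontal Pod Autoscaler would adjust that number at runtime. The following sketch reproduces the core scaling formula from the Kubernetes documentation, desired = ceil(current x currentMetric / targetMetric), with the default 10% tolerance. It is a simplification: it ignores min/max replica bounds, stabilization windows, and pods with missing metrics. The 70% target in the usage example is an assumed value, not one taken from this blueprint.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    tolerance: float = 0.1,
) -> int:
    """Approximate the HPA replica calculation for a single metric."""
    ratio = current_metric / target_metric
    # The HPA skips scaling when the ratio is close enough to 1.0.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # Two replicas averaging 90% of requested CPU against an assumed 70% target.
    print(desired_replicas(2, current_metric=90, target_metric=70))
```

CPU utilization for the HPA is measured against the container's CPU request (500m in the manifest above), not its limit, so the request value directly affects when scaling triggers.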

Production Readiness Checklist

[ ] Workers: Set Gunicorn/Uvicorn workers based on CPU cores (workers = (2 x cores) + 1).
[ ] Logging: Use structlog to output logs as JSON for ELK/Datadog ingestion.
[ ] Dependency Pinning: Use pip-tools or Poetry to lock transitive dependencies (requirements.lock).
[ ] Distroless: Consider gcr.io/distroless/python3 for an even smaller, more secure image.
[ ] SIGTERM Handling: Ensure your app handles SIGTERM so it can close database connections gracefully during a rollout.
[ ] Pre-Start Script: Run database migrations (alembic upgrade head) in an initContainer, not the main app container.
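
As an illustration of SIGTERM handling, here is a minimal standard-library sketch for a standalone worker process, such as a queue consumer. The function names and the placeholder comments are hypothetical. For a FastAPI app served by Uvicorn or Gunicorn, the server installs its own SIGTERM handler, so cleanup would normally go in the application's shutdown or lifespan hook instead.

```python
import os
import signal
import threading

# Set by the signal handler, checked by the work loop.
shutdown_requested = threading.Event()


def handle_sigterm(signum, frame) -> None:
    # Only set a flag here; do the real cleanup outside the signal handler.
    shutdown_requested.set()


def install_handlers() -> None:
    """Register the SIGTERM handler (must be called from the main thread)."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def run_worker(poll_seconds: float = 0.1) -> str:
    """Loop until SIGTERM arrives, then release resources and return."""
    while not shutdown_requested.is_set():
        # Placeholder for one unit of work, e.g. handling one queue message.
        shutdown_requested.wait(poll_seconds)
    # Placeholder for closing database connections and flushing buffers.
    return "clean shutdown"


if __name__ == "__main__":
    install_handlers()
    print(f"worker {os.getpid()} running; send SIGTERM to stop")
    print(run_worker())
```

Kubernetes sends SIGTERM when a pod is terminated and follows up with SIGKILL once the termination grace period expires (30 seconds by default), so the cleanup path needs to finish within that window.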

Cloud Cost Estimator

Estimated monthly cost at the MVP (1x) tier. The calculator's other tiers are Startup (5x), Growth (20x), and Scale (100x).

  • Compute Resources: $15
  • Database Storage: $25
  • Load Balancer: $10
  • CDN / Bandwidth: $5

* Estimates vary by provider & region