Infra: Python K8s
Python is the dominant language for AI and data work, but the Global Interpreter Lock (GIL) limits CPU-bound parallelism within a single process, and interpreter startup is comparatively slow. Running Python in Kubernetes requires specific tuning to ensure high throughput and fast scaling.
This blueprint outlines a reference architecture for deploying FastAPI or Django services that handle thousands of requests per second.
Architecture
- Application Server (Gunicorn/Uvicorn): You cannot run python app.py in production. You need a process manager that supervises worker processes.
- Horizontal Pod Autoscaler (HPA): Scale pods based on CPU or Custom Metrics (Request Queue Depth).
- Async Drivers: Use Motor (Mongo) or asyncpg (Postgres) to avoid blocking the event loop.
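To see why async drivers matter, the sketch below compares a blocking call with an awaited one under concurrent load. It uses only the standard library: time.sleep and asyncio.sleep stand in for a synchronous and an asynchronous database driver, and the function names are hypothetical, not part of Motor or asyncpg.

```python
import asyncio
import time
from typing import Tuple


async def blocking_query(delay: float) -> str:
    # Stand-in for a synchronous driver call: time.sleep() holds the
    # event loop, so no other coroutine can run until it returns.
    time.sleep(delay)
    return "row"


async def async_query(delay: float) -> str:
    # Stand-in for an async driver call: awaiting yields control, so
    # other requests make progress while this one waits on I/O.
    await asyncio.sleep(delay)
    return "row"


async def timed_gather(query, n: int, delay: float) -> float:
    """Run n simulated queries concurrently and return elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(query(delay) for _ in range(n)))
    return time.perf_counter() - start


def compare(n: int = 5, delay: float = 0.05) -> Tuple[float, float]:
    """Return (blocking_elapsed, async_elapsed) for n concurrent queries."""
    blocking = asyncio.run(timed_gather(blocking_query, n, delay))
    non_blocking = asyncio.run(timed_gather(async_query, n, delay))
    return blocking, non_blocking


if __name__ == "__main__":
    blocking, non_blocking = compare()
    print(f"blocking: {blocking:.2f}s, async: {non_blocking:.2f}s")
```

The blocking variant serializes the five calls, so its total time grows with the number of in-flight requests, while the awaited variant overlaps them and finishes in roughly the time of a single call.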
Use Cases
- Machine Learning Inference: Wrapping PyTorch models in an API.
- Real-time APIs: High-concurrency WebSockets using FastAPI.
- Data Processing: Workers consuming from Kafka/RabbitMQ.
Implementation Guide
We will containerize a FastAPI app with a production-grade Dockerfile.
Prerequisites
- Python 3.10+
- Docker
- Kubernetes Cluster
Step 1: The Code (FastAPI)
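# app/main.py
from fastapi import FastAPI
import uvicorn

app = FastAPI()


@app.get("/")
async def root():
    return {"message": "Hello World"}


# Lightweight endpoint for the Kubernetes liveness probe.
@app.get("/health")
async def health():
    return {"status": "ok"}


# Local development only; in the container, Uvicorn is started from the Dockerfile CMD.
if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)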
Step 2: Optimized Dockerfile
Python images can be huge. We utilize multi-stage builds and pre-compiled wheels.
# Stage 1: Build wheels (gcc is needed only here, for packages with C extensions)
FROM python:3.11-slim AS builder
WORKDIR /app
RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc && \
    rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
# --no-deps assumes requirements.txt already pins every transitive dependency
RUN pip wheel --no-cache-dir --no-deps --wheel-dir /app/wheels -r requirements.txt

# Stage 2: Runtime
FROM python:3.11-slim
WORKDIR /app
# ENV does not carry over between stages, so set it in the runtime image
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1
COPY --from=builder /app/wheels /wheels
COPY --from=builder /app/requirements.txt .
RUN pip install --no-cache-dir /wheels/*
COPY . .

# Run as non-root user
RUN adduser --disabled-password --gecos '' myuser
USER myuser

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "4"]
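The CMD above hardcodes four workers, while the Deployment in Step 3 limits the container to one CPU. One way to keep the two in step is to derive the worker count from the container's CPU limit at startup. The sketch below is a minimal illustration, assuming a cgroup v2 host where the limit is exposed at /sys/fs/cgroup/cpu.max; the helper names are hypothetical. It applies the (2 x cores) + 1 starting point that Gunicorn's documentation recommends, which may be more than an async Uvicorn service needs.

```python
import math
import os
from pathlib import Path
from typing import Optional

# Assumption: cgroup v2, where the CPU limit is exposed in this file.
CGROUP_CPU_MAX = Path("/sys/fs/cgroup/cpu.max")


def parse_cpu_max(text: str) -> Optional[float]:
    """Parse cpu.max content: "<quota> <period>", or "max <period>" if unlimited."""
    quota, period = text.split()
    if quota == "max":
        return None
    return int(quota) / int(period)


def available_cpus() -> float:
    """CPU limit of the container, falling back to the host's core count."""
    try:
        limit = parse_cpu_max(CGROUP_CPU_MAX.read_text())
    except (OSError, ValueError):
        limit = None
    return limit if limit is not None else float(os.cpu_count() or 1)


def recommended_workers(cpus: float) -> int:
    """Apply (2 x cores) + 1, rounding fractional limits up to one core."""
    return 2 * max(1, math.ceil(cpus)) + 1


if __name__ == "__main__":
    print(recommended_workers(available_cpus()))
```

With a 1000m limit this yields 3 workers rather than 4. The result could be passed to the server through its --workers flag or the WEB_CONCURRENCY environment variable, which Uvicorn and Gunicorn read as a default; treat the number as a starting point to validate under load.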
Step 3: Kubernetes Deployment
# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fastapi-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: fastapi-app
  template:
    metadata:
      labels:
        app: fastapi-app
    spec:
      containers:
        - name: app
          image: myregistry/fastapi-app:v1
          ports:
            - containerPort: 8000
          resources:
            limits:
              cpu: "1000m"
              memory: "512Mi"
            requests:
              cpu: "500m"
              memory: "256Mi"
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
            initialDelaySeconds: 10
            periodSeconds: 15
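The Deployment starts with two replicas; the HPA from the architecture section adjusts that number over time. As a rough model of its behaviour, the sketch below implements the scaling rule given in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas x currentMetric / targetMetric), with the default 10% tolerance. The min_replicas and max_replicas defaults are hypothetical values chosen for illustration, and the real controller also applies stabilization windows and readiness handling that are omitted here.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 2,    # hypothetical lower bound
    max_replicas: int = 10,   # hypothetical upper bound
    tolerance: float = 0.1,   # Kubernetes default tolerance
) -> int:
    """Simplified model of the HPA replica calculation."""
    ratio = current_metric / target_metric
    # Within tolerance of the target, the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 2 pods averaging 90% CPU against a 60% target.
    print(desired_replicas(2, current_metric=90, target_metric=60))
```

The same function works for a custom metric such as request queue depth: pass the observed average per pod as current_metric and the per-pod target as target_metric.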
Production Readiness Checklist
[ ] Workers: Set Gunicorn/Uvicorn workers based on CPU cores (workers = (2 x cores) + 1). Inside a pod, count the container's CPU limit, not the node's cores.
[ ] Logging: Use structlog to output logs as JSON for ELK/Datadog ingestion.
[ ] Dependency Pinning: Use pip-tools or Poetry to lock transitive dependencies (requirements.lock).
[ ] Distroless: Consider gcr.io/distroless/python3 for an even smaller, more secure image.
[ ] SIGTERM Handling: Ensure your app handles SIGTERM to close database connections gracefully during rollout.
[ ] Pre-Start Script: Run database migrations (alembic upgrade head) in an initContainer, not the main app container.
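For the SIGTERM item, the sketch below shows the mechanism for a standalone asyncio worker, such as a queue consumer. FakePool is a hypothetical stand-in for a real connection pool, and serve is an illustrative name. Under Uvicorn the server installs its own SIGTERM handler and runs the application's shutdown hooks, so in a FastAPI service this cleanup would normally live in those hooks instead. The signal handler API used here is available on Unix only.

```python
import asyncio
import signal


class FakePool:
    """Hypothetical stand-in for a database connection pool."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


async def serve(pool: FakePool, stop: asyncio.Event) -> None:
    """Run until SIGTERM arrives, then release resources."""
    loop = asyncio.get_running_loop()
    # Kubernetes sends SIGTERM when it begins terminating a pod. Setting
    # the event lets the worker exit its loop instead of being killed
    # when the grace period expires.
    loop.add_signal_handler(signal.SIGTERM, stop.set)
    try:
        await stop.wait()  # a real worker would process messages here
    finally:
        loop.remove_signal_handler(signal.SIGTERM)
        await pool.close()


if __name__ == "__main__":
    # Runs until the process receives SIGTERM (for example, `kill <pid>`).
    asyncio.run(serve(FakePool(), asyncio.Event()))
```

Cleanup has to finish within the pod's terminationGracePeriodSeconds (30 seconds by default), after which Kubernetes sends SIGKILL.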