Scaling Strategies
KanjiIQ is designed to scale from its current single-node deployment to a multi-node, multi-region architecture as traffic grows.
Current State
| Metric | Value |
|---|---|
| Cluster nodes | 1 (Hetzner dedicated) |
| Application replicas | 2 |
| Database | Single PostgreSQL instance |
| Rate limit (per IP) | ~100 req/min |
This handles the current traffic comfortably. The sections below outline the scaling path as demand increases.
Horizontal Pod Autoscaling (HPA)
The first scaling step is adding HPA to automatically adjust replica count based on load:
    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: jlpt-kanji-hpa
      namespace: jlpt-kanji
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: jlpt-kanji
      minReplicas: 2
      maxReplicas: 10
      metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 70
        - type: Resource
          resource:
            name: memory
            target:
              type: Utilization
              averageUtilization: 80
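Requirements: Metrics Server must be installed (k3s bundles it by default), and the Deployment's containers must declare CPU and memory resource requests, because utilization targets are measured as a percentage of the requested amount.
Impact: The application scales automatically between 2 and 10 replicas based on CPU and memory pressure, with no code changes.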
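To make the manifest's targets concrete, the sketch below reproduces the replica calculation described in the Kubernetes HPA documentation: desired = ceil(current × currentUtilization / targetUtilization), skipped when the ratio is within a 10% tolerance, with the largest result across metrics winning. The function names are hypothetical, and the sketch ignores stabilization windows and pods that are not ready, so treat it as an approximation rather than the controller's exact behavior.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=2, max_replicas=10, tolerance=0.1):
    """Replica count suggested by one metric, clamped to the HPA bounds."""
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Within tolerance of the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


def hpa_decision(current_replicas, cpu_pct, mem_pct):
    """Combine the CPU (70%) and memory (80%) targets from the manifest."""
    return max(
        desired_replicas(current_replicas, cpu_pct, 70),
        desired_replicas(current_replicas, mem_pct, 80),
    )


if __name__ == "__main__":
    # 2 replicas at 140% average CPU utilization relative to requests.
    print(hpa_decision(2, cpu_pct=140, mem_pct=40))
```

Because the largest per-metric result wins, sustained memory pressure alone is enough to hold the replica count up even when CPU is idle.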
Multi-Node Cluster
When a single node reaches resource limits, add worker nodes to the k3s cluster:
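    # On the new worker node.
    # "master" stands for the hostname or IP of the existing k3s server;
    # the token is stored on that server in /var/lib/rancher/k3s/server/node-token.
    curl -sfL https://get.k3s.io | K3S_URL=https://master:6443 \
      K3S_TOKEN=<node-token> sh -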
Kubernetes schedules new pods across all available nodes; pods that are already running stay on their current node until they are restarted or rescheduled. The application requires no changes because it is already stateless.
Pod Anti-Affinity (Optional)
To spread replicas across nodes, add a pod anti-affinity rule to the pod spec in the Deployment's pod template:
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: jlpt-kanji
                topologyKey: kubernetes.io/hostname
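This spreads replicas across different nodes for better fault tolerance. Because the rule is preferred rather than required, the scheduler still places replicas on the same node when no other node has capacity.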
Database Scaling
Connection Pooling
Add PgBouncer as a sidecar container to pool database connections:
    containers:
      - name: pgbouncer
        image: edoburu/pgbouncer
        ports:
          - containerPort: 6432
        env:
          - name: DATABASE_URL
            valueFrom:
              secretKeyRef:
                name: jlpt-kanji-secrets
                key: database-url
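The backend connects to PgBouncer instead of PostgreSQL directly, reducing connection overhead. Because a sidecar shares the pod's network namespace, the backend reaches it at localhost:6432, provided PgBouncer is configured to listen on that port. Each replica runs its own pooler, so the total number of PostgreSQL connections still grows with the replica count.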
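A quick worked example shows how the sidecar layout interacts with autoscaling. The sketch assumes three stock defaults that should be checked against the actual configuration: PgBouncer's default_pool_size of 20, PostgreSQL's max_connections of 100, and 3 connections held back by superuser_reserved_connections. It also assumes a single database/user pair per pooler. The function names are hypothetical.

```python
def server_connections(replicas: int, pool_size: int) -> int:
    """Upper bound on PostgreSQL connections opened by per-pod poolers."""
    return replicas * pool_size


def max_safe_pool_size(max_replicas: int, max_connections: int,
                       reserved: int = 3) -> int:
    """Largest per-sidecar pool that still fits at the HPA's maxReplicas."""
    return (max_connections - reserved) // max_replicas


if __name__ == "__main__":
    print(server_connections(replicas=2, pool_size=20))    # current deployment
    print(server_connections(replicas=10, pool_size=20))   # HPA at maxReplicas
    print(max_safe_pool_size(max_replicas=10, max_connections=100))
```

Under these assumptions, two replicas stay well inside the limit, but ten replicas with the default pool size would request up to 200 connections against a limit of 100. Lowering the per-sidecar pool size or raising max_connections closes that gap.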
Read Replicas
For read-heavy workloads (flashcard queries), add PostgreSQL streaming replication:
- Primary instance handles writes (study sessions, quiz results)
- Read replicas handle reads (kanji/vocabulary queries)
- Backend routes queries based on operation type
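The routing step in the list above could look roughly like the following. This is a minimal sketch, not the backend's actual code: the QueryRouter class, the connection strings, and the table names are all hypothetical, and classifying a query by its leading keyword is a deliberate simplification. Anything that is not a plain SELECT outside a transaction goes to the primary, which is the safe default.

```python
import itertools


class QueryRouter:
    """Pick a connection string per query: replicas for reads, primary otherwise."""

    def __init__(self, primary_dsn, replica_dsns):
        self.primary_dsn = primary_dsn
        # Round-robin over replicas; None means "no replicas configured".
        self._replicas = itertools.cycle(replica_dsns) if replica_dsns else None

    def dsn_for(self, sql, in_transaction=False):
        statement = sql.lstrip().lower()
        is_plain_read = (
            statement.startswith("select") and "for update" not in statement
        )
        if is_plain_read and not in_transaction and self._replicas is not None:
            return next(self._replicas)
        # Writes, locking reads, and anything inside a transaction use the primary.
        return self.primary_dsn


if __name__ == "__main__":
    router = QueryRouter("postgres://primary/app",
                         ["postgres://replica-1/app", "postgres://replica-2/app"])
    print(router.dsn_for("SELECT * FROM kanji WHERE level = 5"))
    print(router.dsn_for("INSERT INTO quiz_results (score) VALUES (90)"))
```

Streaming replicas can lag behind the primary, so a read that must see the user's own most recent write (for example, a quiz result just submitted) is better sent to the primary as well.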
Managed Database
The simplest database scaling path is migrating to a managed service:
- AWS RDS: Multi-AZ, automated backups, read replicas
- GCP Cloud SQL: HA configuration, automatic failover
- Hetzner Managed PostgreSQL: When available
See Portability for migration details.
CDN Layer
Static frontend assets can be served through a CDN for global performance:
    graph LR
      U[User] --> CF[Cloudflare CDN]
      CF -->|Cache HIT| U
      CF -->|Cache MISS| N[Nginx Frontend]
      N --> CF
Since the Flutter Web frontend produces static files (JS, CSS, images), these are ideal CDN candidates:
- Cache policy: 1 year for hashed assets, no-cache for index.html
- Global PoPs: Content served from the nearest edge location
- DDoS protection: CDN absorbs volumetric attacks
Cloudflare DNS is already in place, so enabling proxy mode on the site's DNS records activates the CDN layer.
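The cache policy described above can be expressed as a small mapping from request path to Cache-Control header. This is an illustrative sketch only: the filename pattern for hashed assets (a hex fingerprint of at least 8 characters before the extension) and the one-hour fallback for everything else are assumptions, not something the current Nginx configuration is known to use.

```python
import re

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # 31536000

# Assumed naming scheme for fingerprinted assets, e.g. app.3f9a1c2b.js
HASHED_ASSET = re.compile(r"\.[0-9a-f]{8,}\.[a-z0-9]+$")


def cache_control(path: str) -> str:
    """Return the Cache-Control header value for a static frontend path."""
    if path in ("/", "/index.html"):
        # Always revalidate the entry point so new releases are picked up.
        return "no-cache"
    if HASHED_ASSET.search(path):
        # The content hash changes with the content, so long caching is safe.
        return f"public, max-age={ONE_YEAR_SECONDS}, immutable"
    # Assumed fallback for assets without a fingerprint in their name.
    return "public, max-age=3600"


if __name__ == "__main__":
    for p in ("/index.html", "/assets/app.3f9a1c2b.js", "/main.dart.js"):
        print(p, "->", cache_control(p))
```

Note that no-cache does not forbid storing the file; it requires the cache to revalidate with the origin before reusing it. That keeps index.html fresh while the long-lived assets it references stay cached at the edge.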
Scaling Roadmap
| Traffic Level | Infrastructure | Key Changes |
|---|---|---|
| Current (low) | 1 node, 2 replicas, single PG | None needed |
| Growing (moderate) | 1 node, HPA (2-10 replicas) | Add HPA manifest |
| High | 2-3 nodes, HPA, PgBouncer | Add worker nodes + connection pooling |
| Very High | Multi-node, managed DB, CDN | Migrate DB to RDS/Cloud SQL, enable CDN |
| Global | Multi-region clusters, read replicas | Significant architecture evolution |
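Each step is incremental and requires no rewrite. The application code remains unchanged through the HPA, multi-node, connection-pooling, managed-database, and CDN steps. The exception is read replicas, where the backend must route queries by operation type.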