Scaling Strategies

KanjiIQ is designed to scale from its current single-node deployment to a multi-node, multi-region architecture as traffic grows.

Current State

Metric	Value
Cluster nodes	1 (Hetzner dedicated)
Application replicas	2
Database	Single PostgreSQL instance
Rate limit	~100 req/min per IP

This setup handles current traffic comfortably. The sections below outline the scaling path as demand increases.

Horizontal Pod Autoscaling (HPA)

The first scaling step is adding HPA to automatically adjust replica count based on load:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: jlpt-kanji-hpa
  namespace: jlpt-kanji
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: jlpt-kanji
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80

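Requirements: Metrics Server must be installed (k3s bundles it by default), and every container in the pod must declare CPU and memory resource requests, because Utilization targets are measured as a percentage of the requested amount.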

Impact: The application scales automatically between 2 and 10 replicas based on CPU and memory pressure, with no code changes.
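To make the scaling behaviour concrete, the sketch below applies the replica formula described in the Kubernetes HPA documentation, `ceil(currentReplicas * currentMetric / targetMetric)`, to the CPU and memory targets from the manifest above. It is a simplified model rather than the controller's actual code: the function names are made up for illustration, the 10% tolerance is the Kubernetes default, and details such as stabilization windows, missing metrics, and not-yet-ready pods are ignored.

```python
import math


def desired_replicas(current, utilization, target,
                     min_replicas=2, max_replicas=10, tolerance=0.1):
    """Replica count proposed for a single metric (simplified model)."""
    ratio = utilization / target
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current count.
        desired = current
    else:
        desired = math.ceil(current * ratio)
    # Clamp to the minReplicas/maxReplicas bounds from the manifest.
    return max(min_replicas, min(max_replicas, desired))


def hpa_decision(current, cpu_pct, mem_pct):
    """With several metrics, the largest proposal wins."""
    return max(
        desired_replicas(current, cpu_pct, target=70),
        desired_replicas(current, mem_pct, target=80),
    )


print(hpa_decision(2, cpu_pct=140, mem_pct=40))  # CPU at twice its target
print(hpa_decision(8, cpu_pct=200, mem_pct=50))  # capped by maxReplicas
```

Because the larger proposal wins, a memory-heavy pod can hold the replica count up even when CPU is idle, which is worth keeping in mind when choosing the two targets.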

Multi-Node Cluster

When a single node reaches resource limits, add worker nodes to the k3s cluster:

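# On the new worker node. Replace "master" with the server's hostname or IP;
# <node-token> is stored on the server at /var/lib/rancher/k3s/server/node-token.
curl -sfL https://get.k3s.io | K3S_URL=https://master:6443 \
  K3S_TOKEN=<node-token> sh -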

Kubernetes schedules new pods across all available nodes; pods that are already running stay where they are until they are restarted or rescheduled. The application requires no changes because it is already stateless.

Pod Anti-Affinity (Optional)

To spread replicas across nodes, add a pod anti-affinity rule to the Deployment's pod template:

spec:  # pod template spec, i.e. spec.template.spec in the Deployment
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                app: jlpt-kanji
            topologyKey: kubernetes.io/hostname

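Because the rule is preferred rather than required, the scheduler places replicas on different nodes when it can, which improves fault tolerance, but still schedules them together if only one node has capacity.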

Database Scaling

Connection Pooling

Add PgBouncer as a sidecar container to pool database connections:

containers:
  - name: pgbouncer
    image: edoburu/pgbouncer
    ports:
      - containerPort: 6432
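    env:
      # Set the listen port explicitly so it matches containerPort
      # (assumption: the image's default port may differ from 6432).
      - name: LISTEN_PORT
        value: "6432"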
      - name: DATABASE_URL
        valueFrom:
          secretKeyRef:
            name: jlpt-kanji-secrets
            key: database-url

Because sidecar containers share the pod's network namespace, the backend connects to PgBouncer on localhost:6432 instead of to PostgreSQL directly. This reduces connection overhead.
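One point worth checking before rolling this out: with a sidecar, each pod gets its own pool, so the number of connections PostgreSQL sees still grows with the replica count. The sketch below does that arithmetic. The per-pod figures (25 direct connections, 8 pooled) are hypothetical, and the limits of 100 total and 3 reserved connections are PostgreSQL's usual defaults, not values taken from this deployment.

```python
def server_connections(replicas: int, per_pod: int) -> int:
    """Connections PostgreSQL sees when every pod opens up to `per_pod`."""
    return replicas * per_pod


def max_replicas(per_pod: int, max_connections: int = 100, reserved: int = 3) -> int:
    """Largest replica count that stays within PostgreSQL's connection limit."""
    return (max_connections - reserved) // per_pod


# Hypothetical figures: 25 direct connections per backend pod versus a
# PgBouncer sidecar capped at 8 server connections per pod.
direct = server_connections(replicas=10, per_pod=25)
pooled = server_connections(replicas=10, per_pod=8)

print(f"direct: {direct} connections, fits up to {max_replicas(25)} replicas")
print(f"pooled: {pooled} connections, fits up to {max_replicas(8)} replicas")
```

Under these assumptions, direct connections would exhaust the database well before the HPA ceiling of 10 replicas, while the pooled setup stays within the limit.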

Read Replicas

For read-heavy workloads (flashcard queries), add PostgreSQL streaming replication:

Primary instance handles writes (study sessions, quiz results)
Read replicas handle reads (kanji/vocabulary queries)
Backend routes queries based on operation type
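The routing step could look roughly like the sketch below. It is an illustration only: the backend's real language and data layer are not described here, and the `QueryRouter` class, the DSN strings, and the table names are all hypothetical. Plain `SELECT` statements outside a transaction go to a replica in round-robin order; everything else goes to the primary. Streaming replicas can lag slightly behind the primary, so reads that must see a just-committed write should also be sent to the primary.

```python
import itertools


class QueryRouter:
    """Chooses a connection string based on the kind of SQL statement."""

    def __init__(self, primary_dsn, replica_dsns):
        self.primary_dsn = primary_dsn
        # Rotate through replicas; None means "no replicas configured".
        self._replicas = itertools.cycle(replica_dsns) if replica_dsns else None

    def dsn_for(self, sql, in_transaction=False):
        tokens = sql.split()
        first = tokens[0].upper() if tokens else ""
        # Only plain SELECTs are treated as reads; locking reads stay on the primary.
        is_read = first == "SELECT" and "FOR UPDATE" not in sql.upper()
        if in_transaction or not is_read or self._replicas is None:
            return self.primary_dsn
        return next(self._replicas)


router = QueryRouter(
    primary_dsn="postgresql://pg-primary/kanji",   # hypothetical hosts
    replica_dsns=["postgresql://pg-replica-1/kanji",
                  "postgresql://pg-replica-2/kanji"],
)
print(router.dsn_for("SELECT * FROM kanji WHERE level = 'N5'"))
print(router.dsn_for("INSERT INTO quiz_results (score) VALUES (9)"))
```

Keyword matching is deliberately conservative here; an ORM or driver-level read/write split would be a more robust choice if the backend's stack offers one.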

Managed Database

The simplest database scaling path is migrating to a managed service:

AWS RDS: Multi-AZ, automated backups, read replicas
GCP Cloud SQL: HA configuration, automatic failover
Hetzner Managed PostgreSQL: When available

See Portability for migration details.

CDN Layer

Static frontend assets can be served through a CDN for global performance:

graph LR
    U[User] --> CF[Cloudflare CDN]
    CF -->|Cache HIT| U
    CF -->|Cache MISS| N[Nginx Frontend]
    N --> CF

Because the Flutter Web frontend builds to static files (JS, CSS, images), it is a good fit for a CDN:

Cache policy: 1 year for hashed assets, no-cache for index.html
Global PoPs: Content served from the nearest edge location
DDoS protection: CDN absorbs volumetric attacks

Cloudflare DNS is already in place, so enabling proxy mode on the DNS record activates the CDN layer.
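The cache policy above depends on the origin sending the right `Cache-Control` header for each path. The sketch below expresses that decision as a small function. The filename pattern (a hex hash of at least eight characters before the extension, as in `main.3f9a1c2b.js`) is an assumed naming convention, not something confirmed for this build, and falling back to `no-cache` for unhashed files is a conservative choice made for this example.

```python
import re

# Assumed convention: "<name>.<hex hash>.<ext>", e.g. main.3f9a1c2b.js
HASHED_ASSET = re.compile(r"\.[0-9a-f]{8,}\.(js|css|png|jpg|svg|woff2|wasm)$")

ONE_YEAR = 365 * 24 * 60 * 60  # seconds


def cache_control(path: str) -> str:
    """Return the Cache-Control header value for a request path."""
    if path in ("/", "/index.html"):
        # The entry point must always be revalidated so new releases are picked up.
        return "no-cache"
    if HASHED_ASSET.search(path):
        # Content-hashed files never change, so they can be cached for a year.
        return f"public, max-age={ONE_YEAR}, immutable"
    # Unhashed files may change in place; revalidate them on every request.
    return "no-cache"


for p in ("/index.html", "/assets/main.3f9a1c2b.js", "/main.dart.js"):
    print(p, "->", cache_control(p))
```

In this deployment the same rules would more likely live in the Nginx configuration than in application code; the function is only meant to pin down which paths get which policy.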

Scaling Roadmap

Traffic Level	Infrastructure	Key Changes
Current (low)	1 node, 2 replicas, single PG	None needed
Growing (moderate)	1 node, HPA (2-10 replicas)	Add HPA manifest
High	2-3 nodes, HPA, PgBouncer	Add worker nodes + connection pooling
Very High	Multi-node, managed DB, CDN	Migrate DB to RDS/Cloud SQL, enable CDN
Global	Multi-region clusters, read replicas	Significant architecture evolution

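Each step is incremental and none requires a rewrite. Through the managed-database and CDN steps, the changes are limited to manifests and configuration such as the database connection string. Read replicas are the exception, because the backend must route queries by operation type.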