Why monolith — and why this infrastructure shape

This is the foundational architectural decision for KA26. It records the system shape we run today, the reasoning behind it, and the explicit trigger conditions that would make us revisit it. It exists so that 6 months from now, when someone asks "why aren't we on microservices like everyone else?" — there's a single, considered answer.

The decision

KA26 is a pragmatic monolith plus three native mobile apps:

  • Single Next.js 15 application — handles all consumer web, admin web, seller web, doctor web, delivery web, AND every backend API route. One codebase, one deploy, one Cloud Run service.
  • Three separate Expo / React Native projects — mobile/ (consumer), mobile-seller/, mobile-doctor/. Distinct package IDs, separate APK builds, zero cross-imports.
  • Single Postgres database — Cloud SQL school-db instance, ~85 Prisma models, all reads and writes go through one schema.
  • Single GCP region — us-central1. Cloud Run + Cloud SQL + GCS bucket + Cloud Build all live there.
  • Single repository — sidgk/ka26-marketplace contains the Next.js app, all three mobile apps, the Prisma schema, the test suites, and the deployment scripts.

There is intentionally no microservices split, no service mesh, no multi-region setup, no Kubernetes cluster, no event-sourcing layer, no separate "API gateway" service.

Why this is the right shape for KA26 today

1. We have one engineer

The single biggest factor. With one engineer (the founder) building everything, every architectural choice has to optimize for iteration speed and debugging ease, not theoretical scalability.

A microservices architecture for KA26 would:

  • Triple deployment complexity (each service needs its own pipeline)
  • Multiply infrastructure cost 5–10× (each service needs its own Cloud Run instance, DB connection pool, and monitoring)
  • Slow iteration by 3–5× (cross-service coordination on every feature)
  • Add failure modes (network calls between services can fail; in-process function calls cannot)
  • Require DevOps expertise the team doesn't have time to acquire

A monolith means: change a backend route + the consumer screen that uses it in the same commit, push once, ship in 9 minutes via the existing CI/CD. No coordination overhead.

2. Our scale is small (and will stay small for a long time)

Gadag has ~50,000 households. Even at 100% adoption (the year-end goal), that's ~50,000 daily active users. Cloud Run + a single Postgres instance handles that comfortably.

For context:

  • Cloud Run autoscales to thousands of instances; we currently run with min-instances=0 and have never exceeded a handful of concurrent instances
  • Postgres on db-g1-small handles ~500 queries/sec sustained — far above what 50k DAU produces
  • Our entire monthly GCP bill is under $80 (April 2026)
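
A rough sanity check on that headroom, assuming ~20 API requests per active user per day and 3–4 DB queries per request (assumptions, not measurements): 50,000 DAU × 20 requests ≈ 1M requests/day ≈ 12 requests/sec on average. Even a 5× evening peak is only ~60 requests/sec, i.e. roughly 180–240 queries/sec, still comfortably under the ~500 queries/sec figure above.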

The classic monolith failure modes — deployment bottlenecks, team coordination paralysis, scaling ceilings — only kick in at much larger scale. We are nowhere near them.

3. The boundaries that matter ARE enforced

A common (correct) critique of monoliths: "they tend to become tangled balls of mud where everything depends on everything." We've explicitly enforced the most important boundaries:

  • App-isolation rule: 4 separate auth contexts (consumer, seller, doctor, delivery). They never cross-import. Pinned by 103 tests in tests/app-isolation.test.ts. A consumer endpoint cannot accidentally use seller auth and vice versa (see the sketch after this list).
  • Per-feature folders under src/app/api/: every domain (orders, offers, requests, reels, doctor, etc.) has its own route folder. Easy to find code, easy to refactor.
  • src/lib/ discipline: shared business logic centralized into named modules (negotiation.ts, ai-search.ts, track.ts, email.ts, push.ts). Single source of truth per domain.
  • Mobile apps are physically separate codebases: mobile/, mobile-seller/, mobile-doctor/ each with own package.json, own node_modules, own design system.
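
To make the app-isolation rule concrete, here is a minimal sketch of what one such test can look like. The real suite has 103 tests; the directory layout, glob pattern, and auth module path below are illustrative assumptions, not the actual KA26 paths.

```typescript
// tests/app-isolation.test.ts (sketch): consumer API routes must never
// import seller auth. Paths and patterns are illustrative assumptions.
import { describe, expect, it } from "vitest";
import { readFileSync } from "node:fs";
import { globSync } from "glob";

describe("app isolation: consumer vs seller", () => {
  it("consumer routes never import seller auth", () => {
    // Scan every consumer route file for a cross-context import.
    for (const file of globSync("src/app/api/consumer/**/*.ts")) {
      const source = readFileSync(file, "utf8");
      expect(source, `${file} imports seller auth`).not.toMatch(
        /from ["'].*auth\/seller/
      );
    }
  });
});
```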

This gives us most of the modularity benefits of microservices (clear boundaries, single ownership, independent reasoning) without paying the distributed-systems tax.

4. The full set of features being built is broad

KA26 covers shop, ads, bidding, voice, reels, requests, orders, delivery, doctor (deprioritized), admin, and seller flows. A monolith makes cross-feature work trivial — when a bid converts to an order, no inter-service contract has to be negotiated; the same Prisma transaction handles both.
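
As an illustration, a bid-to-order conversion can be a single atomic Prisma transaction. This is a sketch only: the model and field names below are assumptions, not the real KA26 schema.

```typescript
// Sketch: convert an accepted bid into an order in one atomic transaction.
// Model and field names are illustrative, not the real KA26 schema.
import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

async function convertBidToOrder(bidId: string) {
  // One in-process transaction replaces what a service split would turn
  // into a published contract, an event bus, and retry semantics.
  return prisma.$transaction(async (tx) => {
    const bid = await tx.bid.update({
      where: { id: bidId },
      data: { status: "ACCEPTED" },
    });
    return tx.order.create({
      data: {
        buyerId: bid.buyerId,
        sellerId: bid.sellerId,
        totalPaise: bid.amountPaise,
        bidId: bid.id,
      },
    });
  });
}
```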

In a microservices world, the bidding service and the orders service would need a published contract, an event bus, idempotency tokens, retry semantics, and integration tests across both. That's the right cost when you have 50 engineers across 5 teams. It's an absurd cost when you have one.

Real-world precedents

Most engineering blogs glamorize microservices. The reality at well-known companies:

| Company | Reality |
| --- | --- |
| Stripe | Ran as a Ruby monolith for 5 years before extracting any services. Processed billions in payments on it. |
| Shopify | Still a Ruby monolith — one of the largest in the world. Serves ~10% of all e-commerce. |
| Instagram | Django monolith with a billion users. Only recently extracted some services. |
| Basecamp / HEY | Proudly monolithic Ruby. ~70 engineers. Famously productive. |
| GitHub | Ruby monolith for years. Still mostly is. |
| Google | Massive monorepo (single repo, billions of lines). Service-oriented internally but a unified codebase. |
| Apple | iOS frameworks are massive monoliths. The services side is microservices, but each service is BIG, not the Netflix-style fine-grained kind. |
| Anthropic / OpenAI | Modular Python services with shared libraries. Not "microservices" in the Conway sense. |
| Banks | Mostly legacy COBOL mainframes wrapped in REST/GraphQL. Microservices only for greenfield. Slower to ship than you'd think. |

The pattern: successful companies start as monoliths and extract services only when forced to by team size or scale. The companies that started as microservices are mostly the ones that died from operational complexity before reaching product-market fit.

When we'd revisit

We will reconsider this architecture when any one of these triggers fires:

| Trigger | What it implies |
| --- | --- |
| 5+ engineers | Multiple teams need to deploy independently without stepping on each other |
| 10,000+ DAU sustained | Single Cloud Run instance starts hitting cold-start issues; Postgres write contention shows up on hot tables |
| Genuinely different scaling characteristics | E.g. video transcoding (CPU-intensive, batch) vs API serving (request-response, low CPU) — these benefit from different infra |
| Compliance isolation requirement | E.g. healthcare data must run in a separate process per HIPAA-equivalent regulation |
| Per-feature SLO requirements | E.g. payments must have 99.99% uptime guarantees that other features don't need |

We are at: 1 engineer, ~100 DAU, uniform load, no compliance separation, no SLO contracts. All five triggers are far away. Probably 12+ months out. The architecture is correct.

What about the cost question

April 2026 GCP bill: ~$74/month. Honest breakdown of probable composition:

| Service | Estimated monthly cost | What it does |
| --- | --- | --- |
| Cloud SQL (Postgres) | $30–50 | Always-on; biggest line item |
| Cloud Run | $5–15 | Pay-per-request; scales to zero |
| Cloud Storage (GCS) | $1–5 | Images, APKs, audio recordings |
| Cloud Build | $5–10 | Per-deploy build minutes |
| Logging / Monitoring | $1–5 | Mostly free tier |
| Networking egress | $0–5 | Free first 1 TB/month |

This is genuinely cheap for what we run — a hobby app on Heroku costs $25/mo and gives 1/10 the infrastructure.

At 5,000 DAU we project ~$200/mo. At 50,000 DAU ~$1,500–3,000/mo. All reasonable. Cost is NOT a reason to refactor architecture today.

Cost optimization knobs (if it ever matters)

In rough order of impact:

  1. Cloud SQL is the biggest line. Could downsize to db-f1-micro (~$10) until traffic grows. Risk: tighter memory means more swapping, which means worse query latency. Worth it only if cost becomes acute, not as general practice.
  2. Cloud Run already runs min-instances=0 — pays nothing when idle. Can't optimize further without sacrificing cold-start latency.
  3. GCS lifecycle policy — already in place; auto-deletes build artifacts older than 7 days (a sketch of such a rule follows this list).
  4. Cloud Build → GitHub Actions for the Docker build. GitHub Actions is free for public repos, generous for private. Saves the Cloud Build minutes line item.
  5. Cloud SQL → CockroachDB Serverless — free tier handles low traffic. Loses some Postgres feature parity. Only worth it if cost truly hurts.
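
For reference, knob 3 is a one-time configuration. Here is a minimal sketch of setting such a rule with the official Node client (@google-cloud/storage); the bucket name is an illustrative assumption, not the real KA26 bucket.

```typescript
// Sketch: auto-delete objects older than 7 days, matching the lifecycle
// policy described above. The bucket name is an illustrative assumption.
import { Storage } from "@google-cloud/storage";

const storage = new Storage();

async function setBuildArtifactLifecycle() {
  await storage.bucket("ka26-build-artifacts").addLifecycleRule({
    action: { type: "Delete" },
    condition: { age: 7 }, // days since object creation
  });
}
```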

None of these are worth the engineering time today. Cost is fine.

What we explicitly DON'T have, and why

These are deliberate omissions, not oversights:

| Thing we don't have | Why |
| --- | --- |
| Microservices split | See above — wrong tradeoff at our scale |
| Kubernetes | Cloud Run handles autoscaling without the operational complexity |
| Service mesh (Istio/Linkerd) | Solves a problem we don't have (we have one service) |
| Multi-region active-active | A single-region GCP outage means downtime, but the cost and complexity of multi-region is 10× the value at this scale |
| Event sourcing / CQRS | Complexity that pays off mainly in write-heavy systems. We're read-heavy. |
| GraphQL | REST + Prisma is enough. GraphQL adds tooling cost (codegen, schemas, query complexity limits) that doesn't pay off until 10+ frontend consumers exist |
| Redis caching layer | Postgres handles current load fine. Will add when the first query starts to bottleneck. |
| Background job queue | Most async work is fast enough to do inline. Cart-abandonment and similar use GitHub Actions cron. Will add BullMQ when the first long-running job appears. |
| Staging environment | This one IS a gap; planned for the week of 2026-05-04. The single biggest risk reduction we'll do this month. |
| Per-user feature flags | We have per-env flags (PAYMENTS_ONLINE_ENABLED, etc.), which work for global on/off. Will add per-user flags when we need 10% rollout testing (a minimal sketch follows this table). |
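
The per-env flag pattern in the last row is deliberately minimal. A sketch (the helper name is illustrative; PAYMENTS_ONLINE_ENABLED is the flag named above):

```typescript
// Sketch of the per-env feature flag pattern. The helper name is
// illustrative; a flag is a plain environment variable read at runtime.
export function flagEnabled(name: string): boolean {
  return process.env[name] === "true";
}

// Example: gate the online payments path globally, per environment.
export function paymentsOnline(): boolean {
  return flagEnabled("PAYMENTS_ONLINE_ENABLED");
}
```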

Anti-patterns we've explicitly avoided

Problems we've avoided by NOT building:

  • No premature service splits. Every "we should have a separate service for X" instinct has been deferred. Modules in src/lib/ give us 90% of the benefit at 5% of the cost.
  • No premature performance optimization. No caching, no read replicas, no query result memoization. Will add when first slow query is identified, not before.
  • No premature consistency layers. No Kafka, no event bus, no async messaging. Database transactions handle our consistency needs.
  • No premature abstraction layers. Routes call Prisma directly (see the sketch after this list). Will add a service/repository layer if 5+ engineers ever need to share business logic (unlikely for 12+ months).
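
What "routes call Prisma directly" means in practice, as a sketch; the route path and fields are illustrative assumptions, not the real KA26 code:

```typescript
// src/app/api/orders/route.ts (sketch): no repository/service layer,
// the route handler queries Prisma directly. Fields are illustrative.
import { NextResponse } from "next/server";
import { PrismaClient } from "@prisma/client";

// In a real Next.js app this is usually a shared module-level singleton.
const prisma = new PrismaClient();

export async function GET() {
  const orders = await prisma.order.findMany({
    orderBy: { createdAt: "desc" },
    take: 20,
  });
  return NextResponse.json(orders);
}
```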

How this ties back to product velocity

The architecture exists to serve the product. KA26 is in pre-product-market-fit. The single most important property of the system is: how fast can the founder ship and iterate.

Pragmatic monolith optimizes for that. Every layer we don't have is a layer we don't have to debug, deploy, or document. Every premature abstraction we don't ship is engineering time spent on actual user-facing features.

The 30-day focus principle (set 2026-05-01): no new feature surfaces, deepen what exists, ship the killer "X" use-case. The architecture already supports this. We won't change it during this focus window.

Reading list — internal cross-references

TL;DR

We have a pragmatic monolith because we have one engineer, ~100 DAU, and a 12-month window before any of the standard "you need microservices now" triggers fire. It's the right shape for our stage. Every successful consumer marketplace started this way. The infrastructure costs ~$74/month and will scale linearly to maybe $2k/month at 50k DAU — all reasonable. We'll revisit when 5+ engineers join, when we hit 10k+ DAU, or when a feature genuinely needs different scaling characteristics from the rest. None of these is close. Architecture is correct. Stop debating it; ship the product.