Zero-downtime migration across three platforms.
We moved a production SaaS from a single platform to a multi-service architecture — Vercel, Fly.io, and Neon — without dropping a request. Here's what broke and how we fixed it.
Outgrowing a single platform.
disposal.space started on Railway, with the web app, API, worker, database, Redis, and job orchestration all in one place. That worked well until the services' requirements diverged beyond what a single platform could provide.
Web app needed edge delivery
Next.js 15 performs best on Vercel — native ISR, edge middleware, and deployment skew protection. Railway didn't offer any of that.
Worker kept crashing the app
PDF parsing and embedding generation consumed up to 8GB of memory. Running these in the same process as the web app caused crashes for all users.
Database needed serverless scaling
Railway's PostgreSQL had no auto-scaling, no point-in-time recovery, and no connection pooling. We needed a database that could scale with demand.
Split, move, verify. Service by service.
We migrated each service independently to the platform that best fit its requirements. Every step was verified before cutting over DNS.
Database → Neon
Migrated PostgreSQL (with pgvector) to Neon's serverless platform in Frankfurt. Point-in-time recovery and auto-scaling compute included.
Backend → Fly.io
Moved the Express API to Fly.io in Stockholm (arn region). The API depends on SSE streaming, Redis-based rate limiting, and persistent connections, all of which are incompatible with serverless.
Worker → Fly.io (8GB)
Isolated the heavy processing worker on its own Fly.io machine with 8GB RAM. PDF parsing no longer risks crashing the web app.
Web → Vercel + DNS cutover
Deployed the Next.js app to Vercel, moved Inngest to Cloud, reconfigured DNS via Cloudflare, and cut over with zero downtime.
Four things we didn't see coming.
Migrations always surface surprises. Here's what broke after going live — and how we fixed each issue within hours.
Cross-region latency spike
Railway's private networking gave sub-millisecond internal calls. Post-migration, every service call crossed the public internet. Redis latency jumped from <1ms to 50ms on every API request. Fix: moved Redis to Fly.io in the same region as the backend.
Inngest signature verification failures
Railway terminates HTTPS at the edge and forwards HTTP internally. The Inngest SDK reconstructed URLs with the wrong protocol, breaking signature verification. Fix: enabled trust proxy in Express.
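To illustrate why the protocol matters, here is a minimal sketch of URL reconstruction behind a TLS-terminating proxy. It is not the Inngest SDK's code; the `RequestLike` shape and both function names are hypothetical, and the logic only mirrors the general idea behind Express's `trust proxy` setting, where `X-Forwarded-Proto` is honored only when the proxy is trusted.

```typescript
// Hypothetical, simplified request shape (not the Express Request type).
interface RequestLike {
  socketEncrypted: boolean; // true only if the app itself terminated TLS
  headers: Record<string, string | undefined>;
  path: string;
}

// Resolve the protocol the client actually used.
// With trustProxy=false the forwarded header is ignored, so a request that
// arrived over HTTPS at the edge looks like plain HTTP to the app.
function resolveProtocol(req: RequestLike, trustProxy: boolean): "http" | "https" {
  const direct = req.socketEncrypted ? "https" : "http";
  if (direct === "https" || !trustProxy) return direct;
  const forwarded = req.headers["x-forwarded-proto"];
  if (!forwarded) return direct;
  // The header may hold a comma-separated chain; the first entry is the client-facing hop.
  const first = (forwarded.split(",")[0] ?? "").trim().toLowerCase();
  return first === "https" ? "https" : "http";
}

// Rebuild the public URL that a signature check would compare against.
function reconstructUrl(req: RequestLike, trustProxy: boolean): string {
  const host = req.headers["host"] ?? "localhost";
  return `${resolveProtocol(req, trustProxy)}://${host}${req.path}`;
}
```

For a request forwarded with `x-forwarded-proto: https`, `reconstructUrl(req, false)` yields an `http://` URL while `reconstructUrl(req, true)` yields `https://`. In an Express app the equivalent switch is `app.set("trust proxy", ...)`; which value is appropriate depends on how many proxies sit in front of the app.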
CloudFront key pair lost
The original private key for CloudFront signed URLs was lost during migration. All CDN file downloads broke. Fix: regenerated PKCS#8 key pair, uploaded to CloudFront, and ensured the Key Pair ID (not Key Group ID) was used.
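The Key Pair ID detail is easier to see in the shape of a signed URL. The sketch below assembles a canned-policy CloudFront signed URL from an already computed signature; the function names and the example values are hypothetical, and the signing step itself (RSA-SHA1 over the policy string with the private key) is left to the caller.

```typescript
// Canned policy that CloudFront expects to have been signed: compact JSON, no whitespace.
function cannedPolicy(resourceUrl: string, expiresEpochSeconds: number): string {
  return JSON.stringify({
    Statement: [
      {
        Resource: resourceUrl,
        Condition: { DateLessThan: { "AWS:EpochTime": expiresEpochSeconds } },
      },
    ],
  });
}

// CloudFront uses a URL-safe base64 variant: '+' -> '-', '=' -> '_', '/' -> '~'.
function toCloudFrontBase64(b64: string): string {
  return b64.replace(/\+/g, "-").replace(/=/g, "_").replace(/\//g, "~");
}

// Assemble the final URL. keyPairId must be the ID of the public key itself,
// not the ID of the key group that contains it.
function buildSignedUrl(
  resourceUrl: string,
  expiresEpochSeconds: number,
  signatureBase64: string, // standard base64 of the RSA-SHA1 signature over cannedPolicy(...)
  keyPairId: string,
): string {
  const separator = resourceUrl.includes("?") ? "&" : "?";
  return (
    `${resourceUrl}${separator}Expires=${expiresEpochSeconds}` +
    `&Signature=${toCloudFrontBase64(signatureBase64)}` +
    `&Key-Pair-Id=${keyPairId}`
  );
}
```

Passing a key group ID as `keyPairId` produces a URL that looks well-formed but that CloudFront rejects, which is why the two IDs are easy to confuse.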
V8 memory crash in text chunking
The text chunking function had an edge case that caused an infinite loop, exhausting V8's memory. Fix: upgraded to Node.js 22, made the chunker advance by a minimum amount on every pass, and capped the number of iterations.
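The two guards can be sketched as follows. This is not the production chunker, and the original edge case is not described here; the sketch assumes one common trigger, an overlap equal to or larger than the chunk size, which makes a naive step zero or negative. `ChunkOptions`, `chunkText`, and the default values are all hypothetical.

```typescript
interface ChunkOptions {
  chunkSize: number; // maximum characters per chunk
  overlap: number; // characters shared between consecutive chunks
  minAdvance?: number; // smallest allowed step between chunk starts (default 1)
  maxIterations?: number; // hard cap on loop passes (default derived from input length)
}

function chunkText(text: string, opts: ChunkOptions): string[] {
  const { chunkSize, overlap } = opts;
  if (chunkSize <= 0) throw new RangeError("chunkSize must be positive");

  const minAdvance = Math.max(1, opts.minAdvance ?? 1);
  // With a step of at least minAdvance, this many passes always suffices.
  const maxIterations = opts.maxIterations ?? Math.ceil(text.length / minAdvance) + 1;

  const chunks: string[] = [];
  let start = 0;
  let iterations = 0;

  while (start < text.length) {
    // Guard 2: fail loudly instead of looping until memory runs out.
    iterations += 1;
    if (iterations > maxIterations) {
      throw new Error(`chunkText exceeded ${maxIterations} iterations`);
    }

    const end = Math.min(start + chunkSize, text.length);
    chunks.push(text.slice(start, end));
    if (end === text.length) break;

    // Guard 1: a naive step of (chunkSize - overlap) is <= 0 when
    // overlap >= chunkSize, so clamp it to a minimum forward advance.
    start += Math.max(chunkSize - overlap, minAdvance);
  }
  return chunks;
}
```

With `chunkSize: 4` and `overlap: 1`, the string `"abcdefghij"` splits into `"abcd"`, `"defg"`, and `"ghij"`. With an overlap larger than the chunk size, the loop still terminates because each pass moves forward by at least one character.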
The result.
A multi-platform architecture where each service runs on the platform best suited to its needs — with better performance, reliability, and cost control than the single-platform setup.
Zero dropped requests during the full migration. DNS cutover was seamless.
Services across 3 platforms — each running where it performs best.
Total infrastructure cost for the full multi-platform setup.
Outgrowing your infrastructure?
We help teams migrate, re-platform, and scale their infrastructure without disrupting users. Let's figure out the right architecture for your stage.