Why Node.js Scales Poorly by Default
Node.js runs JavaScript on a single thread driven by an event loop. CPU-intensive tasks block that loop, stalling every request behind them. Memory leaks accumulate in a long-lived process. Callback hell becomes promise hell. Without deliberate architecture, Node.js applications collapse under production load.
We have seen applications that handle 1,000 requests per second in development grind to 100 requests per second in production, not because of the framework but because of architectural decisions made early that were never revisited.
Memory management is equally critical. Node.js applications run in a single process with a single heap. Without explicit limits, memory usage grows until the operating system kills the process. The symptoms are subtle at first: gradual slowdown, occasional restarts, then cascading failures.
The Cluster Module
Use the Node.js cluster module or PM2 to fork multiple processes per machine, one per core. The operating system schedules each process on its own core, turning a single-threaded runtime into a multi-core server. This is essential for any production deployment.
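A minimal sketch using the built-in cluster module; the port and the blind restart-on-exit policy are illustrative placeholders, not a production supervisor:

```js
// Fork one worker per core; restart any worker that dies.
const cluster = require('node:cluster');
const http = require('node:http');
const os = require('node:os');

if (cluster.isPrimary) { // Node 16+; use cluster.isMaster on older versions
  for (let i = 0; i < os.cpus().length; i++) cluster.fork();

  // Replace crashed workers so capacity is not silently lost.
  cluster.on('exit', (worker, code) => {
    console.log(`worker ${worker.process.pid} exited (${code}), forking replacement`);
    cluster.fork();
  });
} else {
  // Workers share the listening socket; the primary distributes connections.
  http.createServer((req, res) => {
    res.end(`handled by pid ${process.pid}\n`);
  }).listen(3000);
}
```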
Worker Threads for CPU Work
For CPU-intensive tasks (image processing, PDF generation, complex calculations), use worker threads. They run on separate threads, off the event loop, so the main thread stays responsive for I/O operations.
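A minimal sketch with the built-in worker_threads module; the naive Fibonacci is a stand-in for real CPU work:

```js
const { Worker, isMainThread, parentPort, workerData } = require('node:worker_threads');

// Deliberately expensive: stands in for image processing, PDF generation, etc.
function fib(n) {
  return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

if (isMainThread) {
  // The event loop stays free for I/O while the worker burns CPU.
  const worker = new Worker(__filename, { workerData: 40 });
  worker.on('message', (result) => console.log('fib(40) =', result));
  worker.on('error', (err) => console.error('worker failed:', err));
} else {
  parentPort.postMessage(fib(workerData));
}
```

For recurring workloads, a pool of long-lived workers avoids paying thread startup cost on every task.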
Memory Management
Set explicit memory limits with --max-old-space-size. Monitor heap usage in production. Profile memory leaks with clinic.js or the built-in inspector. The most common leak sources are: unclosed connections, growing caches without eviction, and event listeners that are never removed.
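One lightweight sketch of that monitoring: cap the heap at startup and sample usage on an interval, so a leak shows up as a trend rather than a surprise OOM kill. The 512 MB cap and one-minute interval are arbitrary choices:

```js
// Start with an explicit heap cap: node --max-old-space-size=512 app.js
// Then log heap usage periodically; a steadily rising line is a leak.
const toMb = (n) => (n / 1024 / 1024).toFixed(1);

setInterval(() => {
  const { heapUsed, heapTotal, rss } = process.memoryUsage();
  console.log(`heap ${toMb(heapUsed)}/${toMb(heapTotal)} MB, rss ${toMb(rss)} MB`);
}, 60_000).unref(); // unref() so this timer never keeps the process alive on its own
```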
Error Handling and Resilience
Unhandled Promise rejections crash Node.js processes (the default behaviour since Node.js 15). Always attach catch handlers to async operations. Use process.on('unhandledRejection') as a safety net, but never rely on it; it is a last resort, not a strategy.
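A sketch of that layering, assuming crash-and-restart under a supervisor (PM2, systemd, Kubernetes) is the recovery policy; doWork() is a placeholder for real async work:

```js
// Last-resort handlers: log and exit so the supervisor restarts the process
// in a known-good state. Do not try to keep serving traffic from here.
process.on('unhandledRejection', (reason) => {
  console.error('unhandled rejection:', reason);
  process.exit(1);
});
process.on('uncaughtException', (err) => {
  console.error('uncaught exception:', err);
  process.exit(1);
});

async function doWork() {
  throw new Error('simulated failure'); // placeholder for real async work
}

// The real fix is local: every async operation gets its own handler,
// so errors are dealt with in context and never reach the safety net.
async function handleRequest() {
  try {
    await doWork();
  } catch (err) {
    console.error('request failed:', err);
  }
}

handleRequest();
```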
Streaming and Backpressure
Node.js streams are powerful but dangerous. Backpressure, where a fast producer overwhelms a slow consumer, causes memory spikes and eventual crashes. Always handle it explicitly: when write() returns false the destination's buffer is full, so stop producing and resume on the 'drain' event. The pipe() method handles this automatically, but manual stream programming requires explicit backpressure management.
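A minimal sketch of the manual pattern, writing a large file without buffering it all in memory: stop when write() returns false, continue on 'drain':

```js
const fs = require('node:fs');

function writeManyLines(dest, done) {
  let i = 0;
  function writeSome() {
    while (i < 1_000_000) {
      const ok = dest.write(`line ${i++}\n`);
      if (!ok) {
        // The destination's buffer is full: wait for it to drain, then resume.
        dest.once('drain', writeSome);
        return;
      }
    }
    dest.end(done);
  }
  writeSome();
}

writeManyLines(fs.createWriteStream('out.txt'), () => console.log('done'));
```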
We use streams for file uploads, real-time data processing, and large response generation. The key pattern is to never buffer entire datasets in memory. Stream data from source to destination, transforming it incrementally. For example, when generating a CSV export of a million records, stream rows from the database through a transform stream that formats them as CSV, then pipe to the HTTP response. Memory usage remains constant regardless of dataset size.
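A sketch of that export pattern; queryStream() is a placeholder for whatever row stream your database driver provides (pg-query-stream, for example), and the column layout is invented:

```js
const { Transform, pipeline } = require('node:stream');
const http = require('node:http');

// Formats one database row at a time into a CSV line.
const toCsv = () => new Transform({
  objectMode: true, // rows in, strings out
  transform(row, _enc, cb) {
    cb(null, `${row.id},${row.name},${row.total}\n`);
  },
});

http.createServer((req, res) => {
  res.setHeader('Content-Type', 'text/csv');
  res.write('id,name,total\n');
  // pipeline() wires up backpressure and propagates errors from any stage.
  pipeline(queryStream('SELECT id, name, total FROM orders'), toCsv(), res, (err) => {
    if (err) res.destroy(err); // abort rather than send a truncated file
  });
}).listen(3000);
```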
Monitoring and Observability
Production Node.js applications need three layers of visibility: application metrics (request rates, response times, error rates), system metrics (CPU, memory, event loop lag), and business metrics (active users, transactions per minute, revenue per request). We instrument applications with the prom-client library for Prometheus metrics, exposing a /metrics endpoint that scrapers poll every 15 seconds.
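A minimal sketch of that endpoint with prom-client; the histogram name and label set are illustrative, not a prescribed schema:

```js
const http = require('node:http');
const client = require('prom-client');

client.collectDefaultMetrics(); // process CPU, memory, event loop lag, GC stats

const httpDuration = new client.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Request latency in seconds',
  labelNames: ['method', 'route', 'status'],
});

http.createServer(async (req, res) => {
  if (req.url === '/metrics') {
    res.setHeader('Content-Type', client.register.contentType);
    res.end(await client.register.metrics()); // async in prom-client v13+
    return;
  }
  // In a real app, label with the route template, not the raw URL,
  // to keep label cardinality bounded.
  const end = httpDuration.startTimer({ method: req.method, route: req.url });
  res.end('ok');
  end({ status: res.statusCode });
}).listen(3000);
```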
Our Recommendation
Cluster by default. Worker threads for CPU work. Explicit memory limits. Systematic profiling for leaks. Comprehensive error handling. Streaming for large data. Monitoring for everything. And always profile before you optimise — premature optimisation wastes time and often makes things worse.
The Node.js ecosystem is mature, the tooling is excellent, and the performance is competitive with any runtime when architected correctly. The failures we see are rarely due to Node.js itself; they are due to assumptions carried over from other platforms, insufficient operational visibility, or architectural decisions made under time pressure that were never revisited.