
Scaling Node.js in Production

Practical strategies for millions of requests per day.

What we'll cover

  • Process management
  • Performance optimization
  • Observability at scale
  • Reliability patterns

The Event Loop

Node.js runs your JavaScript on a single thread.

But that's a feature, not a bug.

Understanding the Event Loop

   ┌───────────────────────────┐
┌─►│           timers          │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │     pending callbacks     │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │       idle, prepare       │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │           poll            │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │           check           │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
└──┤      close callbacks      │
   └───────────────────────────┘

The Golden Rule

Never block the event loop.

Common Blockers

  • Synchronous file I/O
  • CPU-intensive computation
  • Large JSON parsing
  • Complex regex

Cluster Mode

Use all your CPU cores.

const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;

if (cluster.isPrimary) {
  // Fork one worker per core; the primary only manages workers.
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  // Each worker runs its own server; connections are shared across workers.
  http.createServer((req, res) => res.end('ok')).listen(3000);
}

Process Managers

Options for production:

  • PM2
  • systemd
  • Docker + orchestrator
  • Kubernetes

Graceful Shutdown

Don't drop in-flight requests.

process.on('SIGTERM', async () => {
  server.close();           // stop accepting new connections
  await drainConnections(); // let in-flight requests finish
  process.exit(0);
});

Memory Management

V8 has limits.

Old-space default: ~1.5GB on 64-bit systems (older Node versions; newer versions size the heap from available memory).

Raise it with --max-old-space-size.

Detecting Memory Leaks

Signs to watch for:

  • RSS growing over time
  • Increasing GC frequency
  • Longer GC pauses
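A minimal way to watch for these signs: sample `process.memoryUsage()` on an interval and log it, so a steady upward trend in RSS or heap stands out. The 60-second interval is an arbitrary choice for illustration:

```javascript
// Sample process memory so leaks show up as a trend in the logs.
function sampleMemory() {
  const { rss, heapUsed, heapTotal } = process.memoryUsage();
  const mb = (bytes) => Math.round(bytes / 1024 / 1024);
  return { rss_mb: mb(rss), heap_used_mb: mb(heapUsed), heap_total_mb: mb(heapTotal) };
}

// unref() so this timer never keeps the process alive on its own.
setInterval(() => console.log(JSON.stringify(sampleMemory())), 60000).unref();
```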

Heap Snapshots

Take them in production.

Compare before and after.

Find retained objects.

Connection Pooling

Don't create new connections per request.

Reuse them.

Database Pool Sizing

const pool = new Pool({
  max: 20,
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
});

Too few: requests queue.

Too many: database overload.

Caching Strategies

  1. In-memory (fastest, limited)
  2. Redis (distributed, durable)
  3. CDN (edge, static content)
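Tier 1 in miniature: an in-memory cache with per-entry TTL. A sketch only; a real cache would also cap its size (e.g. LRU eviction), which this omits:

```javascript
// Tiny in-memory TTL cache: entries expire, nothing else is evicted.
class TtlCache {
  constructor() {
    this.store = new Map();
  }
  set(key, value, ttlMs) {
    this.store.set(key, { value, expires: Date.now() + ttlMs });
  }
  get(key) {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expires) {
      this.store.delete(key); // lazy expiry on read
      return undefined;
    }
    return entry.value;
  }
}
```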

Cache Invalidation

The two hard problems:

  1. Naming things
  2. Cache invalidation
  3. Off-by-one errors

Structured Logging

JSON, not strings.

logger.info({
  event: 'request_completed',
  duration_ms: 45,
  status: 200,
  path: '/api/users',
});

Metrics That Matter

  • Request rate (throughput)
  • Error rate (reliability)
  • Latency percentiles (performance)
  • Saturation (capacity)

The RED Method

Rate - requests per second

Errors - failed requests

Duration - latency distribution
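RED in miniature: a plain in-process recorder, for illustration only. Real services export these numbers to Prometheus or similar rather than keeping them in an object:

```javascript
// Count requests and errors, record durations, read back percentiles.
const red = { requests: 0, errors: 0, durationsMs: [] };

function recordRequest(statusCode, durationMs) {
  red.requests++;
  if (statusCode >= 500) red.errors++;
  red.durationsMs.push(durationMs);
}

function percentile(p) {
  const sorted = [...red.durationsMs].sort((a, b) => a - b);
  if (sorted.length === 0) return 0;
  const idx = Math.min(sorted.length - 1, Math.floor((p / 100) * sorted.length));
  return sorted[idx];
}
```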

Distributed Tracing

Follow requests across services.

OpenTelemetry is the standard.

Circuit Breakers

Fail fast when downstream is broken.

const breaker = new CircuitBreaker(apiCall, {
  timeout: 3000,
  errorThreshold: 50,
  resetTimeout: 30000,
});

Retry with Backoff

Exponential backoff prevents thundering herd.

const delay = Math.min(
  baseDelay * Math.pow(2, attempt),
  maxDelay
);
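The formula above, wrapped into a complete retry helper with "full jitter" so synchronized clients don't retry in lockstep. `fn`, `baseDelay`, and `maxDelay` are illustrative names, not from any specific library:

```javascript
// Retry an async function with capped exponential backoff and full jitter.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function retryWithBackoff(fn, { retries = 5, baseDelay = 100, maxDelay = 5000 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= retries) throw err; // out of attempts: surface the error
      const cap = Math.min(baseDelay * Math.pow(2, attempt), maxDelay);
      await sleep(Math.random() * cap); // full jitter: uniform in [0, cap)
    }
  }
}
```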

Health Checks

Liveness: Is the process alive?

Readiness: Can it serve traffic?

Load Shedding

When overloaded, reject gracefully.

Better to serve some requests well than all requests poorly.
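One simple shedding policy: cap concurrent in-flight requests and reject the rest with 503 and a Retry-After hint. `MAX_IN_FLIGHT` is an illustrative threshold; production systems often shed on event-loop lag or queue depth instead:

```javascript
// Load-shedding middleware (Express-style signature): reject when full.
const MAX_IN_FLIGHT = 100;
let inFlight = 0;

function shedLoad(req, res, next) {
  if (inFlight >= MAX_IN_FLIGHT) {
    res.writeHead(503, { 'Retry-After': '1' });
    res.end('overloaded');
    return;
  }
  inFlight++;
  res.on('finish', () => { inFlight--; }); // release the slot when done
  next();
}
```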

Key Takeaways

  1. Understand and respect the event loop
  2. Horizontal scaling is your friend
  3. Observability is not optional
  4. Plan for failure

Resources

  • github.com/gruzewski/nodejs-scaling-examples
  • Node.js documentation
  • OpenTelemetry docs

Thank You

Questions?

@gruzewski