Day 7: Observability in Node Apps (Logs, Metrics, Traces with OpenTelemetry)

When building production-ready Node.js applications, understanding what's happening under the hood is critical. Observability is the key to unlocking this understanding. It helps us monitor, debug, and optimize our apps in real time. Today, we’ll take a deep dive into logs, metrics, and traces, and see how OpenTelemetry can unify them.

What is Observability?

Observability is the ability to infer your system’s internal state from the signals it emits. Unlike simple monitoring, which tracks a fixed set of known metrics such as CPU or memory usage, observability lets you ask new questions about your application’s behavior, so you can identify and resolve issues you did not anticipate.

The three pillars of observability are:

Logs – Record discrete events in your app.
Metrics – Quantitative data about your app’s performance.
Traces – Track requests as they flow through your system, showing how components interact.

Think of it like a doctor’s tools: logs are your patient’s history, metrics are their vitals, and traces are their movement through the hospital.

1️⃣ Logging in Node.js

Logs are the simplest and most widely used observability tool. They help us track events, errors, and debug information.

Best Practices:

Use structured logs (JSON format) for better parsing and querying.
Include contextual information like request ID, user ID, or transaction ID.
Avoid excessive logging in production: set the log level to info or higher so debug output does not degrade performance.

Example with winston:

import winston from "winston";

const logger = winston.createLogger({
  level: "info",
  // Timestamped JSON lines are easy to parse and query
  format: winston.format.combine(
    winston.format.timestamp(),
    winston.format.json()
  ),
  transports: [
    new winston.transports.Console(),
    new winston.transports.File({ filename: "app.log" })
  ]
});

// The second argument is merged into the JSON output as structured fields
logger.info("Server started", { port: 3000 });
logger.error("Database connection failed", { retrying: true });
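
The best practice of attaching a request ID to every log line can be sketched with Node's built-in AsyncLocalStorage, which carries context across awaits without passing it through every function call. This is a minimal sketch using only the standard library: formatLog and withRequestContext are hypothetical helper names, and in a real app the formatted fields would be handed to a logger such as winston rather than returned as a string.

```javascript
import { AsyncLocalStorage } from "node:async_hooks";
import { randomUUID } from "node:crypto";

// Holds per-request context that survives across awaits and callbacks.
const requestContext = new AsyncLocalStorage();

// Hypothetical helper: builds one structured log line, merging in the
// context of whichever request is currently running (if any).
function formatLog(level, message, fields = {}) {
  const context = requestContext.getStore() ?? {};
  return JSON.stringify({ level, message, ...context, ...fields });
}

// Hypothetical helper: runs `handler` with a fresh request ID attached.
function withRequestContext(handler) {
  return requestContext.run({ requestId: randomUUID() }, handler);
}

// The nested function never receives the request ID as an argument,
// yet its log line still includes it.
async function loadUser(userId) {
  await new Promise((resolve) => setTimeout(resolve, 10));
  return formatLog("info", "User loaded", { userId });
}

withRequestContext(() => loadUser(42)).then((line) => console.log(line));
```

In an Express app, withRequestContext would typically wrap the call to next() in a middleware, so every log line written while handling that request shares one ID.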

2️⃣ Metrics in Node.js

Metrics give quantitative insights into how your app is performing. Typical metrics include:

Response time
Error rates
CPU/memory usage
Request counts

Example using prom-client for Prometheus:

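import express from "express";
import client from "prom-client";

const app = express();

// Collect default process metrics (CPU, memory, event loop lag, ...)
client.collectDefaultMetrics();

// startTimer() reports elapsed time in seconds, so name the metric accordingly
const httpRequestDurationSeconds = new client.Histogram({
  name: "http_request_duration_seconds",
  help: "Duration of HTTP requests in seconds",
  labelNames: ["method", "route", "status_code"],
  buckets: [0.01, 0.05, 0.1, 0.5, 1, 5]
});

// Use in Express middleware
app.use((req, res, next) => {
  const end = httpRequestDurationSeconds.startTimer();
  res.on("finish", () => {
    // Label with the matched route pattern, not the raw path,
    // to keep the number of distinct label values small
    const route = req.route?.path ?? "unmatched";
    end({ method: req.method, route, status_code: res.statusCode });
  });
  next();
});

// Endpoint for Prometheus to scrape
app.get("/metrics", async (req, res) => {
  res.set("Content-Type", client.register.contentType);
  res.end(await client.register.metrics());
});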

Prometheus can then scrape these metrics from an HTTP endpoint (conventionally /metrics), and Grafana can visualize them in dashboards.
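
To make it clearer what a histogram actually stores, here is a minimal sketch of Prometheus-style cumulative buckets in plain JavaScript. It is not prom-client's implementation: createHistogram is a hypothetical stand-in, and the bucket bounds are arbitrary values chosen for illustration.

```javascript
// Hypothetical stand-in for a Prometheus-style histogram: each bucket counts
// every observation less than or equal to its upper bound (cumulative counts).
function createHistogram(upperBounds) {
  const bounds = [...upperBounds].sort((a, b) => a - b);
  const counts = new Array(bounds.length).fill(0);
  let sum = 0;
  let count = 0;

  return {
    observe(value) {
      sum += value;
      count += 1;
      bounds.forEach((bound, i) => {
        if (value <= bound) counts[i] += 1;
      });
    },
    snapshot() {
      const buckets = bounds.map((le, i) => ({ le, count: counts[i] }));
      // The implicit catch-all bucket always equals the total count.
      buckets.push({ le: Infinity, count });
      return { buckets, sum, count };
    }
  };
}

// Record four request durations, in seconds.
const histogram = createHistogram([0.1, 0.5, 1]);
[0.05, 0.3, 0.3, 2].forEach((seconds) => histogram.observe(seconds));
console.log(histogram.snapshot());
```

With these four observations the bucket counts come out as 1, 3, 3, and 4. Because the counts are cumulative, the 2-second request shows up only in the catch-all bucket, which is why bucket bounds should bracket the latencies you care about.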

3️⃣ Tracing with OpenTelemetry

Traces help you see how a request moves through your system, which is crucial for microservices or distributed architectures. OpenTelemetry (OTel) is an open, vendor-neutral standard, with APIs, SDKs, and a wire protocol (OTLP), for generating and exporting logs, metrics, and traces.

Setup example for Node.js:

import { NodeTracerProvider } from "@opentelemetry/sdk-trace-node";
import { SimpleSpanProcessor, ConsoleSpanExporter } from "@opentelemetry/sdk-trace-base";

// Span processors are passed to the constructor. SDK 2.x removed
// provider.addSpanProcessor(); on older 1.x releases, call that method instead.
// SimpleSpanProcessor exports each span as it ends, which suits demos;
// prefer BatchSpanProcessor in production.
const provider = new NodeTracerProvider({
  spanProcessors: [new SimpleSpanProcessor(new ConsoleSpanExporter())]
});
provider.register();

const tracer = provider.getTracer("example-node-app");

// Example span
const main = async () => {
  const span = tracer.startSpan("main-operation");
  try {
    // simulate work
    await new Promise(resolve => setTimeout(resolve, 100));
  } finally {
    // Always end the span, even if the work throws
    span.end();
  }
};

main();

You can integrate OpenTelemetry with Jaeger, Zipkin, or cloud observability tools like AWS X-Ray for full distributed tracing.
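
For a trace to span several services, each outgoing request has to carry the trace's identity. OpenTelemetry's default propagator uses the W3C Trace Context traceparent header for this. The sketch below shows the header's shape using only the standard library. parseTraceparent and childTraceparent are hypothetical helpers, not OpenTelemetry APIs, and the validation is simplified (for example, it does not reject the reserved version ff).

```javascript
import { randomBytes } from "node:crypto";

// W3C Trace Context header: version-traceId-parentSpanId-flags, all lowercase hex.
const TRACEPARENT_PATTERN = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Hypothetical helper: parse an incoming header, or return null if it is malformed.
function parseTraceparent(header) {
  const match = TRACEPARENT_PATTERN.exec(header ?? "");
  if (!match) return null;
  const [, version, traceId, parentSpanId, flags] = match;
  // All-zero IDs are invalid per the spec.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentSpanId)) return null;
  return { version, traceId, parentSpanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Hypothetical helper: build the header a service would send downstream.
// It keeps the caller's trace ID (or starts a new trace) and mints a new span ID.
function childTraceparent(incomingHeader) {
  const parent = parseTraceparent(incomingHeader);
  const traceId = parent?.traceId ?? randomBytes(16).toString("hex");
  const spanId = randomBytes(8).toString("hex");
  const flags = parent ? (parent.sampled ? "01" : "00") : "01";
  return `00-${traceId}-${spanId}-${flags}`;
}

// Example header value taken from the W3C Trace Context specification.
const incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01";
console.log(childTraceparent(incoming));
```

The trace ID stays constant across every hop while the span ID changes, which is how a backend such as Jaeger can stitch spans from different services into one trace. In practice the OpenTelemetry SDK and its HTTP instrumentation handle this header for you.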

Why OpenTelemetry?

OpenTelemetry is becoming the industry standard for observability because it provides:

Unified instrumentation for logs, metrics, and traces
Vendor-neutral solution (works with Prometheus, Jaeger, Datadog, New Relic, etc.)
Easy integration with Node.js apps, including auto-instrumentation packages for common libraries such as HTTP and Express

Putting It All Together

A well-observed Node.js app might look like this:

Logs: Use Winston or Pino to log important events and errors.
Metrics: Expose Prometheus metrics for CPU, memory, and request latency.
Traces: Use OpenTelemetry to trace requests across services, catching bottlenecks and failures.

This triad gives you visibility, diagnostic power, and confidence to run your app at scale.

Summary: Key Takeaways

Observability ≠ Monitoring. It’s about understanding your system.
Structured logging, metrics collection, and tracing are the pillars.
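OpenTelemetry gives you one instrumentation standard for all three signals; you still pair it with backends such as Prometheus, Jaeger, or Grafana for storage and visualization.
Always instrument your app before it goes to production. Debugging live apps without observability is like flying blind.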
