WebSocket Load Testing with k6: Connections, Heartbeats, and Backpressure

A guide to WebSocket load testing with k6: concurrent connections, heartbeats, and backpressure, with k6 WebSocket API usage and REST pairing patterns.

Your REST load test shows green p99 latency—then production melts when ten thousand clients open WebSocket feeds during market open. HTTP SLOs do not predict socket exhaustion, heartbeat misalignment, or reconnect storms. That gap is where dedicated WebSocket load testing earns its place.

The WebSocket protocol (RFC 6455) upgrades an HTTP connection to a bidirectional channel. Load is measured in concurrent sockets, messages per second, subscription fan-out, and reconnect churn, not just http_req_duration on REST calls. k6's built-in k6/ws module, used in the script below, implements client behavior suitable for API gateways, gaming backends, notification hubs, and collaborative editors; k6 also offers a newer WebSocket module with a browser-style API, which this guide does not cover.

In this guide you will learn which stress dimensions matter, how to pair REST auth with socket sessions in k6, and which signals should block a release before you coordinate a socket storm in shared staging.

Why WebSocket performance is a connection problem—not a request problem

REST benchmarks assume short-lived transactions. WebSockets hold state:

File descriptors and memory scale with open connections—reverse proxies enforce worker_connections and cloud L4 limits long before your app CPU maxes out.
Heartbeats and idle timeouts differ between client, CDN, API gateway, and origin; misaligned ping/pong intervals create silent drops that look like "random disconnects" in dashboards.
Backpressure appears when publishers outpace consumer processing—queues grow, GC pauses spike, and tail latency explodes without HTTP 5xx spikes.
Reconnect thundering herds after deploys or network blips replay subscriptions simultaneously—REST soak tests never exercise that pattern.
Mixed auth models issue tokens over HTTP then pass them on the upgrade handshake—load tests must cover both phases (minimal API script template).
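
The heartbeat point above can be checked on paper before any load runs. The sketch below is a hypothetical helper: findHeartbeatRisks, the layer names, and the timeout values are illustrative assumptions, not taken from any real proxy or gateway config. It flags every layer whose idle timeout leaves less than a chosen safety margin over the client ping interval.

```javascript
// Hypothetical helper: flag layers whose idle timeout is too close to the ping interval.
// Layer names and timeouts are illustrative assumptions, not real defaults.
function findHeartbeatRisks(pingIntervalMs, layers, safetyFactor = 2) {
  // A layer is at risk if it could time out before `safetyFactor` pings arrive,
  // meaning a single delayed or lost ping is enough to trigger an idle close.
  return layers
    .filter((layer) => layer.idleTimeoutMs < pingIntervalMs * safetyFactor)
    .map((layer) => layer.name);
}

const layers = [
  { name: 'cdn', idleTimeoutMs: 100000 },
  { name: 'gateway', idleTimeoutMs: 60000 },
  { name: 'origin', idleTimeoutMs: 45000 },
];

// With a 30 s ping, only the 45 s origin timeout is under the 60 s margin.
console.log(findHeartbeatRisks(30000, layers));
```

Running this against the real timeout values for each hop is a cheap way to decide which ping interval the load script should use.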

Think of REST load tests as measuring checkout lane speed; WebSocket load tests measure how many lanes stay open for hours while cars keep arriving.

When REST passes but sockets fail under the same user count

Functional tests prove a single client can subscribe. They do not prove that 5,000 concurrent feed:orders subscriptions keep p99 message latency under 200 ms after a rolling deploy. Tag logical feeds so summaries isolate hotspots (observability tags); aggregate socket metrics hide one noisy channel drowning out the others.
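
One way to isolate feeds is to give each one its own threshold. k6 thresholds accept a tag filter in the form metric{tag:value}. The sketch below builds those keys from a map of latency budgets; buildFeedThresholds, the trades feed, and the budget numbers are assumptions for illustration.

```javascript
// Sketch: build k6 threshold keys of the form 'metric{tag:value}', one per feed.
// Feed names and latency budgets are illustrative assumptions.
function buildFeedThresholds(budgetsMs) {
  const thresholds = {};
  for (const [feed, p99Ms] of Object.entries(budgetsMs)) {
    thresholds[`ws_message_latency{feed:${feed}}`] = [`p(99)<${p99Ms}`];
  }
  return thresholds;
}

const feedThresholds = buildFeedThresholds({ orders: 200, trades: 150 });
// In a k6 script: export const options = { thresholds: { ...feedThresholds } };
console.log(JSON.stringify(feedThresholds, null, 2));
```

For the filter to match, the samples must carry the feed tag. Passing tags explicitly when recording a sample, for example msgLatency.add(value, { feed: 'orders' }), is a safe way to make sure they do.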

Compare with SSE and long-polling when server→client streams suffice; not every API needs bidirectional sockets, but those that do fail differently.

Practical k6 implementation: connections, heartbeats, and mixed scenarios

Model three behaviors explicitly: steady concurrent connections, heartbeat-aligned idle periods, and a reconnect burst after a simulated outage. The illustrative script below covers the first two; a reconnect burst needs its own scenario with a later start time. Adapt URLs, tokens, and message schemas to your environment.

What this example demonstrates:

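REST pre-step for auth: setup() fetches one token before any socket connects, and every VU reuses it. If each production client holds its own token, move the request into the VU function instead.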
Tagged subscriptions: feed:orders isolates metrics per logical channel in summaries.
Heartbeat loop: periodic socket.ping() aligned to production keepalive—not fire-and-forget connect-only tests.
Custom metrics: ws_message_latency tracks server push delay separately from HTTP upgrade time.

import ws from 'k6/ws';
import http from 'k6/http';
import { check, sleep } from 'k6';
import { Trend } from 'k6/metrics';

const BASE = __ENV.API_BASE || 'https://staging.example.com';
const WS_URL = __ENV.WS_URL || 'wss://staging.example.com/ws';
const msgLatency = new Trend('ws_message_latency', true);

export const options = {
  scenarios: {
    steady_connections: {
      executor: 'ramping-vus',
      startVUs: 0,
      stages: [
        { duration: '2m', target: 500 },
        { duration: '5m', target: 500 },
        { duration: '1m', target: 0 },
      ],
      tags: { phase: 'steady', feed: 'orders' },
      exec: 'ordersFeed',
    },
  },
  thresholds: {
    ws_connecting: ['p(95)<500'],
    ws_msgs_received: ['rate>0'],
    ws_message_latency: ['p(99)<200'],
    http_req_failed: ['rate<0.01'],
  },
};

export function setup() {
  const res = http.post(`${BASE}/v1/auth/token`, JSON.stringify({ client_id: 'load-gen' }), {
    headers: { 'Content-Type': 'application/json' },
  });
  check(res, { 'auth ok': (r) => r.status === 200 });
  return { token: res.json('access_token') };
}

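export function ordersFeed(data) {
  const url = `${WS_URL}?feed=orders`;
  const params = { tags: { feed: 'orders', phase: 'steady' } };

  // ws.connect blocks until the socket closes, then returns the upgrade response.
  const res = ws.connect(url, params, function (socket) {
    socket.on('open', () => {
      socket.send(JSON.stringify({ type: 'subscribe', token: data.token, channel: 'orders' }));
    });

    socket.on('message', (msg) => {
      let payload;
      try {
        payload = JSON.parse(msg);
      } catch (e) {
        return; // skip non-JSON frames instead of throwing inside the handler
      }
      // Assumes server_ts is epoch milliseconds and that generator and server clocks are in sync.
      if (payload.server_ts) {
        msgLatency.add(Date.now() - payload.server_ts);
      }
    });

    // k6/ws replies to server pings automatically; the Socket object has no pong() method.
    socket.on('error', (e) => console.error(`ws error: ${e.error()}`));

    // Client heartbeat every 30 s; match your production keepalive interval.
    socket.setInterval(() => socket.ping(), 30000);

    // Hold the socket open for 6 minutes, then close it cleanly.
    socket.setTimeout(() => socket.close(), 360000);
  });

  check(res, { 'upgrade status is 101': (r) => r && r.status === 101 });
  sleep(1);
}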

Patterns that work

Ramp connections gradually—instant max VUs finds proxy limits without telling you which layer failed.
Mirror production keepalive intervals on ping/pong and proxy idle timeouts.
Run mixed HTTP + WebSocket scenarios so connection pools and token TTL interact realistically (multi-step flows).
Add a reconnect scenario after steady state—simulate gateway restart or DNS flip (canary metrics).
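
The reconnect pattern can be expressed as a second scenario that starts once the steady phase is over. The sketch below only builds the scenario object; reconnectBurstScenario, the reconnectBurst exec name, and the numbers are hypothetical, and you would still need to write the exec function that opens and closes sockets.

```javascript
// Sketch: a scenario that starts after the steady phase and has every VU
// connect a fixed number of times, imitating clients reconnecting after an outage.
// 'reconnectBurst' is a hypothetical exec function you would define yourself.
function reconnectBurstScenario({ startTime, vus, reconnectsPerVu }) {
  return {
    executor: 'per-vu-iterations', // each VU runs a fixed number of iterations
    startTime, // offset from test start, e.g. '8m'
    vus,
    iterations: reconnectsPerVu,
    maxDuration: '2m',
    exec: 'reconnectBurst',
    tags: { phase: 'reconnect', feed: 'orders' },
  };
}

const scenarios = {
  reconnect_burst: reconnectBurstScenario({ startTime: '8m', vus: 500, reconnectsPerVu: 3 }),
};
console.log(JSON.stringify(scenarios, null, 2));
```

Because per-vu-iterations starts all VUs at once, the burst arrives as a spike rather than a ramp, which is the point of this scenario.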

Anti-patterns to avoid

Measuring only the HTTP upgrade request—not sustained message latency.
Opening sockets without subscribing—empty connections lie about server work.
Ignoring generator FD limits, or launching socket storms in shared staging without coordinating first (ethical testing).
Using the same VU count as REST tests without understanding connection memory cost.
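
A rough capacity check on the load generator helps avoid the FD anti-pattern. The sketch below is back-of-the-envelope arithmetic under stated assumptions: one file descriptor per socket, one ephemeral source port per connection to a single destination IP and port, and a hypothetical reserve of 256 descriptors for everything else. Read the real values from ulimit -n and your OS port range.

```javascript
// Rough capacity check for the load generator. All numbers are assumptions;
// read the real values from `ulimit -n` and the OS ephemeral port range.
function generatorHeadroom({ targetConnections, fdLimit, ephemeralPorts, reservedFds = 256 }) {
  // Each client socket uses one file descriptor and, for a single destination
  // IP:port, one ephemeral source port.
  const usableFds = fdLimit - reservedFds;
  const maxConnections = Math.min(usableFds, ephemeralPorts);
  return {
    maxConnections,
    fits: targetConnections <= maxConnections,
    limitingFactor: usableFds <= ephemeralPorts ? 'file descriptors' : 'ephemeral ports',
  };
}

// A 1024 FD limit caps the generator far below a 10,000 connection target.
console.log(generatorHeadroom({ targetConnections: 10000, fdLimit: 1024, ephemeralPorts: 28232 }));
```

If the result says the target does not fit, raise the limit or split the test across generators before blaming the system under test.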

Pro tip (example command): surface WebSocket-specific trends alongside HTTP failures in one summary.

k6 run ws-orders-feed.js --summary-trend-stats="p(95),p(99)" --tag env=staging

What this command demonstrates: percentile trends for custom ws_message_latency and built-in ws_* metrics appear in the exit summary for release gate reviews.
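
For release gate reviews, the same summary can also be checked in code. This is a minimal sketch, assuming the summary object has the metrics and thresholds shape that k6 passes to handleSummary; failedThresholds and the sample data are hypothetical.

```javascript
// Sketch of a release-gate helper: list every threshold that did not pass.
function failedThresholds(summary) {
  const failures = [];
  for (const [metric, info] of Object.entries(summary.metrics || {})) {
    for (const [expr, result] of Object.entries(info.thresholds || {})) {
      if (!result.ok) failures.push(`${metric}: ${expr}`);
    }
  }
  return failures;
}

// Hand-written sample shaped like the object k6 passes to handleSummary.
const sampleSummary = {
  metrics: {
    ws_message_latency: {
      values: { 'p(95)': 120, 'p(99)': 240 },
      thresholds: { 'p(99)<200': { ok: false } },
    },
    ws_connecting: {
      values: { 'p(95)': 310, 'p(99)': 420 },
      thresholds: { 'p(95)<500': { ok: true } },
    },
    http_reqs: { values: { count: 1 } }, // metrics without thresholds are skipped
  },
};

console.log(failedThresholds(sampleSummary));
```

In a k6 script you would call this from an exported handleSummary(data) function. Note that defining handleSummary replaces the default end-of-test summary unless you return one yourself.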

Decision framework: which socket stress to run when

Situation	Recommended action
New WebSocket API before launch	Ramp to target concurrent connections + 30m steady with heartbeats
Post-deploy reconnect risk	Steady load + forced disconnect burst scenario
Feed-specific SLOs	Separate scenarios per `feed:*` tag with isolated thresholds
Gateway timeout tuning	Sweep idle durations; log silent close codes
REST-only staging parity	Pair HTTP auth scenario with socket scenario in one test plan
Unsure if sockets are required	Compare SSE load patterns first

Run steady-connection soak if production holds sockets for minutes or hours—find FD and memory leaks early (soak testing playbook).

Run reconnect burst if clients replay subscriptions after deploys, cell handoffs, or regional failovers.
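
If your real clients back off before reconnecting, the load script should do the same, otherwise the test overstates the herd. The sketch below assumes full-jitter exponential backoff; reconnectDelayMs and its default base and cap are illustrative, so substitute whatever your client SDK actually does.

```javascript
// Full-jitter exponential backoff: the delay is uniform in [0, min(cap, base * 2^attempt)).
// `random` is injectable so the function can be tested deterministically.
function reconnectDelayMs(attempt, { baseMs = 500, capMs = 30000, random = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Print the upper bound of the delay window for the first few attempts.
for (let attempt = 0; attempt < 8; attempt++) {
  const upperBound = Math.min(30000, 500 * 2 ** attempt);
  console.log(`attempt ${attempt}: delay in [0, ${upperBound}) ms`);
}
```

In a k6 script the returned value, divided by 1000, can be passed to sleep() before the next connect attempt.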

Run mixed REST + WS if tokens expire mid-session and refresh paths interact with open sockets.
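
The refresh decision itself is simple arithmetic. The sketch below assumes the script knows when the token was issued and its TTL; shouldRefreshToken and the 30-second lead time are hypothetical choices, not part of any auth API.

```javascript
// Decide whether a token should be refreshed before the next socket message.
// refreshLeadMs is a hypothetical safety window before actual expiry.
function shouldRefreshToken(issuedAtMs, ttlMs, nowMs, refreshLeadMs = 30000) {
  return nowMs >= issuedAtMs + ttlMs - refreshLeadMs;
}

const issuedAt = 1000000;
const ttl = 300000; // 5 minute token, an illustrative value
console.log(shouldRefreshToken(issuedAt, ttl, issuedAt + 60000)); // early in the session
console.log(shouldRefreshToken(issuedAt, ttl, issuedAt + 280000)); // inside the refresh window
```

In a mixed scenario, a check like this inside the socket's interval callback would trigger the HTTP refresh call while the socket stays open, which is the interaction this test is meant to exercise.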

Observability, documentation, and next steps

Socket load tests only help if on-call can interpret close codes and lag spikes. Before you scale traffic:

Document target concurrent connections, messages/sec, and heartbeat intervals—with links to proxy and gateway configs.
Record generator FD and ephemeral port limits; note which host ran the test.
Tag feeds and phases consistently for APM and k6 export correlation (correlation IDs).
Alert on ws_message_latency p99 divergence vs baseline—not just HTTP 5xx rate.
Coordinate with platform teams; document the rollback plan in case shared staging stacks saturate.
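
The p99 divergence alert in the list above comes down to a comparison against a stored baseline. The sketch below uses a nearest-rank percentile over raw samples and a hypothetical 20% tolerance; k6 and most APM tools compute percentiles with their own methods, so expect small differences from their reported values.

```javascript
// Nearest-rank percentile over raw latency samples (milliseconds).
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(0, rank - 1)];
}

// Flag a run whose p99 exceeds the baseline by more than `tolerance` (a fraction).
function p99Diverged(currentSamples, baselineP99Ms, tolerance = 0.2) {
  const currentP99 = percentile(currentSamples, 99);
  return { currentP99, diverged: currentP99 > baselineP99Ms * (1 + tolerance) };
}

// Illustrative data: latencies of 1..100 ms against an 80 ms baseline.
const samples = Array.from({ length: 100 }, (_, i) => i + 1);
console.log(p99Diverged(samples, 80));
```

The baseline value should come from a known-good run with the same connection count and feed mix, otherwise the comparison is not meaningful.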

How Performate simplifies WebSocket load testing

Glue code between REST login and socket subscriptions slows teams exactly when product wants "just test the feed." Below is a concrete workflow example for the orders WebSocket feed this article discusses.

Example: REST auth + orders subscription in one workspace

Import HTTP auth requests from Postman—POST /v1/auth/token with the same pre-request logic QA already trusts. Problem solved: token acquisition is not a handwritten one-off in every socket script.
Add the WebSocket subscription step chained after auth in the scenario builder—wss://.../ws?feed=orders. Problem solved: PMs see the end-user journey, not an isolated protocol demo.
Configure ramping VUs to 500 concurrent connections over two minutes, five-minute steady state. Problem solved: visual tuning without guessing stage syntax under deadline pressure.
Apply tags feed:orders and phase:steady on the scenario. Problem solved: reports match the k6 tag model for filtered SLO review.
Set thresholds on connection time and custom message latency trends. Problem solved: release gates beyond "it connected once locally."
Export the k6 script for nightly CI at reduced VUs (CI/CD load testing). Problem solved: desktop iteration and pipeline smoke stay aligned.

That workflow maps directly to the goal of this post: runnable scenarios and shareable reports without days of glue code.

Closing takeaway

WebSocket load testing is a connection, heartbeat, and backpressure problem—REST green checks do not transfer. Ramp concurrent sockets honestly, align keepalives with production, tag every feed, and stress reconnect paths before your next deploy.

Run a steady-connection soak against staging this week—and note which layer closes sockets first: your app, the gateway, or the generator.

