May 12, 2026

By Performate

How Many Virtual Users Should You Start With in k6?

Estimate starting VUs from concurrency targets and arrival rates, avoid saturated generators, and align k6 options with your scenario question.

Someone on Slack said “start at 500 VUs” because that number worked on another product. Your API is open-loop, your staging pool is shared, and the script still has an auth bug at iteration two. The test saturates the generator, annoys neighbors, and answers nothing about production headroom.

There is no universal starting VU count. The right number depends on whether you model closed workloads (fixed concurrent sessions) or open workloads (requests arriving independent of session count). In this guide you will learn how to translate analytics into a first k6 configuration, when VUs matter less than arrival rate, and how to read the summary when your “obvious” VU count is wrong.

If you are still choosing test type, read stress vs load vs spike first. Pair pacing decisions with think time and concurrency so VU counts reflect human gaps, not bot loops.

Why the same VU count means different things in k6

k6 virtual users are concurrent script executors, not literal humans. Executor choice changes what VUs control:

  • Closed models (constant-vus, ramping-vus): VU count ≈ concurrent sessions looping through steps; removing sleep increases throughput for the same VUs.
  • Open models (constant-arrival-rate, ramping-arrival-rate): the target iteration rate is primary; k6 draws on pre-allocated VUs and adds more, up to maxVUs, to hold that rate (arrival-rate executors).
  • Generator limits: when iteration duration rises non-linearly, you may need more maxVUs or a smaller rate—not “more users” by intuition.

Think of VUs as musicians and arrival rate as tempo—same headcount, different rhythm, different sound.
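The link between VUs, iteration time, and throughput in the two models follows Little's Law. Here is a minimal sketch, assuming steady state and an average iteration time measured in a smoke run. The function names and the 2.5 s figure are illustrative and are not part of k6.

```javascript
// Closed model: each VU loops, so throughput is bounded by VUs / iteration time.
function closedThroughput(vus, iterationSec) {
  return vus / iterationSec; // iterations per second
}

// Open model: the rate is fixed, so the VUs kept busy are rate * iteration time.
function vusForRate(ratePerSec, iterationSec) {
  return Math.ceil(ratePerSec * iterationSec);
}

// Assumed numbers: 2 s think time plus roughly 0.5 s of response time.
const iterationSec = 2.5;
console.log(closedThroughput(50, iterationSec)); // 20 iterations/s from 50 VUs
console.log(vusForRate(40, iterationSec));       // 100 VUs busy at 40 iterations/s
```

Under these assumed numbers, removing the sleep would cut iteration time to about 0.5 s, and the same 50 VUs would push roughly 100 iterations/s. That is why a VU count means little until the executor model and pacing are known.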

When high VUs lie about capacity

Jumping straight to high VUs without a smoke pass lets script bugs dominate at scale (beginner guide). On shared staging, oversized closed models also trigger noisy-neighbor failures unrelated to your service. Change one dimension per experiment: VUs, duration, or arrival rate—not all three (baseline regression).

Practical k6 implementation: smoke, closed ramp, and open rate

Start small, prove fidelity, then scale the dimension that matches your question.

Example script (illustrative—not production-ready). Fictional URLs and numbers; adapt to your environment.

What this example demonstrates:

  • Smoke stage at 3 VUs validating auth and sequencing before load stages run.
  • Closed ramp (ramping-vus) when modeling concurrent browser-like sessions.
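  • Open rate (constant-arrival-rate) when the question is a sustained iteration rate (here one request per iteration, so roughly RPS), with think time counted in iteration duration.
  • Failure awareness via a threshold on http_req_failed, plus dropped-iteration awareness by inspecting dropped_iterations in the summary.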
import http from 'k6/http';
import { check, sleep } from 'k6';

const BASE = __ENV.API_BASE || 'https://staging.example.com';

export const options = {
  scenarios: {
    smoke_closed: {
      executor: 'constant-vus',
      vus: Number(__ENV.SMOKE_VUS || 3),
      duration: '2m',
      tags: { phase: 'smoke' },
      exec: 'apiFlow',
    },
    peak_closed: {
      executor: 'ramping-vus',
      startVUs: 0,
      stages: [
        { duration: '2m', target: Number(__ENV.START_VUS || 25) },
        { duration: '5m', target: Number(__ENV.PEAK_VUS || 50) },
        { duration: '2m', target: 0 },
      ],
      startTime: '2m',
      tags: { phase: 'closed_peak' },
      exec: 'apiFlow',
    },
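    sustained_open: {
      executor: 'constant-arrival-rate',
      // rate counts iterations per timeUnit; apiFlow sends one request per
      // iteration, so 40 iterations/s is roughly 40 RPS.
      rate: Number(__ENV.TARGET_RPS || 40),
      timeUnit: '1s',
      duration: '8m',
      // With a 2 s think time, 40 iterations/s keeps 80+ VUs busy, so k6 will
      // allocate beyond these 20 during the run, up to maxVUs.
      preAllocatedVUs: 20,
      maxVUs: Number(__ENV.MAX_VUS || 120),
      startTime: '11m',
      tags: { phase: 'open_rate' },
      exec: 'apiFlow',
    },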
  },
  thresholds: {
    http_req_failed: ['rate<0.01'],
    'http_req_duration{phase:smoke}': ['p(95)<500'],
    'http_req_duration{phase:closed_peak}': ['p(95)<800'],
    'http_req_duration{phase:open_rate}': ['p(95)<700'],
  },
};

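// Shared flow for all three scenarios: one authenticated GET, one check, then think time.
export function apiFlow() {
  const res = http.get(`${BASE}/orders?page=1`, {
    // TOKEN must be passed with -e TOKEN=...; otherwise the header is "Bearer undefined".
    headers: { Authorization: `Bearer ${__ENV.TOKEN}` },
    tags: { route: 'orders' },
  });
  check(res, { 'orders 2xx': (r) => r.status >= 200 && r.status < 300 });
  // In the closed scenarios this sleep paces each VU; in the open scenario it
  // only lengthens iterations, so more VUs are needed to hold the same rate.
  sleep(Number(__ENV.THINK_SEC || 2));
}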

Patterns that work

  • Derive closed VUs from peak concurrent sessions in analytics (not daily uniques), and start at 25–50% of that peak on shared staging.
  • Derive open rates from business RPS/iteration targets; tune maxVUs until dropped iterations stay zero (ramping arrival rate).
  • Short smoke at 1–5 VUs before any ramp; fix checks and tokens early.
  • Compare runs with frozen datasets and environments so VU changes are interpretable.
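The first pattern is simple arithmetic, sketched below. The helper name and the figure of 100 peak concurrent sessions are assumptions for illustration; the fractions come from the 25–50% guidance above.

```javascript
// Hypothetical helper: scale an analytics peak down to a first-run VU target.
function startingVus(peakConcurrentSessions, fraction) {
  if (fraction <= 0 || fraction > 1) {
    throw new RangeError('fraction must be in (0, 1]');
  }
  // Never go below 1 VU, and round up so small peaks still produce load.
  return Math.max(1, Math.ceil(peakConcurrentSessions * fraction));
}

// Assumed analytics figure: 100 peak concurrent sessions.
const peak = 100;
console.log(startingVus(peak, 0.25)); // 25
console.log(startingVus(peak, 0.5));  // 50
```

With that assumed peak, the results line up with the START_VUS and PEAK_VUS defaults in the example script, which you could then override through -e flags.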

Anti-patterns to avoid

  • Copying VU counts from blog posts or other teams without matching executor model.
  • Raising VUs and arrival rate in the same run when triaging regressions.
  • Ignoring dropped_iterations in arrival-rate summaries: k6 delivered less load than requested, so the latency figures describe a lighter test than the one you configured.

Pro tip (example command):

k6 run vu-sizing.js -e SMOKE_VUS=3 -e PEAK_VUS=40 --summary-trend-stats="p(95),p(99)"

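What this command demonstrates: --summary-trend-stats limits the trend columns in the end-of-test summary to p(95) and p(99); the per-phase http_req_duration sub-metrics appear there because the script defines thresholds on each phase tag, so smoke, closed ramp, and open-rate results land in one artifact.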

Decision framework: closed VUs vs open arrival rate

Situation | Recommended starting point
Browser/mobile session loops with think time | 1–5 VU smoke → ramp to 25–50% of peak concurrent sessions
API gateway measured in RPS | constant-arrival-rate at 25–50% target RPS; raise maxVUs if iterations drop
Shared staging with noisy neighbors | Lower fraction (25%); shorter duration; off-peak windows
CI smoke after merges | 1–3 VUs or low rate for 2–3 minutes; strict http_req_failed
Stress finding breaking point | One dimension up per run after stable baseline (stress vs load)

Use closed VUs when concurrency itself is the risk (sessions, carts, websockets).

Use open arrival rate when the contract is throughput and sessions are fungible.

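In open models, when think time increases but the target rate stays fixed, raise maxVUs (and preAllocatedVUs) rather than the rate: each iteration holds a VU longer, so the same rate needs more VUs.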
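The effect of think time on maxVUs in an open model can be sketched with a small builder. The function name, the 1.5× headroom factor, and the iteration times are assumptions for illustration; the returned keys are the standard constant-arrival-rate options used in the script above.

```javascript
// Hypothetical builder for a constant-arrival-rate scenario object.
// iterationSec is an assumed average iteration time (think time + response time).
function openScenario({ ratePerSec, iterationSec, duration, headroom = 1.5 }) {
  const busyVus = Math.ceil(ratePerSec * iterationSec); // VUs occupied at steady state
  return {
    executor: 'constant-arrival-rate',
    rate: ratePerSec,
    timeUnit: '1s',
    duration,
    preAllocatedVUs: busyVus,
    maxVUs: Math.ceil(busyVus * headroom), // slack for latency spikes
  };
}

// Same 40 iterations/s target, two different assumed iteration times.
const shortThink = openScenario({ ratePerSec: 40, iterationSec: 2.5, duration: '8m' });
const longThink = openScenario({ ratePerSec: 40, iterationSec: 4.5, duration: '8m' });
console.log(shortThink.preAllocatedVUs, shortThink.maxVUs); // 100 150
console.log(longThink.preAllocatedVUs, longThink.maxVUs);   // 180 270
```

The target rate never changes between the two calls; only the VU budget does, which is the knob to move when pacing changes.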

What to read in the k6 summary output

Look at iteration duration, dropped iterations (arrival-rate modes), and http_req_failed alongside latency percentiles (metrics). If average iteration time rises non-linearly with modest VU increases, you likely hit app or generator limits—not “wrong VU magic.” Cross-check stakeholder narratives with how to read load test reports once your starting load stabilizes.
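One way to make dropped iterations hard to miss is a handleSummary callback that reports the delivered share. This is a sketch under assumptions: it expects the summary data shape data.metrics[name].values, it is only meaningful for a run with a single arrival-rate scenario (the iterations counter is global), and the sample data is fabricated. In a k6 script the function would be exported, and defining it replaces the default text summary unless you also return one.

```javascript
// Sketch of a k6 handleSummary callback; in a k6 script it would be exported.
function handleSummary(data) {
  const count = (name) =>
    data.metrics[name] ? data.metrics[name].values.count : 0;
  const completed = count('iterations');
  const dropped = count('dropped_iterations'); // treated as 0 if the metric is absent
  const scheduled = completed + dropped;
  const delivered = scheduled === 0 ? 0 : completed / scheduled;
  const line =
    `delivered ${(delivered * 100).toFixed(1)}% of scheduled iterations ` +
    `(${dropped} dropped)\n`;
  return { stdout: line };
}

// Fabricated sample data for illustration only.
const sample = {
  metrics: {
    iterations: { values: { count: 18000 } },
    dropped_iterations: { values: { count: 2000 } },
  },
};
console.log(handleSummary(sample).stdout);
```

If the delivered share is well below 100%, treat the latency percentiles as belonging to a lighter test than the configured rate.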

Pre-run checklist

  • Document whether the scenario is closed (concurrent sessions) or open (target RPS) and why.
  • Record analytics source for peak concurrency or RPS (dashboard name, date range).
  • Run smoke at ≤5 VUs or minimal rate until checks pass on auth and payloads.
  • Set maxVUs on arrival-rate scenarios high enough to avoid silent under-shooting.
  • Change only one sizing knob per experiment until results plateau.
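The last checklist item can be enforced mechanically before a run. Below is a minimal sketch, assuming each run is recorded as a plain object with vus, duration, and rate fields; that record shape and the function name are hypothetical.

```javascript
// Hypothetical pre-run guard: list which sizing knobs differ between two runs.
const KNOBS = ['vus', 'duration', 'rate'];

function changedKnobs(previous, next) {
  return KNOBS.filter((knob) => previous[knob] !== next[knob]);
}

const baseline = { vus: 25, duration: '5m', rate: 40 };
const candidate = { vus: 50, duration: '5m', rate: 40 };
const risky = { vus: 50, duration: '10m', rate: 80 };

console.log(changedKnobs(baseline, candidate)); // one knob changed: vus
console.log(changedKnobs(baseline, risky));     // three knobs changed at once
```

A CI step could refuse to compare two runs when the returned list has more than one entry.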

How Performate helps you find the smallest load that answers the question

Performate wraps k6 with desktop workflows so teams clone scenarios, tweak ramps or arrival targets, and compare runs without rewriting scripts from scratch.

Example: size VUs from smoke to peak in one workspace

  1. Import the Postman or OpenAPI collection for your critical path. Problem solved: requests stay aligned with what QA already validates.
  2. Run a 3-VU smoke in the editor; fix token refresh and sequencing before scaling. Problem solved: script bugs surface cheaply.
  3. Switch executor to ramping-vus with stages copied from analytics (e.g. 25 → 50 VUs). Problem solved: closed-model tuning without hand-editing options blocks.
  4. Duplicate scenario as constant-arrival-rate at target RPS; increase maxVUs until dropped iterations read zero. Problem solved: open-model sizing visible in the same report UI.
  5. Compare runs side by side with exported charts for engineering review.
  6. Export k6 for CI once sizing stabilizes (thresholds examples).

Closing takeaway

Starting VUs are a hypothesis, not a badge. Derive them from executor model and analytics, smoke before you ramp, and scale one dimension at a time until the summary tells a coherent story.

Run smoke this afternoon, then one closed ramp and one open-rate stage—note whether tails move because of the app or because iterations were dropped.

Try Performate free | Book a demo | k6 scenarios documentation
