Introduction
Performance issues rarely fail loudly in development, but they surface quickly in production. That is why I use K6 for lightweight, developer-friendly load testing that fits directly into modern CI/CD workflows. In this guide, I will walk through practical K6 patterns I use for API performance testing in real QA cycles.
Why Performance Testing Matters
Performance issues are among the top reasons users abandon applications. A slow response time can lead to:
- User Abandonment: Studies show users abandon apps that take more than 3 seconds to load
- Business Impact: Reduced conversion rates and revenue loss
- Brand Damage: Negative reviews and reputation damage
- Infrastructure Costs: Inefficient resource utilization
Performance testing helps identify bottlenecks before they impact users, allowing you to optimize database queries, reduce API response times, and ensure scalability.
Why K6 for QA Teams?
- Code-first approach: Tests are plain JavaScript, easy to review and version-control
- Fast setup: Minimal tooling and very quick local execution
- CI-friendly: Clear pass/fail with thresholds and exit codes
- Readable metrics: Built-in latency, error rate, throughput, and percentile tracking
- Scalable strategy: Same script can run locally, in containers, or cloud runners
Core Performance Test Types in K6
1. Load Testing
Load testing simulates realistic user loads to measure system response. It helps you understand how your application behaves under normal and peak load conditions.
Example: Simulate 300 virtual users hitting the checkout API during peak evening traffic.
2. Stress Testing
Stress testing pushes the application beyond normal capacity to find breaking points. It determines the maximum load your system can handle before it fails.
3. Endurance Testing
Also called "soak testing," endurance testing runs the application under load for an extended period to identify memory leaks and resource degradation.
4. Spike Testing
Spike testing suddenly increases the load to extreme levels to test system behavior during unexpected traffic surges.
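The four test types above differ mainly in their load profile. The sketch below shows the kind of `stages` array a k6 script might use for each; the durations and VU targets are illustrative assumptions, not recommendations:

```javascript
// Illustrative k6 stage profiles; in a real k6 script you would assign
// one of these arrays to `options.stages`. All numbers are assumptions.
const profiles = {
  // Load: ramp to the normal peak, hold, ramp down
  load: [
    { duration: '5m', target: 100 },
    { duration: '30m', target: 100 },
    { duration: '5m', target: 0 },
  ],
  // Stress: keep stepping past expected capacity to find the breaking point
  stress: [
    { duration: '5m', target: 100 },
    { duration: '5m', target: 200 },
    { duration: '5m', target: 400 },
    { duration: '5m', target: 0 },
  ],
  // Endurance (soak): moderate load held for hours to expose leaks
  soak: [
    { duration: '5m', target: 80 },
    { duration: '4h', target: 80 },
    { duration: '5m', target: 0 },
  ],
  // Spike: jump to extreme load almost instantly, then recover
  spike: [
    { duration: '30s', target: 20 },
    { duration: '10s', target: 500 },
    { duration: '2m', target: 500 },
    { duration: '30s', target: 0 },
  ],
};
```

The shape of the array is the whole difference: same script body, different traffic curve.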
Setting Up K6 in Minutes
Installation
```shell
# Windows (winget)
winget install k6

# Verify
k6 version
```
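On other platforms, two common options are Homebrew and the official Docker image (the Docker invocation pipes the script on stdin, following the `grafana/k6` image's documented usage; `script.js` is a placeholder filename):

```shell
# macOS (Homebrew)
brew install k6

# Any platform with Docker: run a script without installing k6 locally
docker run --rm -i grafana/k6 run - <script.js
```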
A Basic K6 API Test
The script below (saved as smoke-test.js) ramps to 50 virtual users and validates response status and latency, with a clear threshold-driven pass/fail outcome.
```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  stages: [
    { duration: '1m', target: 50 },
    { duration: '3m', target: 50 },
    { duration: '1m', target: 0 },
  ],
  thresholds: {
    http_req_failed: ['rate<0.01'],
    http_req_duration: ['p(95)<500'],
  },
};

export default function () {
  const res = http.get('https://api.example.com/health');
  check(res, {
    'status is 200': (r) => r.status === 200,
    'response time under 500ms': (r) => r.timings.duration < 500,
  });
  sleep(1);
}
```
Run the Test
```shell
k6 run smoke-test.js
```
Designing Reliable K6 Scenarios
1. Start with Business-Critical Endpoints
Target login, checkout, search, payment, and any endpoint directly tied to user revenue or retention.
2. Use Realistic Traffic Models
- Ramp users gradually instead of immediate spikes
- Include think time between requests
- Use mixed workflows (read-heavy + write-heavy)
- Feed test data from CSV/JSON to avoid cache-only behavior
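In k6, data feeding is typically done with a `SharedArray` (from `k6/data`) loaded once per test, with each virtual user picking its own row via the built-in `__VU` id. The plain-JavaScript sketch below stubs both so the selection logic stands alone; the user records are made-up sample data:

```javascript
// Sketch of per-VU test-data selection. In a real k6 script the array
// would come from SharedArray + open()-ing a JSON/CSV file, and __VU
// is provided by the runtime; both are stubbed here for illustration.
const users = [
  { email: 'user1@example.com', password: 'hunter2' },
  { email: 'user2@example.com', password: 'hunter2' },
  { email: 'user3@example.com', password: 'hunter2' },
];

// Deterministic pick: each virtual user gets its own credentials, so
// requests are not all served from the same cache entry.
function pickUser(vuId) {
  return users[(vuId - 1) % users.length];
}

const __VU = 2; // stubbed; k6 sets this per virtual user
const creds = pickUser(__VU);
// creds.email and creds.password would feed the login request payload
```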
3. Track the Right Metrics
- p95 / p99 latency: Reflects real user experience better than averages
- Error rate: Should remain below agreed SLO threshold
- Throughput: Requests/sec at stable latency
- Saturation signals: CPU, DB pool, memory, queue depth
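Why p95 over the mean: a handful of slow outliers barely moves the average but is exactly what tail percentiles surface. A quick plain-JavaScript illustration with made-up latency samples:

```javascript
// Hypothetical latency samples in ms: mostly fast, two slow outliers.
const latencies = [
  120, 130, 110, 125, 140, 115, 135, 120, 130, 125,
  118, 122, 128, 132, 1900, 2100,
];

// Nearest-rank percentile: value below which p% of samples fall.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const idx = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[idx];
}

const mean = latencies.reduce((s, v) => s + v, 0) / latencies.length;
const p95 = percentile(latencies, 95);
// The mean stays in the mid-hundreds; p95 lands on the outliers,
// which is what your slowest real users actually experience.
console.log(`mean=${mean.toFixed(0)}ms p95=${p95}ms`);
```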
K6 in CI/CD: Quality Gate Example
Performance tests should run as part of your release pipeline, not as an afterthought. Thresholds make this easy because K6 exits with a failed status when limits are exceeded.
```shell
# Example pipeline step
k6 run --summary-export=summary.json perf-regression.js
```
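If you need a custom gate on top of k6's own threshold exit codes, a small Node script can inspect the exported summary. The JSON shape below (metrics keyed by name, latency percentiles under a `p(95)` field, rate metrics under `value`) matches what `--summary-export` has produced in recent k6 versions, but treat the exact keys as an assumption and verify them against your k6 version; the sample numbers are invented:

```javascript
// Minimal sketch of a custom CI gate over k6's --summary-export output.
// In CI you would JSON.parse(fs.readFileSync('summary.json', 'utf8'));
// a sample object with the assumed shape is inlined here instead.
const summary = {
  metrics: {
    http_req_duration: { avg: 180.2, 'p(95)': 320.5, 'p(99)': 610.0 },
    http_req_failed: { value: 0.004 }, // failure rate as a fraction
  },
};

const P95_LIMIT_MS = 500;
const MAX_ERROR_RATE = 0.01;

const p95 = summary.metrics.http_req_duration['p(95)'];
const errRate = summary.metrics.http_req_failed.value;
const passed = p95 < P95_LIMIT_MS && errRate < MAX_ERROR_RATE;

if (!passed) {
  console.error(`Perf gate failed: p95=${p95}ms errRate=${errRate}`);
  process.exit(1); // non-zero exit fails the pipeline step
}
console.log('Perf gate passed');
```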
Suggested CI Strategy
- PR checks: 2-5 minute smoke performance test
- Nightly: Longer regression suite with higher load
- Pre-release: Stress and endurance scenarios
- Post-deploy: Lightweight synthetic checks in production-safe mode
K6 vs Traditional Tools
| Area | K6 | Traditional GUI Tools |
|---|---|---|
| Test Authoring | JavaScript, code-review friendly | GUI-heavy, harder to diff and review |
| Version Control | Excellent | Moderate |
| CI Integration | Native and simple | Often plugin or custom setup |
| Learning Curve | Low for JS teams | Varies by tool |
| Best Fit | Modern DevOps workflows | Enterprise legacy ecosystems |
Common Mistakes to Avoid
- Running load tests against shared unstable environments
- Using only average latency and ignoring percentiles
- Skipping test data variation, causing unrealistic caching effects
- Not correlating app metrics with infrastructure metrics
- Treating one successful run as final proof of performance
Real-World Example: Checkout API Gate
For a fintech checkout flow, we set these non-negotiable release gates:
- p95 latency: under 400ms
- Error rate: below 1%
- Availability under peak: stable at 300 virtual users
- No resource saturation: DB and app pool remain within safe limits
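The first three gates map directly onto k6 options; a sketch of the configuration block we might export from the checkout test (limits taken from the gates above, stage durations are illustrative assumptions):

```javascript
// k6 options encoding the checkout release gates: p95 under 400ms,
// error rate below 1%, peak held at 300 virtual users. In a real k6
// script this object is exported from the test file.
const options = {
  thresholds: {
    http_req_duration: ['p(95)<400'],
    http_req_failed: ['rate<0.01'],
  },
  stages: [
    { duration: '2m', target: 300 }, // ramp to the 300-VU peak
    { duration: '10m', target: 300 }, // hold at peak
    { duration: '2m', target: 0 },
  ],
};
```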
With K6 thresholds in CI, any commit that violates these goals gets blocked before release. That one change made performance quality visible to the entire team.
Conclusion
K6 is a practical, modern option for teams that want performance testing to behave like code quality: measurable, repeatable, and automated. Start small with one critical endpoint, define clear thresholds, and gradually expand into stress and endurance suites.