This page gives a quick read of how Chappe (JVM and native) performs against major HTTP servers. Full numbers, run history, and detailed methodology are kept in BENCHMARKS.md.

TL;DR

On Linux x86_64 (12 cores, host network, Docker), with a wrk2 open-loop harness and p99 < 10 ms as the quality-of-service filter:

| Tier | Servers | Sustained | p99 |
|---|---|---|---|
| Top | nginx · Jetty 12 · Netty 4 | 200k req/s | 2.3 – 5.1 ms |
| Mid | chappe-jvm · chappe-native · Helidon SE 4 · JDK · Go | 100k req/s | 2.5 – 3.0 ms |
| Bottom | Vert.x 4 (defaults) | < 100k | high tail |

Pragmatic read: Chappe (JVM and native) sustains 100,000 req/s with p99 < 3 ms, which covers the vast majority of production HTTP workloads. At or below that load, Chappe is on par with Helidon SE 4 (same virtual-threads architecture). The native build has a 37 MB image / 3 MiB idle RSS footprint, versus 389 MB / 38 MiB on the JVM.

Headline numbers (canonical run 2026-05-18, unleashed mode)

| Service | Image (MB) | RSS idle | Sustained @ p99<10ms (req/s) | p50 | p99 |
|---|---|---|---|---|---|
| nginx 1.27 | 46.0 | 87 MiB | 200,000 | 0.94 ms | 2.35 ms |
| Jetty 12.0.21 | 389.3 | 100 MiB | 200,000 | 1.03 ms | 2.46 ms |
| Netty 4.2 | 389.3 | 62 MiB | 200,000 | 0.92 ms | 5.07 ms |
| chappe-jvm | 389.3 | 38 MiB | 100,000 | 1.13 ms | 2.46 ms |
| Helidon SE 4 | 389.3 | 57 MiB | 100,000 | 1.16 ms | 2.54 ms |
| JDK HttpServer 25 | 389.3 | 39 MiB | 100,000 | 1.20 ms | 2.74 ms |
| chappe-native | 37.1 | 3 MiB | 100,000 | 1.18 ms | 2.96 ms |
| Go 1.24 net/http | 7.3 | 1.7 MiB | 100,000 | 1.16 ms | 2.73 ms |
| Vert.x 4.5 | 389.3 | 80 MiB | 0 (p99 > 100 ms at 100k) | | |

  • Conditions: wrk2 4 threads / 100 connections / 30 s, fresh Docker container per rate, network_mode: host, payload GET / → "ok" (2 bytes, text/plain).

  • The p99 < 10 ms filter retains the highest rate at which the p99 latency stays below 10 ms — that is the quality of service seen by an actual HTTP client.

  • Progressive warmup attempt (10 s @ 100k, then measurement @ 200k): this does not help Chappe reach 200k @ p99 < 10 ms. The p99 stays around 210 ms even with a fresh container and careful warmup. The limit is architectural (Loom model: 1 VT per connection; see the JFR profile in BENCHMARKS.md).

Why is Chappe in mid-tier and not top-tier?

JFR profiling (60 s under 200k req/s load, full profile):

  • GC ruled out: 19 G1New pauses, max 2.83 ms, total 34 ms over 80 s.

  • Identified culprit: 140 ThreadParks on ForkJoinPool.awaitWork, 10-32 ms each.

  • Chappe model = 1 virtual thread per connection. Under TCP bursts at 200k req/s, Loom carriers oscillate between runWorker and awaitWork, accumulating micro-stalls that degrade the latency tail.

  • Jetty / nginx / Netty / Helidon Níma use a hybrid NIO event loop + thread pool / virtual thread model: their loop never sleeps (epoll_wait blocks in the kernel, not in the scheduler), so there is no wake-up step in the chain.
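To make the "1 virtual thread per connection" model concrete, here is a minimal sketch of such a server loop. It is not Chappe's code: the class name, method names, and port 8080 are assumptions chosen for illustration, and it requires Java 21 or later. Each accepted socket gets its own virtual thread, which blocks on reads and is rescheduled onto a carrier when data arrives.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

/** Illustrative sketch only, not Chappe's code: one virtual thread per accepted connection. */
final class VtPerConnectionSketch {

    // Same shape as the benchmark payload: GET / -> "ok" (2 bytes, text/plain).
    static final byte[] RESPONSE =
            "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 2\r\n\r\nok"
                    .getBytes(StandardCharsets.US_ASCII);

    /** Accept loop: every connection is handed to its own virtual thread (Java 21+). */
    static void serve(ServerSocket server) throws IOException {
        while (true) {
            Socket socket = server.accept(); // throws once the server socket is closed
            Thread.ofVirtual().start(() -> handle(socket));
        }
    }

    /** Blocking keep-alive loop; a blocking read parks the virtual thread and frees its carrier. */
    static void handle(Socket socket) {
        try (socket) {
            InputStream in = new BufferedInputStream(socket.getInputStream());
            OutputStream out = socket.getOutputStream();
            while (skipRequestHead(in)) {
                out.write(RESPONSE);
                out.flush();
            }
        } catch (IOException ignored) {
            // Peer reset or closed the connection; try-with-resources closes the socket.
        }
    }

    /** Consumes one request head up to the blank line (CR LF CR LF); false on end of stream. */
    static boolean skipRequestHead(InputStream in) throws IOException {
        int matched = 0; // number of consecutive bytes of "\r\n\r\n" seen so far
        int b;
        while ((b = in.read()) != -1) {
            boolean expectCr = matched % 2 == 0;
            if (b == '\r') {
                matched = expectCr ? matched + 1 : 1;
            } else if (b == '\n' && !expectCr) {
                matched++;
            } else {
                matched = 0;
            }
            if (matched == 4) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // Port 8080 is an arbitrary choice for this sketch.
        try (ServerSocket server = new ServerSocket(8080)) {
            serve(server);
        }
    }
}
```

In this shape every request arrival has to wake a parked virtual thread through the scheduler, which is the wake-up step the bullets above contrast with a kernel-blocked event loop.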

Considered optimization (not delivered): optional "selector loop" mode in Chappe (NIO event loop multiplexing + virtual-thread handler) — the Helidon Níma approach. Estimated effort ★★★.
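The ThreadPark finding from the JFR bullets in this section can be re-checked on any recording with the JDK's `jdk.jfr.consumer` API. The sketch below counts `jdk.ThreadPark` events above a duration threshold whose stack contains a given method. The class name `ParkStalls`, the 10 ms threshold, and the recording file passed on the command line are assumptions for illustration; the actual profiling suite lives in BENCHMARKS.md and chappe-bench and may work differently.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.time.Duration;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordedFrame;
import jdk.jfr.consumer.RecordedStackTrace;
import jdk.jfr.consumer.RecordingFile;

/** Illustrative JFR post-processing; names and thresholds are assumptions, not the project's tooling. */
final class ParkStalls {

    /** Counts jdk.ThreadPark events lasting at least minDuration whose stack contains the method. */
    static long count(Path jfrFile, Duration minDuration, String className, String methodName)
            throws IOException {
        long hits = 0;
        try (RecordingFile recording = new RecordingFile(jfrFile)) {
            while (recording.hasMoreEvents()) {
                RecordedEvent event = recording.readEvent();
                if (!"jdk.ThreadPark".equals(event.getEventType().getName())) {
                    continue;
                }
                if (event.getDuration().compareTo(minDuration) < 0) {
                    continue;
                }
                if (stackContains(event.getStackTrace(), className, methodName)) {
                    hits++;
                }
            }
        }
        return hits;
    }

    private static boolean stackContains(RecordedStackTrace trace, String className, String methodName) {
        if (trace == null) {
            return false; // stack traces can be disabled for the event
        }
        for (RecordedFrame frame : trace.getFrames()) {
            if (frame.isJavaFrame()
                    && frame.getMethod().getName().equals(methodName)
                    && frame.getMethod().getType().getName().equals(className)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a .jfr file captured under load (file name is up to the caller).
        long n = count(Path.of(args[0]), Duration.ofMillis(10),
                "java.util.concurrent.ForkJoinPool", "awaitWork");
        System.out.println("ThreadPark >= 10 ms in ForkJoinPool.awaitWork: " + n);
    }
}
```

The count only reflects parks that the recording's settings actually captured, so the event threshold and stack-trace settings used at capture time affect the result.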

Chappe native vs JVM (GraalVM CE 25)

| Metric | chappe-native | chappe-jvm | Native delta |
|---|---|---|---|
| Docker image | 37 MB | 389 MB | −90 % (×10.5) |
| Idle RSS | 3 MiB | 38 MiB | −92 % (×12) |
| Sustained throughput | 100,000 req/s | 100,000 req/s | identical |
| p50 @ 100k | 1.18 ms | 1.13 ms | +0.05 ms |
| p99 @ 100k | 2.96 ms | 2.46 ms | +0.50 ms |
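The footprint deltas in the table above are plain ratios. As a worked check, the sketch below recomputes them from the figures published on this page (image sizes 37.1 MB and 389.3 MB from the headline numbers, RSS 3 MiB and 38 MiB). The class and method names are made up for illustration.

```java
import java.util.Locale;

/** Worked arithmetic for the "Native delta" column; helper names are illustrative. */
final class FootprintDelta {

    /** Relative reduction of {@code small} versus {@code large}, in percent. */
    static double reductionPercent(double small, double large) {
        return (1.0 - small / large) * 100.0;
    }

    /** How many times larger {@code large} is than {@code small}. */
    static double ratio(double small, double large) {
        return large / small;
    }

    public static void main(String[] args) {
        // Inputs are the rounded values shown on this page.
        System.out.printf(Locale.ROOT, "image: -%.1f %% (x%.1f)%n",
                reductionPercent(37.1, 389.3), ratio(37.1, 389.3));
        System.out.printf(Locale.ROOT, "rss:   -%.1f %% (x%.1f)%n",
                reductionPercent(3.0, 38.0), ratio(3.0, 38.0));
    }
}
```

With the rounded inputs the image figures match the table (about 90 % smaller, about ×10.5). The RSS ratio comes out closer to ×12.7 than ×12, a gap that may simply reflect rounding of the published RSS readings.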

Binary built with GraalVM Community Edition 25, -march=compatibility, in StaticExecutableWithDynamicLibC mode (mostly static binary that links only glibc dynamically). Runtime image: gcr.io/distroless/cc-debian12:nonroot. The build uses neither PGO nor -march=native, so additional optimization headroom remains.

Methodology

  • Bench host: Linux x86_64, 12 cores, 32 GiB RAM, Docker engine 29.4.3 (machine macuntu, context macuntutailscale).

  • Client: cylab/wrk2 (fork of Gil Tene’s wrk2), 4 threads / 100 connections, 30 s measurement, 5 s warmup, coordinated omission correction (HdrHistogram).

  • Unleashed mode: network_mode: host (bypasses Docker bridge), no cpuset (12 cores available per container), fresh container per rate (eliminates inter-rate coupling).

  • Acceptance filter: highest rate at which p99 < 10 ms.
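The acceptance filter in the last bullet amounts to a one-line selection over the per-rate results. A minimal sketch follows, assuming the results are available as a map from target rate to measured p99; the class name, method name, and data layout are hypothetical and not taken from the harness.

```java
import java.util.Map;
import java.util.OptionalInt;
import java.util.TreeMap;

/** Illustrative only: picks the highest tested rate whose p99 stays under a threshold. */
final class AcceptanceFilter {

    /**
     * @param p99ByRate   measured p99 latency in milliseconds, keyed by target rate (req/s)
     * @param thresholdMs quality-of-service limit, 10 ms on this page
     * @return the highest rate with p99 strictly below the threshold, or empty if none passes
     */
    static OptionalInt sustainedRate(Map<Integer, Double> p99ByRate, double thresholdMs) {
        return p99ByRate.entrySet().stream()
                .filter(e -> e.getValue() < thresholdMs)
                .mapToInt(Map.Entry::getKey)
                .max();
    }

    public static void main(String[] args) {
        // Chappe figures quoted on this page: 2.46 ms at 100k, about 210 ms at 200k.
        Map<Integer, Double> chappeJvm = new TreeMap<>(Map.of(100_000, 2.46, 200_000, 210.0));
        System.out.println(sustainedRate(chappeJvm, 10.0));
    }
}
```

An empty result corresponds to the "0" entry reported for a server that misses the threshold at every tested rate. This sketch takes a plain maximum; a stricter variant could also require every lower rate to pass.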

The BENCHMARKS.md report also contains closed-loop in-process measurements (2026-04-16, 2026-04-23, re-run 2026-05-18) that report 275k req/s for Chappe at 16 threads. Those numbers measure raw physical capacity (client running in the same JVM, no latency constraint) — useful for tracking before/after a Chappe optimization, but not comparable to external servers nor to user-perceived performance. The table above remains the canonical reference.

Reproducibility

    # Canonical reference: unleashed multi-runtime shootout
    ./chappe-bench/docker/run-remote.sh shootout-unleashed

    # Bridge mode (CPU pinning 0-3 servers / 4-7 client, reproducible isolation)
    ./chappe-bench/docker/run-remote.sh shootout

    # Without native build (if GraalVM 25 unavailable)
    ./chappe-bench/docker/run-remote.sh shootout-jvm-only

For the full Dockerfiles, harness and JFR profiling suite, see the chappe-bench/docker/shootout folder and the complete BENCHMARKS.md report.