Every number on this page comes from measured runs. No projections, no estimates. The model used is Wave Field V4 (1.49B parameters) unless otherwise noted.
Wave Field's O(N log N) attention means consumer GPUs can handle context lengths that would OOM a standard transformer. The speeds below were measured at 128K context.
| GPU | Price | VRAM | Max Context | Speed @128K |
|---|---|---|---|---|
| RTX 4000 Ada | $1,000 | 20 GB | 128K | 35K tok/s |
| RTX PRO 4500 BW | $1,500 | 32 GB | 256K | 64K tok/s |
| RTX 3090 | $1,500 | 24 GB | 256K | 66K tok/s |
| RTX 5090 | $2,000 | 32 GB | 256K | 157K tok/s |
| H100 | $30,000 | 80 GB | 512K | 183K tok/s |
| Standard transformer | any | any | OOM at 32K | — |
A standard transformer of the same size OOMs at 32K context on every consumer GPU in this table, regardless of VRAM.
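One convenient way to read the table is throughput per dollar at 128K context. A quick sketch, with the list prices and measured speeds copied straight from the rows above:

```python
# Tokens per second per dollar at 128K context.
# Prices and measured speeds are copied from the table above.
gpus = {
    "RTX 4000 Ada":    (1_000,  35_000),
    "RTX PRO 4500 BW": (1_500,  64_000),
    "RTX 3090":        (1_500,  66_000),
    "RTX 5090":        (2_000, 157_000),
    "H100":            (30_000, 183_000),
}

for name, (price, tok_s) in gpus.items():
    print(f"{name:<16} {tok_s / price:6.1f} tok/s per dollar")
```

By this measure the RTX 5090 is the standout (78.5 tok/s per dollar) and the H100 the worst value (6.1), though the H100 is the only card here that reaches 512K context.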
Evaluated on four standard tasks from the DCLM CORE suite. The 130M Wave Field model was trained on 32 billion tokens.
| Task | Accuracy |
|---|---|
| ARC Easy | 30.1% |
| HellaSwag | 27.6% |
| PIQA | 52.3% |
| WinoGrande | 49.1% |
Wave Field 130M scores 1.77× GPT-2 on the DCLM metric.
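For a rough cross-check of the table, here is the unweighted mean of the four accuracies. This is only an assumption about how the suite score is aggregated; the official DCLM CORE metric averages centered accuracies over a larger task set, so treat this as a sanity check, not the reported score:

```python
# Unweighted mean of the four task accuracies above.
# ASSUMPTION: a plain mean. The official DCLM CORE metric aggregates
# centered accuracies over a larger task set, so this is only a rough check.
scores = {"ARC Easy": 30.1, "HellaSwag": 27.6, "PIQA": 52.3, "WinoGrande": 49.1}
mean = sum(scores.values()) / len(scores)
print(f"mean accuracy: {mean:.3f}%")
```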
Head-to-head comparison at 32K context length, same hardware, same parameter count.
| Metric | Standard Transformer | Wave Field |
|---|---|---|
| Memory at 32K | 35.6 GB | 6.74 GB |
| Memory at 128K | OOM | 26.76 GB |
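The two Wave Field rows also let us check the scaling claim. Going from 32K to 128K context is 4× the tokens, so quadratic attention memory would grow roughly 16× (which is why the standard transformer column simply OOMs), while near-linear O(N log N) should grow about 4×. A small check on the measured numbers:

```python
# Context grows 4x from 32K to 128K. Quadratic attention memory would grow
# ~16x; near-linear O(N log N) should grow ~4x. Measured figures from the table:
wave_32k, wave_128k = 6.74, 26.76   # GB
growth = wave_128k / wave_32k
print(f"Wave Field memory grows {growth:.2f}x for 4x context")
```

The measured growth is about 3.97×, consistent with the near-linear claim.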
Attention state memory comparison. Standard transformer KV cache grows linearly with context length. Wave Field memory is fixed.
| Context Length | Standard KV Cache | Wave Field | Savings |
|---|---|---|---|
| 2K | 37 MB | 72 KB | 513× |
| 32K | 600 MB | 72 KB | 8,333× |
| 128K | 2.4 GB | 72 KB | 33,333× |
| 1M | 19 GB | 72 KB | 263,889× |
Wave Field memory is constant regardless of context length.
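The KV-cache column is internally consistent: cache size is bytes-per-token times context length, and the bytes-per-token figure implied by the 32K row reproduces the other rows to rounding. A sketch (the underlying per-token formula is noted as a comment; the exact model dimensions behind these figures are not stated on this page):

```python
# Standard KV cache is linear in context length:
#   size = bytes_per_token * context_length
# where, for a standard transformer,
#   bytes_per_token = 2 (K and V) * n_layers * d_model * bytes_per_element.
# Derive bytes/token from the 32K row of the table (600 MB):
bytes_per_token = 600e6 / 32_768    # ~18.3 KB per token
for label, ctx in [("2K", 2_048), ("32K", 32_768), ("128K", 131_072), ("1M", 1_048_576)]:
    gb = bytes_per_token * ctx / 1e9
    print(f"{label:>4}: {gb:6.2f} GB KV cache vs a fixed 72 KB wave state")
```

The 128K row lands exactly on the table's 2.4 GB; the 2K and 1M rows match the table's 37 MB and 19 GB to rounding.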
Wave parameters (180 total) stay in full FP32 precision. Everything else is quantized to INT8.
| Metric | Before | After |
|---|---|---|
| Model size | 529 MB | 171 MB |
| QA accuracy | 60% | 100% |
| Wave params precision | FP32 | FP32 |
| Everything else | FP32 | INT8 |
Quality IMPROVED after compression: QA accuracy rose from 60% to 100%.
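A minimal sketch of the mixed-precision scheme described above, assuming symmetric per-tensor INT8 quantization; the names and tensor shapes here are illustrative, not the actual codebase:

```python
import numpy as np

# Mixed-precision sketch: the 180 wave parameters stay FP32, everything
# else gets symmetric per-tensor INT8. Names and shapes are illustrative.
def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor INT8: scale maps the largest magnitude to 127."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
wave_params = rng.normal(size=180).astype(np.float32)        # kept in FP32
weights = rng.normal(size=(2048, 2048)).astype(np.float32)   # quantized

q, scale = quantize_int8(weights)
err = np.abs(dequantize(q, scale) - weights).max()
print(f"INT8 round-trip max error: {err:.4f} (wave params stay FP32)")
```

Note the table's compression ratio is about 3.1× (529 MB → 171 MB) rather than the 4× of a pure FP32→INT8 cast, which suggests some tensors besides the 180 wave parameters remain at higher precision; the page does not say which.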
Configuration and final evaluation metrics for the V4 model.
| Parameter | Value |
|---|---|
| Model | V4 (1.49B params) |
| Architecture | 24 layers, 16 heads, dim 2048 |
| Training data | 6.8B tokens ClimbMix |
| Hardware | 8×H100 SXM |
| Training time | 11.5 hours |
| Final perplexity | 8.6 |
| Final accuracy | 84.7% |
| Throughput at 32K | 164K tok/s |
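The table is internally consistent: dividing the training tokens by the wall-clock time reproduces the reported throughput, assuming the 164K tok/s figure is aggregate throughput across all eight GPUs:

```python
# Reported throughput vs (training tokens / wall-clock time).
tokens = 6.8e9             # ClimbMix training data, from the table
seconds = 11.5 * 3600      # 11.5 hours on 8xH100
print(f"{tokens / seconds:,.0f} tok/s")   # ~164K tok/s, matching the table
```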