We're building O(N log N) attention mechanisms that run on consumer GPUs. No quadratic bottleneck. No datacenter required.
Every large language model today uses self-attention: every token is compared against every other token. This is powerful, but it scales quadratically, O(N²). At 32K tokens, a single attention layer computes over a billion pairwise scores. At 128K, the attention matrix alone occupies roughly 32 GB in fp16 per head, more memory than most GPUs have in total.
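The back-of-envelope arithmetic behind those numbers (fp16 scores, one head, one layer):

```python
def attention_cost(n_tokens: int) -> tuple[int, float]:
    """Pairwise score count and fp16 attention-matrix size (GiB) for one head."""
    ops = n_tokens * n_tokens  # one score per token pair
    gib = 2 * ops / 2**30      # 2 bytes per fp16 score
    return ops, gib

print(attention_cost(32_768))   # ~1.07e9 scores, 2.0 GiB
print(attention_cost(131_072))  # ~1.72e10 scores, 32.0 GiB
```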
Wave Field replaces this with something fundamentally different. Tokens don't compare with each other at all. Instead, they deposit information onto a continuous field, and wave physics propagates that information via FFT convolution. Each attention head is a damped oscillator with three learnable parameters: frequency (ω), damping (α), and phase (φ).
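A minimal NumPy sketch of the mechanism. The function name, the per-head scalar parameters, and the exact kernel shape are my assumptions for illustration, not the project's actual implementation:

```python
import numpy as np

def wave_field_head(x, omega, alpha, phi):
    """One attention head as a damped oscillator (illustrative sketch).

    x: (N, d) array; each row is the information a token deposits on the field.
    omega, alpha, phi: the head's learnable frequency, damping, and phase.
    Returns the field read back at each position, shape (N, d).
    """
    N = x.shape[0]
    t = np.arange(N)
    # Damped-oscillator impulse response: e^{-alpha*t} * cos(omega*t + phi)
    kernel = np.exp(-alpha * t) * np.cos(omega * t + phi)
    # Zero-pad to 2N so circular FFT convolution equals causal linear convolution
    L = 2 * N
    K = np.fft.rfft(kernel, n=L)          # kernel spectrum, shape (L//2 + 1,)
    X = np.fft.rfft(x, n=L, axis=0)       # field spectrum per channel
    y = np.fft.irfft(X * K[:, None], n=L, axis=0)
    return y[:N]
```

The N×N score matrix never exists: the whole interaction is two forward FFTs and one inverse FFT of length 2N, which is where the O(N log N) cost comes from.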
The result is O(N log N) complexity — and it changes what's computationally possible on consumer hardware.
| GPU | VRAM | Max context (tokens) | Throughput |
|---|---|---|---|
| RTX 3090 | 24GB | 256K | 66K tok/s |
| RTX 5090 | 32GB | 256K | 157K tok/s |
| H100 | 80GB | 512K | 183K tok/s |
| Standard transformer | any | OOM at 32K | — |