
training on consumer GPUs

Consumer GPU training and long context — conceptual illustration

2026-03-29 · note

By Badaramoni Avinash

Standard self-attention materializes an n×n score matrix, so its memory use grows quadratically with context length, and long-context training can exhaust memory on consumer GPUs. Wave Field uses FFT-based attention, whose cost in sequence length scales on the order of n log n rather than n², which makes long-context training more feasible on smaller cards. Throughput tables and hardware comparisons are not published on this site.
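Wave Field's exact attention mechanism is not specified in this note, so as a rough illustration only, here is an FNet-style sketch of FFT-based token mixing in NumPy. The function name `fft_mix` and the shapes are assumptions for the example, not Wave Field's implementation; the point is that no n×n attention matrix is ever allocated.

```python
import numpy as np

def fft_mix(x):
    # FNet-style mixing (illustrative, not Wave Field's method):
    # apply a 2D FFT over the sequence and feature axes and keep
    # the real part. The FFT costs O(n log n) in sequence length,
    # and only O(n * d) memory is needed for the buffers, versus
    # the O(n^2) score matrix of pairwise self-attention.
    return np.fft.fft2(x).real

# Hypothetical long-context shape for a small card.
seq_len, d_model = 4096, 256
x = np.random.randn(seq_len, d_model).astype(np.float32)
y = fft_mix(x)
assert y.shape == (seq_len, d_model)  # same shape out, no n x n buffer
```

At seq_len = 4096, a float32 attention score matrix alone would be 4096 × 4096 × 4 bytes ≈ 64 MB per head per layer, which is the term the FFT approach avoids.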

See GitHub for benchmarks.