Bit-exact.
Deterministic.
Transformer compute.

A Rust + CUDA technology stack for reproducible floating-point math and memory-efficient transformer attention.

The Problem

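Floating-point arithmetic, the foundation of modern AI and scientific computing, can produce slightly different results on different hardware, compilers, or optimization levels. The root cause is that floating-point addition is not associative: any change in the order of operations (vectorization, thread scheduling, fused instructions) can change the low-order bits of the result. Even the same code can return different answers across machines, making results non-reproducible and difficult to verify.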
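The order dependence can be seen in a few lines of plain Rust. This is a minimal illustration, not part of the Luxi kernels; the function names `sum_forward` and `sum_reverse` are made up for this example.

```rust
/// Sums left to right: ((0 + x0) + x1) + x2 ...
fn sum_forward(xs: &[f32]) -> f32 {
    xs.iter().fold(0.0_f32, |acc, &x| acc + x)
}

/// Sums the same values right to left.
fn sum_reverse(xs: &[f32]) -> f32 {
    xs.iter().rev().fold(0.0_f32, |acc, &x| acc + x)
}

/// Returns the two sums of [1e8, -1e8, 1.0] as (forward, reverse).
/// Forward: (1e8 - 1e8) + 1 keeps the 1.
/// Reverse: (1 - 1e8) rounds back to -1e8 in f32, so the 1 is lost.
fn order_dependence_demo() -> (f32, f32) {
    let xs = [1.0e8_f32, -1.0e8, 1.0];
    (sum_forward(&xs), sum_reverse(&xs))
}
```

The inputs are identical in both calls; only the summation order differs. A parallel reduction on a GPU changes that order depending on how work is split, which is one common way two runs of the same model come to disagree.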

In transformer models, standard attention is a major bottleneck at long context: for a sequence of length N it materializes an N × N score matrix, so it requires O(N²) memory and many round-trips to GPU memory, making long-context inference extremely expensive and power-hungry.
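To put the O(N²) term in concrete numbers, here is a small arithmetic sketch. It assumes one attention head and a fixed element size, and it ignores batch size and head count, both of which multiply the totals. The function names are illustrative only.

```rust
/// Bytes needed to hold the full N x N attention score matrix for one head.
fn score_matrix_bytes(seq_len: u64, bytes_per_element: u64) -> u64 {
    seq_len * seq_len * bytes_per_element
}

/// Bytes needed to hold a single length-N row of scores, which is what a
/// row-at-a-time (streaming) formulation keeps live instead of the full matrix.
fn score_row_bytes(seq_len: u64, bytes_per_element: u64) -> u64 {
    seq_len * bytes_per_element
}
```

With 2-byte elements, `score_matrix_bytes(4_096, 2)` is 33,554,432 bytes (32 MiB) and `score_matrix_bytes(32_768, 2)` is 2,147,483,648 bytes (2 GiB): an 8x longer context costs 64x the memory for the score matrix alone.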

These issues create real barriers for edge deployment, defense, scientific workloads, and any application that requires trust, efficiency, and reproducibility.

Luxi Deterministic Kernels

Production-ready today.
Bit-exact low-level mathematical primitives including matmul, batch matmul, RoPE, RMSNorm, LayerNorm, GELU, SiLU, quantization (INT8/INT4/Q4_0), and more. Guarantees bit-identical results across any hardware, verifiable by comparing SHA-256 hashes of the outputs. Extremely small footprint.
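The general pattern behind hash-verified reproducibility can be sketched as follows: fix the order of every floating-point operation, then hash the raw output bits so two machines can compare a short digest instead of whole tensors. This is a hypothetical illustration, not the Luxi API. The names `rms_norm` and `fingerprint` are invented here, and a 64-bit FNV-1a hash stands in for SHA-256 only because the Rust standard library has no SHA-256 implementation.

```rust
/// RMSNorm with a fixed, sequential accumulation order.
/// Assumption: IEEE 754 f32 arithmetic with no compiler reassociation,
/// so the same inputs give the same bits on every run.
fn rms_norm(x: &[f32], gain: &[f32], eps: f32) -> Vec<f32> {
    let mut sum_sq = 0.0_f32;
    for &v in x {
        sum_sq += v * v; // always accumulated in index order
    }
    let inv_rms = 1.0 / (sum_sq / x.len() as f32 + eps).sqrt();
    x.iter().zip(gain).map(|(&v, &g)| v * inv_rms * g).collect()
}

/// FNV-1a (64-bit) over the little-endian bit patterns of the outputs.
/// Stand-in for SHA-256; hashing bits rather than printed decimals means
/// a single differing low-order bit changes the digest.
fn fingerprint(values: &[f32]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for v in values {
        for byte in v.to_bits().to_le_bytes() {
            hash ^= u64::from(byte);
            hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
        }
    }
    hash
}
```

In use, each machine would compute `fingerprint(&rms_norm(&x, &gain, eps))` and publish the digest. Matching digests indicate bit-identical outputs; any mismatch, even in one low-order bit, shows up as a different value.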

Geodesic Attention Engine (GAE)

Fused Waller Kernel for transformer attention. Significantly reduces HBM round-trips and memory usage, with O(N) memory scaling in sequence length demonstrated in benchmarks. Maintains mathematical equivalence to standard softmax attention. In advanced development and testing.
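For readers unfamiliar with how attention can avoid the N × N matrix while staying equivalent to softmax, the sketch below shows the publicly known online-softmax (streaming) formulation for a single query on the CPU. It is not the Fused Waller Kernel, whose source is not public, and every name in it is illustrative. It keeps only a running maximum, a running denominator, and one accumulator vector, so its working memory does not grow with the number of keys.

```rust
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Reference: materialize every score, then softmax, then the weighted sum.
fn attention_naive(q: &[f32], keys: &[Vec<f32>], values: &[Vec<f32>]) -> Vec<f32> {
    let scores: Vec<f32> = keys.iter().map(|k| dot(q, k)).collect();
    let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let weights: Vec<f32> = scores.iter().map(|&s| (s - max).exp()).collect();
    let denom: f32 = weights.iter().sum();
    let mut out = vec![0.0_f32; values[0].len()];
    for (&w, v) in weights.iter().zip(values) {
        for (o, &x) in out.iter_mut().zip(v) {
            *o += w / denom * x;
        }
    }
    out
}

/// Streaming: visit each key/value once, never storing the score vector.
fn attention_streaming(q: &[f32], keys: &[Vec<f32>], values: &[Vec<f32>]) -> Vec<f32> {
    let mut running_max = f32::NEG_INFINITY;
    let mut denom = 0.0_f32;
    let mut acc = vec![0.0_f32; values[0].len()];
    for (k, v) in keys.iter().zip(values) {
        let s = dot(q, k);
        let new_max = running_max.max(s);
        // Rescale what was accumulated under the old maximum.
        // On the first step this is exp(-inf) = 0, and acc/denom are 0 anyway.
        let rescale = (running_max - new_max).exp();
        let w = (s - new_max).exp();
        denom = denom * rescale + w;
        for (a, &x) in acc.iter_mut().zip(v) {
            *a = *a * rescale + w * x;
        }
        running_max = new_max;
    }
    acc.iter().map(|&a| a / denom).collect()
}
```

The two functions compute the same mathematical quantity, but they round at different points, so in plain floating point their outputs agree only to within a small tolerance. Getting bit-identical results additionally requires fixing the operation order, which is the concern the deterministic kernels address.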

Adiabatic Transform Engine (ATE)

Triangle Engine and Waller Null-Space Multiplexing (WNSM). Enables efficient cross-layer data transport using MLP null-space with very low overhead. Promising results in early testing. In active development.

Transparency & Current Status

The core Luxi deterministic kernels are fully functional and production-ready. The Geodesic Attention Engine and Adiabatic Transform Engine are held in private repositories. Both have shown strong results in correctness tests and benchmarks, but full production integration and additional features remain in active development.

Key Characteristics

  • 01 Bit-exact determinism: identical results across any hardware
  • 02 Ultra lightweight: smaller than a typical phone photo, minimal attack surface
  • 03 Energy efficient: reduced memory traffic and power consumption
  • 04 Simple integration: reroute math and attention calls with minimal workflow changes

Designed for workloads where reproducibility, long-context performance, and efficiency are essential — such as edge deployment, defense, scientific computing, and high-trust AI systems.

Third-party validated performance: TestFort QA Lab validation report.

Ready for confidential technical discussion

Contact: e@ewaller.com

Eric Waller • Full source available under NDA