Non-provisional patent filed January 31, 2026 · Issuance expected early July 2026
A Rust quant engine that produces bit-exact, reproducible risk outputs on the f64 CPU gold path — with a cryptographic SHA256 receipt on every run. The same determinism thesis is being extended onto GPUs.
LuxiEdge is a deterministic, receipt-attested quant risk engine. Its f64 CPU path produces bit-exact, reproducible results across CPUs (compiled with -fp-contract=off), and every evaluation carries a cryptographic SHA256 receipt. The audited risk pipeline is built on f64 linear algebra — Kahan-compensated covariance and matmul for portfolio variance — and deterministic normcdf/exp primitives for Black–Scholes-style Greeks. Written in memory-safe Rust with a stateless REST interface, it is designed for desks where reproducibility and auditability matter more than peak throughput.
The same determinism thesis is advancing onto accelerators. The quant engine already accelerates covariance and matmul on a wgpu GPU path, with Greeks held on deterministic CPU primitives. In parallel, the luxi-jit expression compiler proves the thesis on NVIDIA GPUs: bit-exact across Ampere (A100) and Hopper (H200) on rms_norm (0 ULP at both tested sizes), with other operations bounded by a disclosed sub-2-ULP envelope. luxi-jit is the consolidation path for GPU-scale work — proven in isolation, integrating next.
Audited risk receipts, today. Kahan-compensated matmul, Welford sample covariance (--online-cov), and deterministic normcdf/exp for Black–Scholes-style Greeks. Bit-exact across CPUs when compiled with -fp-contract=off. Every run produces a SHA256 receipt over the canonical output vector.
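As a rough illustration of the Welford-style online covariance mentioned above, the sketch below updates a running co-moment one observation at a time. The type and method names (`OnlineCov`, `push`, `sample_cov`) are hypothetical and not taken from the LuxiEdge source; how the engine structures this internally is not stated here.

```rust
/// Single-pass sample covariance of two series (Welford-style update).
/// Illustrative only: names and layout are assumptions, not the engine's API.
#[derive(Default)]
struct OnlineCov {
    n: u64,
    mean_x: f64,
    mean_y: f64,
    /// Running co-moment: sum of (x - mean_x) * (y - mean_y).
    c: f64,
}

impl OnlineCov {
    /// Fold one (x, y) observation into the running statistics.
    fn push(&mut self, x: f64, y: f64) {
        self.n += 1;
        let n = self.n as f64;
        let dx = x - self.mean_x; // deviation from the OLD mean of x
        self.mean_x += dx / n;
        self.mean_y += (y - self.mean_y) / n;
        self.c += dx * (y - self.mean_y); // pairs with the NEW mean of y
    }

    /// Unbiased sample covariance; None until two observations are seen.
    fn sample_cov(&self) -> Option<f64> {
        if self.n < 2 {
            None
        } else {
            Some(self.c / (self.n - 1) as f64)
        }
    }
}

/// Convenience wrapper: covariance of two equal-length slices.
fn covariance(xs: &[f64], ys: &[f64]) -> Option<f64> {
    if xs.len() != ys.len() {
        return None;
    }
    let mut acc = OnlineCov::default();
    for (x, y) in xs.iter().zip(ys) {
        acc.push(*x, *y);
    }
    acc.sample_cov()
}
```

Because the update order is fixed by the order of the input stream, the same inputs in the same order yield the same sequence of floating-point operations, which is the property a reproducible receipt depends on.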
The --gpu flag accelerates covariance and matmul via wgpu. Greeks remain on deterministic CPU primitives. The wgpu GPU path is reproducible on the same hardware but is not bit-exact with the f64 CPU oracle; this is disclosed when the flag is enabled.
luxi-jit extends the same determinism thesis onto NVIDIA GPUs via Triton. 12 preset expressions are verified: gelu, silu, softplus, tanh, cross_entropy, rms_norm, and others. Receipt model: the CPU f32 oracle is authoritative; GPU drift is bounded and disclosed in response fields.
Cross-architecture proof point: rms_norm achieves 0 ULP bit-exact on both A100 SXM4 80GB (Ampere SM 8.0) and H200 (Hopper SM 9.0), at small and 64K sizes. Artifacts available in results/determinism_results.json (H200) and results/determinism_results_a100.json (A100) in the luxi-jit repository. Consolidation with the risk pipeline is the documented next step.
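For readers unfamiliar with ULP comparisons, here is a minimal sketch of how a drift figure such as "0 ULP" or "within 2 ULP" can be measured between a CPU f32 oracle vector and a GPU result vector. The function names are hypothetical, and this is not necessarily how the luxi-jit test harness computes its numbers.

```rust
/// Map an f32 to an integer key whose ordering matches the float ordering.
/// Both +0.0 and -0.0 map to 0.
fn ordered_key(x: f32) -> i64 {
    let bits = x.to_bits();
    if bits & 0x8000_0000 != 0 {
        // Negative floats: a larger magnitude means a smaller value.
        -((bits & 0x7FFF_FFFF) as i64)
    } else {
        bits as i64
    }
}

/// Number of representable f32 steps between a and b; None if either is NaN.
fn ulp_distance(a: f32, b: f32) -> Option<u64> {
    if a.is_nan() || b.is_nan() {
        return None;
    }
    Some((ordered_key(a) - ordered_key(b)).unsigned_abs())
}

/// Worst-case ULP drift across two equal-length vectors.
/// Returns None on a length mismatch or if any element is NaN.
fn max_ulp_drift(oracle: &[f32], gpu: &[f32]) -> Option<u64> {
    if oracle.len() != gpu.len() {
        return None;
    }
    let mut worst = 0u64;
    for (o, g) in oracle.iter().zip(gpu) {
        worst = worst.max(ulp_distance(*o, *g)?);
    }
    Some(worst)
}
```

Note that this measure treats +0.0 and -0.0 as 0 ULP apart; a strict bit-exact check would compare `to_bits()` values directly instead.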
Every risk pipeline evaluation produces a SHA256 receipt over the canonical f64 output vector. Receipts are deterministic across CPUs compiled with -fp-contract=off, enabling reproducibility verification and regulatory audit trails.
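A receipt over an f64 vector is only stable if the bytes fed to the hash are fixed regardless of platform. The sketch below shows one possible canonical encoding: a length prefix followed by each value's IEEE 754 bit pattern in little-endian order. This layout is an assumption for illustration; the actual LuxiEdge canonical format is not described here. The hashing step itself is omitted because the Rust standard library does not include SHA-256.

```rust
/// Hypothetical canonical encoding of an output vector:
/// an 8-byte little-endian element count, then each f64 as its
/// raw IEEE 754 bit pattern in little-endian byte order.
/// The resulting bytes would then be passed to a SHA-256 implementation.
fn canonical_bytes(output: &[f64]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(8 + output.len() * 8);
    buf.extend_from_slice(&(output.len() as u64).to_le_bytes());
    for v in output {
        // to_bits() preserves the exact bit pattern, including the sign of zero.
        buf.extend_from_slice(&v.to_bits().to_le_bytes());
    }
    buf
}
```

Encoding raw bit patterns rather than formatted decimal strings means two outputs hash identically only when they are bit-for-bit equal, so 0.0 and -0.0 would produce different receipts under this scheme.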
Kahan-compensated matmul and Welford online covariance provide stable, reproducible portfolio variance estimates. Deterministic normcdf and exp primitives underpin Black–Scholes-style Greeks. Receipt hot path: linear algebra + normcdf + exp + scalar sqrt.
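To make the compensated-summation idea concrete, the following sketch computes a portfolio variance w^T Σ w using Kahan-compensated dot products. It is a minimal illustration under assumed names (`Kahan`, `kahan_dot`, `portfolio_variance`) and a row-major covariance layout; it is not the engine's implementation.

```rust
/// Kahan (compensated) summation accumulator.
#[derive(Default, Clone, Copy)]
struct Kahan {
    sum: f64,
    /// Running estimate of the low-order bits lost so far.
    comp: f64,
}

impl Kahan {
    fn add(&mut self, x: f64) {
        let y = x - self.comp; // re-inject previously lost bits
        let t = self.sum + y; // low-order bits of y may be lost here
        self.comp = (t - self.sum) - y; // recover what was lost
        self.sum = t;
    }
}

/// Dot product with a compensated accumulator, summed in index order.
fn kahan_dot(a: &[f64], b: &[f64]) -> f64 {
    assert_eq!(a.len(), b.len());
    let mut acc = Kahan::default();
    for (x, y) in a.iter().zip(b) {
        acc.add(x * y);
    }
    acc.sum
}

/// Portfolio variance w^T * sigma * w, with sigma stored row-major (n x n).
fn portfolio_variance(w: &[f64], sigma: &[f64]) -> f64 {
    let n = w.len();
    assert_eq!(sigma.len(), n * n);
    // First sigma * w, one compensated dot product per row.
    let sw: Vec<f64> = (0..n)
        .map(|i| kahan_dot(&sigma[i * n..(i + 1) * n], w))
        .collect();
    // Then w . (sigma * w).
    kahan_dot(w, &sw)
}
```

The fixed, sequential accumulation order matters as much as the compensation term: reordering the additions (for example in a parallel reduction) can change the rounded result even when each partial sum is compensated.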
Stateless REST API. The /evaluate endpoint can route to the luxi-jit companion on :10000 when backend=auto|triton|flashinfer. When deterministic=true, the CPU oracle remains the receipt authority — the hash covers oracle output, not raw GPU output.
| Metric | Result | Notes |
|---|---|---|
| Oracle SHA256 (CPU f32) | 12 / 12 | All 12 preset expressions produce stable CPU oracle receipts across runs. |
| GPU bit-exact (vs. oracle) | 5 / 12 | 5 expressions match the CPU oracle at 0 ULP on GPU. The remaining 7 are bounded at ≤2 ULP; drift is disclosed in response fields. |
| rms_norm — cross-architecture | 0 ULP | Bit-exact on A100 SXM4 80GB (Ampere SM 8.0) and H200 (Hopper SM 9.0), both small and 64K input sizes. Headline cross-architecture proof point. |
| Kernel-only speed (GELU, 4M, H200) | 9,455× | Kernel-only benchmark vs. CPU; excludes the receipt-overhead wrapper and is not a full-pipeline figure. |
deterministic=true

    # Submit a risk evaluation; receipt hash covers CPU oracle output
    POST /evaluate
    {
      "expression": "normcdf(x)",
      "inputs": [0.5, 1.0, -0.25],
      "deterministic": true   // CPU oracle path; SHA256 receipt attached
    }

    # Route to luxi-jit GPU companion (receipt hashes oracle, not raw GPU output)
    POST /evaluate
    {
      "expression": "gelu(x)",
      "inputs": [...],
      "backend": "triton",
      "deterministic": true
    }
Bit-exact across CPUs when compiled with -fp-contract=off; every run is receipt-attested via SHA256.
Designed for quant desks, model risk teams, and safety-critical deployments where reproducibility and auditability take precedence over peak throughput — edge, scientific computing, defense, and high-trust inference.
All technical claims are traceable to code and artifacts in the luxi-quant-engine and luxi-jit repositories. Both repositories are currently private; full source available under NDA. Ongoing work follows the roadmap documented in LUXIEDGE_BUILD_ROADMAP.md.
GPU acceleration is proven in isolation and integrating next. No claim is made that the risk pipeline currently runs on luxi-jit Triton GPUs.
Pre-built binaries for Mac ARM64 and Linux x86_64 are available on GitHub. Download, run, and see the deterministic receipt output for yourself — no build toolchain required.
Eric Waller · Proprietary technology · Full source available under NDA