
Benchmark Challenge: znap — Zero-Allocation Zig Neural Inference (scaffold)

Target claims (from the public PRE_LAUNCH_CHECKLIST.md and paper mirror): 31 ns for a linear policy (30→6); 8× compression with BitNet b1.58 ternary weights, with zero quality loss on the "same action" metric; 34-138× speedup vs NumPy/PyTorch equivalents on 30-dim workloads; and a zero-allocation hot path on Apple M1 (NEON) and x86_64 (AVX2/AVX-512). All claims are meant to be reproducible with the public model and Zig only.

Hardware class for baseline: Apple M1 (or M-series) 10-core, or an equivalent x86_64 CPU with AVX2 or newer. A consumer GPU is not required.

Reproduction (exact, repo-relative references):

Clone or extract the znap surface (public mirror at SMC17/znap, or the local equivalent at mac-mining/gh-iceberg/clones/znap after a clean extraction).
curl -L -O https://huggingface.co/SMC17/stories15M-zig/resolve/main/stories15M.bin
curl -L -O https://huggingface.co/SMC17/stories15M-zig/resolve/main/tokenizer.bin
zig build run -- stories15M.bin tokenizer.bin -t 1 -n 256
For micro-benches: build and run the SDOT / linear / DQN benches in the repo (see PRE_LAUNCH_CHECKLIST.md for 30-dim, 30→64→6 cases).
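
For the Python side of the comparison, a baseline harness could be sketched as below. This is a minimal sketch, not the repo's benchmark: the function name median_ns_per_call and the pure-Python 30→6 layer are hypothetical stand-ins, and a real baseline run would swap in the NumPy or PyTorch call being compared. Calls are timed in batches because a single call can be shorter than the timer's own overhead.

```python
import statistics
import time


def median_ns_per_call(fn, repeats=200, inner=1000):
    """Median wall-clock nanoseconds per call of fn, measured over batches."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        for _ in range(inner):
            fn()
        elapsed = time.perf_counter_ns() - start
        samples.append(elapsed / inner)  # amortize timer overhead over the batch
    return statistics.median(samples)


# Hypothetical stand-in workload: a 30 -> 6 linear layer in pure Python.
# Replace with the NumPy/PyTorch path for an actual baseline measurement.
W = [[0.01 * (i + j) for j in range(30)] for i in range(6)]
x = [0.1 * j for j in range(30)]


def linear_30_to_6():
    return [sum(w * v for w, v in zip(row, x)) for row in W]


baseline_ns = median_ns_per_call(linear_30_to_6, repeats=50, inner=200)
print(f"baseline median: {baseline_ns:.0f} ns/call")
# Ratio against the claimed 31 ns figure, not against a local znap measurement.
print(f"ratio vs claimed 31 ns: {baseline_ns / 31.0:.1f}x")
```

The speedup that counts for the challenge is the baseline median divided by the znap median measured on the same machine; the division by 31 ns here only illustrates the arithmetic.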

Success criteria (falsifiable):

Linear policy (30→6) reports ≤ 50 ns median on M1-class hardware at ReleaseFast (or equivalent documented ISA).
8× memory reduction on BitNet b1.58 matrices with action-match rate documented.
Speedup table vs PyTorch/NumPy baseline (same workload, single-thread where applicable) shows ≥ 30× on at least one micro-op.
No heap allocation in the inner loop (a run under Zig's GeneralPurposeAllocator with safety = true shows zero allocation events).
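
The memory and action-match criteria above can be illustrated with a small sketch. It assumes, since the source does not say, that the 8× figure compares 2-bit packed ternary weights against 16-bit floats (against 32-bit floats the same packing would give 16×), and it uses absmean ternary quantization as in BitNet b1.58. The weights and states are random placeholders, so the printed match rate shows how the metric is computed and says nothing about znap's actual models. All function names are hypothetical.

```python
import random


def ternarize(weights, eps=1e-8):
    """Absmean quantization: divide by mean |w|, round, clip to {-1, 0, 1}."""
    flat = [w for row in weights for w in row]
    scale = sum(abs(w) for w in flat) / len(flat) + eps
    q = [[max(-1, min(1, round(w / scale))) for w in row] for row in weights]
    return q, scale


def pack_2bit(tern):
    """Pack ternary values four per byte, 2 bits each (stored code = value + 1)."""
    flat = [t + 1 for row in tern for t in row]
    out = bytearray((len(flat) + 3) // 4)
    for i, code in enumerate(flat):
        out[i // 4] |= code << (2 * (i % 4))
    return bytes(out)


def unpack_2bit(packed, rows, cols):
    """Inverse of pack_2bit: recover the rows x cols ternary matrix."""
    flat = [((packed[i // 4] >> (2 * (i % 4))) & 0b11) - 1
            for i in range(rows * cols)]
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


def argmax_action(weights, x):
    """Index of the highest-scoring output for a bias-free linear policy."""
    scores = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)


rng = random.Random(0)
ROWS, COLS = 6, 30  # the 30 -> 6 linear policy shape
W = [[rng.gauss(0.0, 1.0) for _ in range(COLS)] for _ in range(ROWS)]
Wq, scale = ternarize(W)
packed = pack_2bit(Wq)

fp16_bytes = ROWS * COLS * 2          # assumed baseline: 16-bit floats
ratio = fp16_bytes / len(packed)      # 360 bytes / 45 bytes

# Action-match rate: fraction of states where both policies pick the same action.
# The positive scale factor does not change the argmax, so it is omitted here.
states = [[rng.gauss(0.0, 1.0) for _ in range(COLS)] for _ in range(1000)]
matches = sum(argmax_action(W, s) == argmax_action(Wq, s) for s in states)
match_rate = matches / len(states)
print(f"{fp16_bytes} B -> {len(packed)} B ({ratio:.1f}x), action match {match_rate:.1%}")
```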

Falsifier: the median exceeds 100 ns on the linear policy case on documented M1-class hardware, or the speedup is below 10× vs the equivalent PyTorch path on the same 30-dim workload, or allocation events appear in the hot path under the safety allocator.
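
The success and falsifier thresholds leave a gap (for example, a median between 50 ns and 100 ns satisfies neither). One way to turn measured numbers into a verdict is sketched below. The function name, parameter names, and the "inconclusive" label for the gap are assumptions of this sketch and are not defined by the challenge itself.

```python
def verdict(median_ns, best_micro_op_speedup, pytorch_speedup,
            memory_ratio, alloc_events):
    """Classify one run against the thresholds stated in this document."""
    # Falsifier: any single condition is enough.
    if median_ns > 100 or pytorch_speedup < 10 or alloc_events > 0:
        return "falsified"
    # Success: every criterion must hold.
    if (median_ns <= 50
            and best_micro_op_speedup >= 30
            and memory_ratio >= 8
            and alloc_events == 0):
        return "reproduced"
    # Neither falsified nor fully reproduced (label is this sketch's own).
    return "inconclusive"


print(verdict(31, 34, 34, 8, 0))    # reproduced
print(verdict(75, 34, 34, 8, 0))    # inconclusive
print(verdict(120, 34, 34, 8, 0))   # falsified
```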

Proof level of this challenge: scaffold (repro commands are drawn from the public audit checklist; an independent run is required before the level can be raised to "benchmarked").

Related public evidence in this stack:

blog/content/lab/zig-h3-pure-zig-vs-libh3.md (another pure-Zig win).
COMPETITIVE_LANDSCAPE.md (evidence graveyard section).

stax-experiment register --lane znap-challenge --hypothesis "My hardware reproduces or beats the 31 ns linear policy" --falsifier "Median >100 ns or speedup <10× on equivalent workload"

Run the challenge and publish the trace. The baseline this evidence targets is any non-Zig inference stack that cannot match the latency or allocation profile on equivalent silicon.

Extraction note: This document was produced as part of the clean harness + QM positioning surface (aac-launch/extract-harness-core.sh + blog canon). All paths are repository-relative.