cuzk Proving Daemon
This page explains how to set up the cuzk persistent GPU SNARK proving daemon to accelerate proof computation in Curio.
Experimental feature: this feature is currently experimental and under active development. Configuration, behavior, and interfaces may change without notice.
What is cuzk?
cuzk is a persistent GPU-resident SNARK proving daemon. It acts as a "proving server" that Curio delegates proof computations to over gRPC.
The key difference from the default proving path (ffiselect) is that cuzk loads Groth16 SRS parameters once at startup and keeps them resident in CUDA-pinned host memory across all proofs. The default Curio code path spawns a fresh child process per proof, each of which loads the SRS from disk (30-90 seconds for 32 GiB PoRep), runs one proof, and exits. cuzk eliminates this repeated loading overhead entirely.
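To put the reload cost in perspective, here is a rough back-of-the-envelope calculation. It uses only the 30-90 second load time quoted above; the batch size of 100 proofs is an arbitrary illustration, not a measurement.

```python
# Rough estimate of the time spent reloading SRS parameters when every proof
# runs in a fresh child process (the default ffiselect path).
# The 30-90 s range is the figure quoted above for 32 GiB PoRep;
# the proof count of 100 is an arbitrary example value.

def reload_overhead_minutes(num_proofs: int, load_seconds: float) -> float:
    """Total minutes spent loading the SRS across num_proofs proofs."""
    return num_proofs * load_seconds / 60.0


if __name__ == "__main__":
    for load_seconds in (30, 90):
        minutes = reload_overhead_minutes(100, load_seconds)
        print(f"100 proofs at {load_seconds} s per load: {minutes:.0f} min of loading")
```

With a resident daemon, this cost is paid once at startup instead of once per proof.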
Supported proof types
| Proof type | Curio task | Description |
| --- | --- | --- |
| PoRep C2 | PoRep | Seal commit phase 2 SNARK |
| SnapDeals Prove | UpdateProve | CC sector update proof |
| PSProve | PSProve | Snark Market proof share compute |
How integration works
When cuzk is enabled in Curio's configuration:
Resource accounting bypassed: TypeDetails() reports zero GPU and minimal RAM for proving tasks. Curio's harmony scheduler no longer gates these tasks on local GPU availability.
Backpressure via polling: CanAccept() queries the cuzk daemon's queue via GetStatus and rejects tasks when the queue is full (controlled by MaxPending).
Vanilla proofs stay local: The Do() method generates vanilla proofs locally (requires sector data on disk), sends them to cuzk for SNARK computation, then verifies the returned proof locally.
When cuzk is not configured (default), all three tasks behave exactly as before. There is no behavioral change for existing deployments.
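The backpressure rule can be illustrated with a minimal sketch. Curio itself is not written in Python, so this only models the decision logic. QueueStatus and can_accept are hypothetical names, and the exact meaning of "full" (here: the pending count has reached MaxPending) is an assumption, since this page does not say whether in-progress proofs also count.

```python
from dataclasses import dataclass


@dataclass
class QueueStatus:
    """Hypothetical view of what a GetStatus reply might carry for one proof type."""
    pending: int
    in_progress: int


def can_accept(status: QueueStatus, max_pending: int) -> bool:
    """Accept a new task only while the daemon's queue has room.

    Assumption: the queue is "full" once the pending count reaches
    max_pending; in-progress proofs are ignored in this sketch.
    """
    return status.pending < max_pending


if __name__ == "__main__":
    # One proof queued, limit of two: there is still room.
    print(can_accept(QueueStatus(pending=1, in_progress=1), max_pending=2))
    # Two proofs queued, limit of two: the task is rejected.
    print(can_accept(QueueStatus(pending=2, in_progress=1), max_pending=2))
```

A rejected task is not lost; the scheduler simply does not start it on this node until a later poll finds room in the queue.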
Requirements
NVIDIA GPU with CUDA support (the cuzk daemon itself runs on the GPU machine)
CUDA toolkit (nvcc must be in PATH)
Rust toolchain (1.86.0 or later; managed automatically via rust-toolchain.toml)
Filecoin proof parameters downloaded (same parameters as standard Curio proving)
Sufficient system RAM for SRS residency (minimum 128 GiB, recommended 256+ GiB)
Building
From the Curio repository root, run make cuzk.
The make cuzk target:
Checks for cargo (Rust) and nvcc (CUDA) in PATH
Runs cargo build --release in extern/cuzk/
Copies the resulting binary to ./cuzk
To install:
Note: make cuzk is intentionally not part of make build or make buildall since it requires CUDA and Rust, which are not available in all build environments (e.g., CI).
Daemon Configuration
The cuzk daemon reads its configuration from a TOML file (default: /data/zk/cuzk.toml). An example configuration is provided at extern/cuzk/cuzk.example.toml.
Minimal configuration
RAM-based tuning
The primary tuning knob is partition_workers, which controls how many PoRep partitions are synthesized concurrently on the CPU. More workers keep the GPU fed but use more RAM.
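| System RAM | partition_workers | | Peak RSS | Proof time |
| --- | --- | --- | --- | --- |
| 128 GiB | 2 | 1 | ~110 GiB | ~152 s/proof |
| 256 GiB | 7 | 1 | ~208 GiB | ~53 s/proof |
| 384 GiB | 10 | 2 | ~271 GiB | ~43 s/proof |
| 512+ GiB | 12 | 2 | ~400 GiB | ~38 s/proof |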
Memory formula: Peak RSS = 69 GiB + (partition_workers × 20 GiB)
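As a worked example, the formula can be evaluated for a few worker counts, and inverted to find the largest worker count that fits a given memory budget. This is a sketch of the arithmetic only; the function names are illustrative. The measured values in the table differ slightly from the estimate, and the 512+ GiB row (~400 GiB) is well above what the formula gives for 12 workers, so treat the result as an approximation and leave headroom.

```python
BASE_GIB = 69         # fixed term in the formula above
PER_WORKER_GIB = 20   # per-partition-worker term in the formula above


def peak_rss_gib(partition_workers: int) -> int:
    """Estimated peak RSS in GiB for a given partition_workers setting."""
    return BASE_GIB + partition_workers * PER_WORKER_GIB


def max_workers_for_budget(budget_gib: int) -> int:
    """Largest worker count whose estimated peak RSS fits in budget_gib.

    Returns 0 when even the fixed term does not fit.
    """
    if budget_gib < BASE_GIB:
        return 0
    return (budget_gib - BASE_GIB) // PER_WORKER_GIB


if __name__ == "__main__":
    for workers in (2, 7, 10, 12):
        print(f"partition_workers={workers}: ~{peak_rss_gib(workers)} GiB estimated")
    print("upper bound for a 128 GiB budget:", max_workers_for_budget(128))
```

Note that max_workers_for_budget gives an upper bound with no headroom for the operating system or other processes, which is one reason the table recommends fewer workers than the bound on larger machines.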
Running the daemon
For production, run cuzk as a systemd service:
Curio Configuration
Add the following to your Curio configuration layer to connect to the cuzk daemon:
When Address is empty (the default), cuzk integration is disabled and all proving tasks use the standard local GPU path.
Which Curio subsystems are affected
The cuzk client is used by tasks on nodes that have these subsystems enabled:
EnablePoRepProof = true -- PoRep C2 proving
EnableUpdateProve = true -- SnapDeals update proving
EnableProofShare = true -- Snark Market proof computation
These subsystems must still be enabled as usual. The [Cuzk] configuration only changes how the SNARK computation is performed (local GPU vs. remote daemon).
Deployment Patterns
Co-located (single machine)
Run both Curio and cuzk on the same GPU machine. Use TCP localhost or a Unix socket:
This is the simplest deployment and avoids any network overhead.
Dedicated prover (separate machines)
Run Curio on CPU-only machines for sealing tasks (SDR, TreeD, etc.) and cuzk on a dedicated GPU machine. Curio connects over the network:
Note: vanilla proof data (up to ~200 MB for PoRep C2) is sent over gRPC, so ensure sufficient network bandwidth between the nodes.
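For a rough sense of what "sufficient bandwidth" means, the sketch below estimates the idealized transfer time of a 200 MB payload at a few link speeds. It ignores gRPC framing, TCP overhead, and latency, and the link speeds are example values rather than recommendations.

```python
def transfer_seconds(payload_mb: float, link_mbit_per_s: float) -> float:
    """Idealized transfer time: payload in megabytes, link speed in megabits per second."""
    return payload_mb * 8 / link_mbit_per_s


if __name__ == "__main__":
    # 200 MB is the approximate upper bound quoted above for PoRep C2.
    for label, speed in (("100 Mbit/s", 100), ("1 Gbit/s", 1000), ("10 Gbit/s", 10000)):
        print(f"{label}: {transfer_seconds(200, speed):.2f} s")
```

Compared with proof times of tens of seconds, the transfer is a small fraction of the total on a 1 Gbit/s or faster link, but it becomes noticeable on slower links.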
Monitoring
The cuzk daemon exposes its status via the gRPC GetStatus RPC. Curio queries this automatically for backpressure. You can also query it manually:
This returns the current queue state for each proof type (pending count, in-progress count).
Troubleshooting
cuzk build fails
Verify nvcc is in PATH: nvcc --version
Verify Rust toolchain: rustup show (should show 1.86.0 or later)
The Cargo workspace in extern/cuzk/ depends on vendored forks of bellperson, bellpepper-core, and supraseal-c2 under extern/. If you see missing crate errors, ensure git submodule update --init --recursive was run.
Curio cannot connect to cuzk
Check that the daemon is running: systemctl status cuzk
Verify the address matches between Curio's [Cuzk].Address and the daemon's [daemon].listen
Check firewall rules if using TCP across machines
Look at Curio logs for cuzk entries: journalctl -u curio | grep cuzk
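When the daemon listens on TCP, a quick reachability check from the Curio machine can separate network or firewall problems from configuration mismatches. The sketch below only tests whether a TCP connection can be opened; it does not speak gRPC or confirm that the listener is actually cuzk. The host and port shown are hypothetical placeholders.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Hypothetical address; use the host and port from your [Cuzk].Address.
    print(tcp_reachable("127.0.0.1", 9820))
```

If this returns False from the Curio machine but True on the daemon machine itself, the listen address or a firewall rule is the likely cause.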
Proofs are slow or timing out
Increase ProveTimeout in Curio's config if proofs legitimately take longer
Check daemon logs for queue depth. If proofs pile up, reduce MaxPending or add more GPU capacity
Tune partition_workers based on the RAM table above. Too many workers can cause memory pressure; too few starve the GPU
Curio tasks rejected (backpressure)
If Curio logs show "cuzk pipeline full, backpressuring", the daemon's queue is at capacity. Either:
Increase MaxPending (allows more queued proofs, uses more memory)
Add GPU capacity (second GPU, second daemon instance)
Reduce the rate of incoming sealing/snap work