EthicsComplianceVerifier
Overview
EthicsComplianceVerifier.circom is a zk-SNARK circuit that proves an AI agent’s training and evaluation meet predefined ethical requirements (bias limits, fairness scores, and safety thresholds) without revealing any private metrics, data, or intermediate test results.
It is designed to plug directly into the ZKAgentVerificationOrchestrator pipeline but can operate as an independent verifier in any zero-knowledge AI-audit workflow.
Objectives
The circuit enables a prover to demonstrate that:
Bias: No bias-test score exceeds max_bias_threshold.
Fairness: Every fairness-metric score is at least min_fairness_score.
Safety: The total count of harmful-content detections does not exceed max_harmful_rate.
All three conditions are cryptographically linked to a prior ethics commitment and the agent identifier.
Inputs
Private Inputs
bias_test_results[n_bias_checks] (bias tests): integer scores 0-1000 per bias check
fairness_scores[n_ethics_tests] (fairness tests): integer scores 0-1000 per fairness evaluation
harmful_content_flags[n_ethics_tests] (safety flags): 0 = safe, 1 = harmful detected
ethics_training_data_hash (dataset hash): Poseidon hash of the ethics-specific training set
red_team_results[n_ethics_tests] (red-team data): placeholder, currently unused in constraints
Public Inputs
max_bias_threshold: upper bound for any bias score
min_fairness_score: lower bound for any fairness score
max_harmful_rate: maximum allowed total harmful detections
ethics_commitment_hash: commitment published before training/validation
agent_id: unique identifier for the agent
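As an illustration of how the inputs above might be assembled for witness generation, here is a minimal Python sketch that range-checks the values and builds the contents of input.json. The helper name build_input, the sample values, the array sizes (n_bias_checks = 3, n_ethics_tests = 4), and the encoding of red_team_results as integers are all assumptions made for this example. Only the signal names and the 0-1000 and 0/1 ranges come from the lists above.

```python
import json

SCORE_MAX = 1000  # documented score range is 0-1000

def build_input(bias, fairness, flags, red_team, data_hash,
                max_bias, min_fairness, max_harmful, commitment, agent_id):
    """Check the documented ranges, then return the dict to write as input.json."""
    if not (len(fairness) == len(flags) == len(red_team)):
        raise ValueError("fairness, flags and red_team need n_ethics_tests entries each")
    for score in [*bias, *fairness]:
        if not 0 <= score <= SCORE_MAX:
            raise ValueError(f"score {score} is outside 0-{SCORE_MAX}")
    if any(flag not in (0, 1) for flag in flags):
        raise ValueError("harmful_content_flags entries must be 0 or 1")
    return {
        # Private inputs
        "bias_test_results": list(bias),
        "fairness_scores": list(fairness),
        "harmful_content_flags": list(flags),
        "ethics_training_data_hash": str(data_hash),
        "red_team_results": list(red_team),
        # Public inputs
        "max_bias_threshold": max_bias,
        "min_fairness_score": min_fairness,
        "max_harmful_rate": max_harmful,
        "ethics_commitment_hash": str(commitment),
        "agent_id": str(agent_id),
    }

# Hypothetical values for n_bias_checks = 3 and n_ethics_tests = 4.
example = build_input(
    bias=[120, 80, 95], fairness=[910, 870, 905, 880], flags=[0, 0, 1, 0],
    red_team=[0, 0, 0, 0], data_hash=123456789,
    max_bias=150, min_fairness=850, max_harmful=1,
    commitment=987654321, agent_id=42,
)
input_json = json.dumps(example, indent=2)  # text to save as input.json
```

Large field elements such as hashes are written as decimal strings here so they are not truncated by JSON number handling. Validating ranges before proving is a design choice of this sketch, not a documented feature of the circuit.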
Public Outputs
ethics_verified: 1 if all ethics criteria pass; otherwise 0
bias_compliance: 1 if all bias tests satisfy the threshold
fairness_compliance: 1 if all fairness tests satisfy the threshold
safety_compliance: 1 if harmful detections ≤ max_harmful_rate
ethics_proof_hash: Poseidon hash binding compliance results to commitments
Circuit Logic
Bias Verification
For each bias score, LessEqThan evaluates bias_test_results[i] ≤ max_bias_threshold as a 0/1 flag. The product of all flag outputs yields bias_compliance.
Fairness Verification
For each fairness score, GreaterEqThan evaluates fairness_scores[i] ≥ min_fairness_score as a 0/1 flag. The product of all flag outputs yields fairness_compliance.
Safety Verification
HarmfulContentCounter sums the Boolean flags harmful_content_flags[i]. LessEqThan checks that the total ≤ max_harmful_rate, producing safety_compliance.
Aggregate Result
ethics_verified = bias_compliance · fairness_compliance · safety_compliance
This is implemented as two quadratic constraints, since an R1CS constraint allows only one multiplication.
Proof Hash
ethics_proof_hash = Poseidon( agent_id, bias_compliance, safety_compliance, ethics_commitment_hash, ethics_training_data_hash )
Used by higher-level contracts to reference this proof succinctly.
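The checks described in this section can be modelled outside the circuit. The following Python sketch mirrors the flag-and-product logic so that expected outputs can be computed before proving. It is a reference model under assumptions, not the circuit itself: the function name compliance_flags and the sample values are hypothetical, and the Poseidon proof hash is omitted because it needs a field-arithmetic library.

```python
from math import prod

def compliance_flags(bias, fairness, flags, max_bias, min_fairness, max_harmful):
    """Plain-integer model of the compliance outputs described in Circuit Logic."""
    # One 0/1 flag per comparison; the product is 1 only if every flag is 1.
    bias_ok = prod(int(score <= max_bias) for score in bias)
    fairness_ok = prod(int(score >= min_fairness) for score in fairness)
    # Sum the Boolean flags, then compare the total against the limit.
    safety_ok = int(sum(flags) <= max_harmful)
    # A three-way product needs an intermediate value, because each R1CS
    # constraint can hold a single multiplication.
    partial = bias_ok * fairness_ok        # first quadratic constraint
    ethics_verified = partial * safety_ok  # second quadratic constraint
    return {
        "ethics_verified": ethics_verified,
        "bias_compliance": bias_ok,
        "fairness_compliance": fairness_ok,
        "safety_compliance": safety_ok,
    }

# Hypothetical passing case: all scores within bounds, one harmful detection allowed.
passing = compliance_flags([120, 80, 95], [910, 870, 905, 880], [0, 0, 1, 0],
                           max_bias=150, min_fairness=850, max_harmful=1)
# Hypothetical failing case: one bias score above the threshold.
failing = compliance_flags([120, 200, 95], [910, 870, 905, 880], [0, 0, 1, 0],
                           max_bias=150, min_fairness=850, max_harmful=1)
```

In the failing case only bias_compliance and ethics_verified drop to 0, while the other two flags stay at 1.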
Compilation
# prerequisites: circom 2.x, snarkjs, pot16_final.ptau
circom EthicsComplianceVerifier.circom \
--r1cs --wasm --sym --c
snarkjs groth16 setup EthicsComplianceVerifier.r1cs \
pot16_final.ptau \
ecv_final.zkey
snarkjs zkey export verificationkey \
ecv_final.zkey \
ecv_verification_key.json
Proof Generation Example
node EthicsComplianceVerifier_js/generate_witness.js \
EthicsComplianceVerifier_js/EthicsComplianceVerifier.wasm \
input.json \
witness.wtns
snarkjs groth16 prove ecv_final.zkey \
witness.wtns \
proof.json public.json
snarkjs groth16 verify ecv_verification_key.json \
public.json proof.json
public.json contains the five public outputs, followed by the public inputs, ready for on-chain submission.
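To read the result of a proof run, the public signals can be labelled by position. The Python sketch below assumes that public.json is a JSON array of decimal strings, with the circuit outputs first and the public inputs after them, and that both groups appear in the order listed in this document. The actual order depends on how the signals are declared in the circuit, so check it against the .sym file before relying on it. The helper name label_public_signals and the sample values are hypothetical.

```python
import json

# Assumed signal order: outputs first, then public inputs (see lead-in).
OUTPUT_NAMES = ["ethics_verified", "bias_compliance", "fairness_compliance",
                "safety_compliance", "ethics_proof_hash"]
INPUT_NAMES = ["max_bias_threshold", "min_fairness_score", "max_harmful_rate",
               "ethics_commitment_hash", "agent_id"]

def label_public_signals(signals):
    """Map a list of decimal strings onto signal names by position."""
    names = OUTPUT_NAMES + INPUT_NAMES
    if len(signals) != len(names):
        raise ValueError(f"expected {len(names)} public signals, got {len(signals)}")
    return {name: int(value) for name, value in zip(names, signals)}

# Hypothetical file contents; real hashes are much larger field elements.
sample = json.loads('["1", "1", "1", "1", "123456789", "150", "850", "1", "987654321", "42"]')
labelled = label_public_signals(sample)
passed = labelled["ethics_verified"] == 1
```

Python integers have arbitrary precision, so converting full-size field elements with int() does not lose digits.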
Security
Private metrics remain local; only binary pass/fail and hashes are public.
Poseidon hashing maintains circuit efficiency on BN128.
If a single test fails, ethics_verified collapses to 0, preventing partial disclosure attacks.
The trusted-setup ceremony must be secured or replaced by an MPC ceremony.
Status
Stable for integer bias, fairness, and safety metrics.
Planned extensions: incorporate red_team_results constraints; allow dynamic weighting of fairness scores; add differential-privacy proofs for training data.
Last updated