# EthicsComplianceVerifier

### Overview

`EthicsComplianceVerifier.circom` is a zk-SNARK circuit that proves an AI agent’s training and evaluation meet predefined ethical requirements (bias limits, fairness scores, and safety thresholds) **without revealing** any private metrics, data, or intermediate test results.\
It is designed to plug directly into the `ZKAgentVerificationOrchestrator` pipeline but can operate as an independent verifier in any zero-knowledge AI-audit workflow.

***

### Objectives

The circuit enables a prover to demonstrate that:

1. **Bias**: Every bias-test score does not exceed `max_bias_threshold`.
2. **Fairness**: Every fairness-metric score is at least `min_fairness_score`.
3. **Safety**: The total number of harmful-content detections does not exceed `max_harmful_rate` (an absolute count, despite the name).
4. All three conditions are cryptographically linked to a prior **ethics commitment** and the agent identifier.

***

### Inputs

#### Private Inputs

| Group          | Name                                    | Description                                    |
| -------------- | --------------------------------------- | ---------------------------------------------- |
| Bias tests     | `bias_test_results[n_bias_checks]`      | Integer scores 0-1000 per bias check           |
| Fairness tests | `fairness_scores[n_ethics_tests]`       | Integer scores 0-1000 per fairness evaluation  |
| Safety flags   | `harmful_content_flags[n_ethics_tests]` | `0` = safe, `1` = harmful detected             |
| Dataset hash   | `ethics_training_data_hash`             | Poseidon hash of ethics-specific training set  |
| Red-team data  | `red_team_results[n_ethics_tests]`      | (placeholder, currently unused in constraints) |

#### Public Inputs

| Name                     | Description                                     |
| ------------------------ | ----------------------------------------------- |
| `max_bias_threshold`     | Upper bound for any bias score                  |
| `min_fairness_score`     | Lower bound for any fairness score              |
| `max_harmful_rate`       | Maximum allowed total harmful detections        |
| `ethics_commitment_hash` | Commitment published before training/validation |
| `agent_id`               | Unique identifier for the agent                 |
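
The two tables above map directly onto the `input.json` file consumed by the witness generator. The following is a minimal Python sketch of how such a file could be assembled and sanity-checked before proving. The array sizes (`N_BIAS_CHECKS`, `N_ETHICS_TESTS`), the helper name `build_input`, and all sample values (including the two hash placeholders) are assumptions made for illustration; the real sizes are whatever the circuit's main component is instantiated with.

```python
import json

# Hypothetical template parameters; the real values are fixed where the
# circuit's main component is instantiated.
N_BIAS_CHECKS = 3
N_ETHICS_TESTS = 4


def build_input(bias, fairness, harmful, red_team, data_hash,
                max_bias, min_fairness, max_harmful, commitment, agent_id):
    """Validate the documented ranges and return a dict shaped like input.json."""
    if len(bias) != N_BIAS_CHECKS:
        raise ValueError("bias_test_results has the wrong length")
    per_test = {
        "fairness_scores": fairness,
        "harmful_content_flags": harmful,
        "red_team_results": red_team,
    }
    for name, values in per_test.items():
        if len(values) != N_ETHICS_TESTS:
            raise ValueError(f"{name} has the wrong length")
    if any(not 0 <= s <= 1000 for s in [*bias, *fairness]):
        raise ValueError("scores must lie in 0..1000")
    if any(f not in (0, 1) for f in harmful):
        raise ValueError("harmful_content_flags must be 0 or 1")

    def dec(x):
        # Decimal strings keep large field elements (hashes) exact in JSON.
        return str(int(x))

    return {
        # Private inputs
        "bias_test_results": [dec(s) for s in bias],
        "fairness_scores": [dec(s) for s in fairness],
        "harmful_content_flags": [dec(f) for f in harmful],
        "ethics_training_data_hash": dec(data_hash),
        "red_team_results": [dec(r) for r in red_team],
        # Public inputs
        "max_bias_threshold": dec(max_bias),
        "min_fairness_score": dec(min_fairness),
        "max_harmful_rate": dec(max_harmful),
        "ethics_commitment_hash": dec(commitment),
        "agent_id": dec(agent_id),
    }


example = build_input(
    bias=[100, 250, 300],
    fairness=[800, 900, 750, 820],
    harmful=[0, 0, 1, 0],
    red_team=[0, 0, 0, 0],   # unused by the constraints, but still a declared input
    data_hash=12345,         # placeholder, not a real Poseidon digest
    max_bias=300,
    min_fairness=700,
    max_harmful=1,
    commitment=67890,        # placeholder, not a real commitment
    agent_id=42,
)
print(json.dumps(example, indent=2))
```

Checking ranges outside the circuit only catches mistakes early; it does not replace any range constraints the circuit itself applies.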

#### Public Outputs

| Output                | Meaning                                                 |
| --------------------- | ------------------------------------------------------- |
| `ethics_verified`     | `1` if **all** ethics criteria pass; otherwise `0`      |
| `bias_compliance`     | `1` if all bias tests satisfy threshold                 |
| `fairness_compliance` | `1` if all fairness tests satisfy threshold             |
| `safety_compliance`   | `1` if harmful detections ≤ `max_harmful_rate`          |
| `ethics_proof_hash`   | Poseidon hash binding compliance results to commitments |

***

### Circuit Logic

1. **Bias Verification**\
   For each bias score, a `LessEqThan` comparator outputs a 0/1 flag for\
   `bias_test_results[i] ≤ max_bias_threshold`.\
   The product of all flags yields `bias_compliance`.
2. **Fairness Verification**\
   For each fairness score, a `GreaterEqThan` comparator outputs a 0/1 flag for\
   `fairness_scores[i] ≥ min_fairness_score`.\
   The product of all flags yields `fairness_compliance`.
3. **Safety Verification**\
   `HarmfulContentCounter` sums the Boolean flags\
   `harmful_content_flags[i]`.\
   A `LessEqThan` comparator checks that the total is ≤ `max_harmful_rate` and outputs `safety_compliance`.
4. **Aggregate Result**

   ```
   ethics_verified = bias_compliance · fairness_compliance · safety_compliance
   ```

   Because a single R1CS constraint can contain only one multiplication, the triple product is implemented as two quadratic constraints.
5. **Proof Hash**

   ```
   ethics_proof_hash = Poseidon(
       agent_id,
       bias_compliance,
       safety_compliance,
       ethics_commitment_hash,
       ethics_training_data_hash
   )
   ```

   Used by higher-level contracts to reference this proof succinctly.
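
Steps 1 to 4 can be mirrored in plain Python as a reference model, which is handy for predicting the expected outputs of a given input before generating a proof. This is a sketch of the arithmetic described above, not the circuit source: the function names are hypothetical, and the proof hash of step 5 is left out because it needs a Poseidon implementation compatible with the circuit.

```python
def flag_le(a, b):
    """0/1 flag for a <= b, mirroring a comparator that outputs a flag."""
    return 1 if a <= b else 0


def flag_ge(a, b):
    """0/1 flag for a >= b."""
    return 1 if a >= b else 0


def product(flags):
    """Multiply 0/1 flags one at a time; the result is 1 only if all are 1."""
    acc = 1
    for f in flags:
        acc *= f
    return acc


def evaluate(bias, fairness, harmful, max_bias, min_fairness, max_harmful):
    """Reference model of steps 1-4 (the proof hash is not computed)."""
    bias_compliance = product(flag_le(s, max_bias) for s in bias)
    fairness_compliance = product(flag_ge(s, min_fairness) for s in fairness)
    harmful_total = sum(harmful)
    safety_compliance = flag_le(harmful_total, max_harmful)
    # Two separate multiplications, matching the two quadratic constraints.
    partial = bias_compliance * fairness_compliance
    ethics_verified = partial * safety_compliance
    return {
        "ethics_verified": ethics_verified,
        "bias_compliance": bias_compliance,
        "fairness_compliance": fairness_compliance,
        "safety_compliance": safety_compliance,
    }


passing = evaluate([100, 250, 300], [800, 900, 750, 820], [0, 0, 1, 0],
                   max_bias=300, min_fairness=700, max_harmful=1)
failing = evaluate([100, 250, 301], [800, 900, 750, 820], [0, 0, 1, 0],
                   max_bias=300, min_fairness=700, max_harmful=1)
print(passing["ethics_verified"], failing["ethics_verified"])  # prints "1 0"
```

In the failing case only one bias score exceeds the threshold, yet `bias_compliance` and therefore `ethics_verified` both drop to `0`, while the fairness and safety flags stay at `1`.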

***

### Compilation

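```bash
# prerequisites: circom 2.x, snarkjs, pot16_final.ptau
# Compile to R1CS, a WASM witness generator, a symbols file, and C++ witness code.
circom EthicsComplianceVerifier.circom \
      --r1cs --wasm --sym --c

# Circuit-specific Groth16 setup. For production, add contributions with
# `snarkjs zkey contribute` before treating the key as final.
snarkjs groth16 setup EthicsComplianceVerifier.r1cs \
                 pot16_final.ptau \
                 ecv_final.zkey

# Export the verification key used by verifiers.
snarkjs zkey export verificationkey \
                 ecv_final.zkey \
                 ecv_verification_key.json
```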

***

### Proof Generation Example

```bash
# 1. Compute the witness from input.json.
node EthicsComplianceVerifier_js/generate_witness.js \
     EthicsComplianceVerifier_js/EthicsComplianceVerifier.wasm \
     input.json \
     witness.wtns

# 2. Generate the proof and the public signals.
snarkjs groth16 prove ecv_final.zkey \
                     witness.wtns \
                     proof.json public.json

# 3. Verify the proof against the verification key.
snarkjs groth16 verify ecv_verification_key.json \
                     public.json proof.json
```

`public.json` contains the circuit's public signals, the five public outputs followed by the five public inputs, ready for on-chain submission.
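
snarkjs writes the public signals as a flat JSON array of decimal strings, with circuit outputs placed ahead of public inputs. A small Python sketch for attaching names to those positions is shown below. The name order is an assumption taken from the Public Outputs and Public Inputs tables; the actual order follows the signal declaration order in the circuit and should be checked against the `.sym` file or the source. The helper names are hypothetical.

```python
import json

# Assumed order: outputs first, then public inputs, each in table order.
OUTPUT_NAMES = [
    "ethics_verified",
    "bias_compliance",
    "fairness_compliance",
    "safety_compliance",
    "ethics_proof_hash",
]
INPUT_NAMES = [
    "max_bias_threshold",
    "min_fairness_score",
    "max_harmful_rate",
    "ethics_commitment_hash",
    "agent_id",
]


def load_public_signals(path):
    """Read the JSON array of decimal strings written by the prover."""
    with open(path) as fh:
        return json.load(fh)


def label_public_signals(signals):
    """Map each position in the array to its assumed signal name."""
    names = OUTPUT_NAMES + INPUT_NAMES
    if len(signals) != len(names):
        raise ValueError(f"expected {len(names)} signals, got {len(signals)}")
    return {name: int(value) for name, value in zip(names, signals)}


def is_compliant(labelled):
    """True only when the aggregate flag reports that every check passed."""
    return labelled["ethics_verified"] == 1


# In-memory example with placeholder values instead of a real public.json.
sample = ["1", "1", "1", "1", "999", "300", "700", "1", "67890", "42"]
labelled = label_public_signals(sample)
print(labelled["agent_id"], is_compliant(labelled))
```

Labelling is only meaningful after `snarkjs groth16 verify` has accepted the proof; the values in `public.json` carry no guarantee on their own.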

***

### Security

* Private metrics remain local; only the pass/fail flags, the proof hash, and the public inputs (thresholds, commitment hash, agent identifier) are disclosed.
* Poseidon hashing maintains circuit efficiency on BN128.
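* If any single test fails, `ethics_verified` collapses to `0`, so a partially passing result cannot be presented as compliant.
* The Groth16 trusted setup must be performed securely, ideally as a multi-party computation (MPC) ceremony, because a compromised setup allows forged proofs.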

***

### Status

* Stable for integer bias, fairness, and safety metrics.
* Planned extensions: incorporate `red_team_results` constraints; allow dynamic weighting of fairness scores; add differential-privacy proofs for training data.
