# Sandbox

The sandbox is an isolated testing environment where Valiron evaluates agent behavior before granting production access to your API. New agents start in the sandbox automatically.

## How It Works

1. A sandbox test is triggered (via your API call or automatically for new agents)
2. The agent receives a series of simulated API interactions
3. The sandbox monitors how the agent handles normal requests, rate limits, errors, and edge cases
4. A trust evaluation is performed and a tier is assigned
5. Based on the tier, a route decision determines what access level the agent should receive

## What Gets Tested

The sandbox evaluates agents through a series of simulated API interactions that test how the agent handles various scenarios. The specific test types, distribution, and patterns are proprietary and not disclosed.

## Graduation

Whether an agent graduates from sandbox to production depends on its behavior during the evaluation:

1. **High confidence** — strong behavioral compliance → `prod` (allow full access)
2. **Moderate confidence** — passes with some concerns → `prod_throttled` (allow with rate limiting)
3. **Low confidence** — risky behavior → remains `sandbox_only` (block or serve test data only)

Once an agent graduates, its trust evaluation is cached. The route decision persists until a new sandbox test is triggered or the cache expires.
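As a rough illustration, a gateway could translate the three route decisions into request handling along these lines. The route names come from the list above; the `GatewayAction` type and the `actionForRoute` helper are hypothetical names for this sketch and are not part of the Valiron SDK.

```typescript
// Route decisions named in the Graduation list.
type RouteDecision = "prod" | "prod_throttled" | "sandbox_only";

// Hypothetical shape describing what a gateway does with a request.
interface GatewayAction {
  allow: boolean;
  rateLimited: boolean;
}

// Hypothetical helper: map a route decision to a gateway action.
function actionForRoute(route: RouteDecision): GatewayAction {
  switch (route) {
    case "prod":
      return { allow: true, rateLimited: false };
    case "prod_throttled":
      return { allow: true, rateLimited: true };
    case "sandbox_only":
      // This sketch blocks the request; serving test data is the other documented option.
      return { allow: false, rateLimited: false };
  }
}
```

Keeping the mapping in one function makes it easy to swap the `sandbox_only` behavior between blocking and serving test data.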

## Triggering a Sandbox Test

You can proactively evaluate agents before they hit your production endpoints, or let the middleware handle it automatically (see Auto-Sandbox below).

### Via SDK

```typescript
import { ValironSDK } from "@valiron/sdk";

const valiron = new ValironSDK();

// On-chain agent
const result = await valiron.triggerSandboxTest("YOUR_AGENT_ID");

console.log(result.tier);            // "AAA"
console.log(result.riskLevel);       // "GREEN"
console.log(result.meetsThreshold);  // true

// Key-based agent
const keyResult = await valiron.triggerKeyAgentSandbox("0x1234...abcd");
console.log(keyResult.tier);         // "A"
```

### Via HTTP

```bash
# On-chain agent
curl -X POST https://valiron-edge-proxy.onrender.com/operator/trigger-sandbox/YOUR_AGENT_ID

# Key-based agent
curl -X POST https://valiron-edge-proxy.onrender.com/operator/trigger-sandbox-key/0x1234...abcd
```

## Auto-Sandbox

When a new agent hits a gated endpoint for the first time, the middleware automatically triggers sandbox evaluation in the background. No manual trigger is needed.

### On-Chain Agents (ERC-8004)

New on-chain agents are detected by `totalFeedback === 0` in their reputation profile. The middleware calls `triggerSandboxTest()` as a fire-and-forget background task and returns a 403 response with this body:

```json
{ "error": "Agent pending evaluation", "retryAfterMs": 30000 }
```

The agent should retry after 30 seconds. Once evaluation completes, subsequent requests are gated normally.
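On the agent side, handling the pending response might look like the sketch below. It assumes the pending response arrives with HTTP status 403 and the body shown above; `isPendingEvaluation`, `callWithPendingRetry`, and the `send` callback are hypothetical names for illustration, not SDK functions.

```typescript
// Body of the pending response, as documented above.
interface PendingBody {
  error: string;
  retryAfterMs: number;
}

interface SimpleResponse {
  status: number;
  body: unknown;
}

// Hypothetical check: is this the "pending evaluation" response?
function isPendingEvaluation(status: number, body: unknown): body is PendingBody {
  if (status !== 403 || typeof body !== "object" || body === null) return false;
  const candidate = body as Record<string, unknown>;
  return (
    candidate.error === "Agent pending evaluation" &&
    typeof candidate.retryAfterMs === "number"
  );
}

// Hypothetical retry loop: resend the request after the suggested delay,
// giving up after maxAttempts total attempts.
async function callWithPendingRetry(
  send: () => Promise<SimpleResponse>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<SimpleResponse> {
  let response = await send();
  for (let attempt = 1; attempt < maxAttempts; attempt++) {
    const body = response.body;
    if (!isPendingEvaluation(response.status, body)) break;
    await sleep(body.retryAfterMs);
    response = await send();
  }
  return response;
}
```

Reading the delay from `retryAfterMs` rather than hard-coding 30 seconds keeps the client correct if the server ever returns a different value.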

### Key-Based Agents (Web2)

New key agents are detected by `verified === true` and `score === null`. The middleware calls `triggerKeyAgentSandbox()` in the background and returns the same 403 pending response.

Key agents don't have on-chain endpoints, so sandbox evaluation uses `sandbox_relay` mode: a conservative set of behavioral metrics is computed without probing endpoints.
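The two detection rules in this section can be summarized in a short sketch. Only the fields named above (`totalFeedback`, `verified`, `score`) come from the documentation; the profile interfaces, the `kind` discriminator, and `needsAutoSandbox` are hypothetical and only illustrate the logic.

```typescript
// Hypothetical profile shapes; only the fields named in this section are modeled.
interface OnChainProfile {
  kind: "onchain";
  totalFeedback: number;
}

interface KeyProfile {
  kind: "key";
  verified: boolean;
  score: number | null;
}

type AgentProfile = OnChainProfile | KeyProfile;

// Hypothetical helper: should the middleware start a background sandbox evaluation?
function needsAutoSandbox(profile: AgentProfile): boolean {
  if (profile.kind === "onchain") {
    // New on-chain agent: no feedback recorded yet.
    return profile.totalFeedback === 0;
  }
  // New key agent: verified, but never scored.
  return profile.verified === true && profile.score === null;
}
```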

## Tips for Passing Sandbox Tests

Agents that demonstrate responsible, well-behaved API usage will earn higher trust tiers. The specific scoring criteria are proprietary, but in general: respect API boundaries, handle errors gracefully, and avoid aggressive request patterns.

---

## Solana Sandbox Differences

For Solana agents (`chain=solana`), the sandbox flow has additional steps powered by the QuantuLabs 8004-solana SDK.

### Liveness Probe

Before running behavioral tests, the sandbox triggers a **liveness check** via `isItAlive()`. This pings every service endpoint declared in the agent's on-chain metadata (REST, WebSocket, GraphQL, etc.) and returns a health report.

```http
GET /operator/agent/{agentId}/liveness?chain=solana
```

If all endpoints are unreachable (`status: "dead"`), the sandbox can still proceed: liveness is informational and does not gate the evaluation.

### Feedback Write-back

After a sandbox evaluation completes, Valiron writes the computed trust score back to the Solana reputation registry via the ATOM engine's `giveFeedback()`. This closes the loop — sandbox results become on-chain reputation.

Requirements:
- `SOLANA_FEEDBACK_KEYPAIR` environment variable must be set (JSON array of the Ed25519 secret key)
- The keypair must have sufficient SOL for transaction fees

Feedback write-back is **fire-and-forget** — if it fails, the sandbox result is still returned to the caller.
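A deployment might validate the keypair variable at startup so that a malformed value is caught before the first write-back fails silently. The sketch below assumes the value is a 64-byte secret key encoded as a JSON array, the format used by Solana CLI keypair files; that length and the `parseFeedbackKeypair` helper are assumptions for illustration, not documented Valiron behavior.

```typescript
// Assumption: a 64-byte secret key, as in Solana CLI keypair files.
const EXPECTED_SECRET_KEY_BYTES = 64;

// Hypothetical helper: parse and validate the raw environment variable value.
function parseFeedbackKeypair(raw: string | undefined): Uint8Array {
  if (!raw) {
    throw new Error("SOLANA_FEEDBACK_KEYPAIR is not set");
  }
  const parsed: unknown = JSON.parse(raw);
  if (!Array.isArray(parsed) || parsed.length !== EXPECTED_SECRET_KEY_BYTES) {
    throw new Error(`Expected a JSON array of ${EXPECTED_SECRET_KEY_BYTES} bytes`);
  }
  for (const value of parsed) {
    if (!Number.isInteger(value) || value < 0 || value > 255) {
      throw new Error("Secret key entries must be integers from 0 to 255");
    }
  }
  return Uint8Array.from(parsed as number[]);
}
```

In a Node.js service this would typically be called once with `process.env.SOLANA_FEEDBACK_KEYPAIR`.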

### ATOM Engine Metrics

Solana reputation includes ATOM engine metrics that EVM agents don't have:

| Metric | Description |
|--------|-------------|
| `trustTier` | ATOM classification: `"trusted"`, `"neutral"`, etc. |
| `qualityScore` | Overall behavioral quality (0–100) |
| `confidence` | Statistical confidence in quality score (0–1) |
| `riskScore` | ATOM risk assessment (0–100, lower = safer) |
| `uniqueCallers` | Number of distinct feedback submitters |

These appear in the sandbox/gate response under `onchainReputation.solana` and are factored into the final Valiron risk score alongside the standard behavioral penalties.
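For consumers that read these metrics, a typed view with range checks might look like the sketch below. The field names and ranges come from the table above; the TypeScript types, the `AtomMetrics` interface name, and the `atomRangeErrors` helper are assumptions made for this example.

```typescript
// Field names and ranges follow the ATOM metrics table; the types are assumed.
interface AtomMetrics {
  trustTier: string;     // e.g. "trusted", "neutral"
  qualityScore: number;  // 0 to 100
  confidence: number;    // 0 to 1
  riskScore: number;     // 0 to 100, lower is safer
  uniqueCallers: number; // count of distinct feedback submitters
}

// Hypothetical helper: list every metric that falls outside its documented range.
function atomRangeErrors(metrics: AtomMetrics): string[] {
  const errors: string[] = [];
  const inRange = (value: number, min: number, max: number): boolean =>
    Number.isFinite(value) && value >= min && value <= max;

  if (!inRange(metrics.qualityScore, 0, 100)) errors.push("qualityScore must be 0-100");
  if (!inRange(metrics.confidence, 0, 1)) errors.push("confidence must be 0-1");
  if (!inRange(metrics.riskScore, 0, 100)) errors.push("riskScore must be 0-100");
  if (!Number.isInteger(metrics.uniqueCallers) || metrics.uniqueCallers < 0) {
    errors.push("uniqueCallers must be a non-negative integer");
  }
  return errors;
}
```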