Eval Cases
An eval case is the atomic unit of evaluation — a single check with an input, a scorer, and a threshold.
Key properties
| Field | Type | Description |
|---|---|---|
| name | string | Descriptive name for the case |
| input | object | Input data passed to the scorer |
| expected | any \| null | Expected output (used by some scorers) |
| scorer_id | UUID | The scorer to use for evaluation |
| threshold | number | 0–1; minimum score to pass this case (default: 0.5) |
| weight | number | Relative weight in pass rate calculation (default: 1.0) |
| is_active | boolean | Whether this case runs during eval (default: true) |
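The fields and defaults in the table can be mirrored on the client side before a case is sent to the API. The following is a minimal sketch, not part of any official SDK: the `EvalCase` class and its range check are hypothetical names introduced here for illustration, and only the field names, defaults, and the 0–1 threshold range come from the table.

```python
from dataclasses import dataclass, asdict
from typing import Any, Optional


@dataclass
class EvalCase:
    """Hypothetical client-side mirror of the case fields in the table."""
    name: str
    input: dict
    scorer_id: str
    expected: Optional[Any] = None  # only used by some scorers
    threshold: float = 0.5          # documented default
    weight: float = 1.0             # documented default
    is_active: bool = True          # documented default

    def __post_init__(self) -> None:
        # The table gives 0-1 as the valid threshold range.
        if not 0.0 <= self.threshold <= 1.0:
            raise ValueError(
                f"threshold must be between 0 and 1, got {self.threshold}"
            )


case = EvalCase(
    name="Contains source citation",
    input={"query": "What year was it built?"},
    scorer_id="scorer-uuid-here",
    threshold=1.0,
)

# A plain dict, ready to be serialized as a JSON request body.
payload = asdict(case)
```

Omitted fields fall back to the documented defaults, so `payload` here carries `weight` 1.0 and `is_active` true even though neither was set explicitly.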
How cases are scored
When a run executes:
- Each active case in the suite runs concurrently
- The scorer evaluates the output against the case input/expected values
- If the score ≥ case threshold → case passes
- If the score < case threshold → case fails
- Individual case failures don’t abort the run — they’re recorded as failed results
- The suite’s pass rate is the ratio of passed cases to total active cases
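The pass/fail rule and the pass rate described above can be sketched in a few lines. This is an illustration under stated assumptions, not the service's implementation: `CaseResult` and `pass_rate` are hypothetical names, and the way `weight` enters the calculation (weight of passed cases divided by weight of all active cases) is an assumption, since the exact formula is not given here. With the default weight of 1.0 on every case, it reduces to passed cases over total active cases.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    """Hypothetical record of one scored case."""
    score: float
    threshold: float = 0.5
    weight: float = 1.0
    is_active: bool = True

    @property
    def passed(self) -> bool:
        # A case passes when its score meets or exceeds its threshold.
        return self.score >= self.threshold


def pass_rate(results: list[CaseResult]) -> float:
    """Weighted share of active cases that passed (assumed formula)."""
    active = [r for r in results if r.is_active]
    total_weight = sum(r.weight for r in active)
    if total_weight == 0:
        # Assumption: a suite with no active cases reports 0.0.
        return 0.0
    passed_weight = sum(r.weight for r in active if r.passed)
    return passed_weight / total_weight


results = [
    CaseResult(score=0.9),                  # passes: 0.9 >= 0.5
    CaseResult(score=0.4),                  # fails: 0.4 < 0.5
    CaseResult(score=1.0, threshold=1.0),   # passes: equal to threshold
    CaseResult(score=0.2, is_active=False), # skipped: inactive
]
rate = pass_rate(results)  # two of three active cases pass
```

A failing case only lowers the rate; nothing in the loop stops at the first failure, which matches the note that individual failures are recorded without aborting the run.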
Creating a case
curl -X POST https://api.launchgate.ai/v1/suites/{suiteId}/cases \
-H "Authorization: Bearer $LAUNCHGATE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "Contains source citation",
"input": { "query": "What year was it built?" },
"scorer_id": "scorer-uuid-here",
"threshold": 1.0
}'
Reordering cases
Cases can be reordered within a suite:
curl -X POST https://api.launchgate.ai/v1/suites/{suiteId}/cases/reorder \
-H "Authorization: Bearer $LAUNCHGATE_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"case_ids": ["case-1-uuid", "case-2-uuid", "case-3-uuid"]
}'
Deactivating vs deleting
- Set is_active: false to skip a case during runs without removing it
- Delete a case to soft-remove it permanently (it won’t appear in future runs or listings)