# Custom Scorers

Create scorers tailored to your specific evaluation needs.

## Choosing a scorer type
| Type | Best for | Requires BYOK? |
|---|---|---|
| `exact_match` | Deterministic outputs (classifications, labels) | No |
| `regex` | Pattern validation (dates, IDs, formats) | No |
| `json_schema` | Structured output validation | No |
| `contains` | Checking for required/forbidden content | No |
| `llm_judge` | Subjective quality assessment (faithfulness, tone, relevance) | Yes |
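The deterministic scorer types can be approximated locally while you design configs. Below is a minimal Python sketch of the matching semantics described in the examples that follow; it is an illustration, not the service's actual implementation (`json_schema` is omitted, since it needs a full JSON Schema validator).

```python
import re

def run_scorer(scorer: dict, output: str, expected: str = None) -> bool:
    """Local approximation of the deterministic scorer types.

    Illustrative only: field names mirror the example configs in this
    page, not the service's actual implementation.
    """
    kind = scorer["type"]
    cfg = scorer.get("config", {})
    if kind == "exact_match":
        # `expected` comes from the case, not the scorer config
        if not cfg.get("case_sensitive", True):
            return output.lower() == expected.lower()
        return output == expected
    if kind == "regex":
        matched = re.search(cfg["pattern"], output) is not None
        return matched == cfg.get("should_match", True)
    if kind == "contains":
        hits = [value in output for value in cfg["values"]]
        # "any": at least one value present; "none": no value present
        return any(hits) if cfg["mode"] == "any" else not any(hits)
    raise ValueError(f"unsupported scorer type: {kind}")
```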
## Examples

### Exact match — classification

Verify that a classifier returns the correct label:
```bash
curl -X POST https://api.launchgate.ai/v1/projects/my-project/scorers \
  -H "Authorization: Bearer $LAUNCHGATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Classification accuracy",
    "type": "exact_match",
    "config": { "case_sensitive": false }
  }'
```

Then create a case using this scorer with `expected: "positive"`.
### Regex — date format validation

Ensure outputs contain properly formatted dates:
```json
{
  "name": "Valid date format",
  "type": "regex",
  "config": {
    "pattern": "\\d{4}-\\d{2}-\\d{2}",
    "should_match": true
  }
}
```

### Regex — no PII leaked
Ensure outputs don’t contain email addresses:
```json
{
  "name": "No email in output",
  "type": "regex",
  "config": {
    "pattern": "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}",
    "should_match": false
  }
}
```

### JSON schema — structured output
Validate that AI-generated JSON conforms to your schema:
```json
{
  "name": "Valid response schema",
  "type": "json_schema",
  "config": {
    "schema": {
      "type": "object",
      "required": ["answer", "confidence", "sources"],
      "properties": {
        "answer": { "type": "string", "minLength": 1 },
        "confidence": { "type": "number", "minimum": 0, "maximum": 1 },
        "sources": {
          "type": "array",
          "items": { "type": "string" },
          "minItems": 1
        }
      },
      "additionalProperties": false
    }
  }
}
```

### Contains — required content
Ensure outputs mention required disclaimers:
```json
{
  "name": "Includes disclaimer",
  "type": "contains",
  "config": {
    "values": ["not financial advice", "consult a professional"],
    "mode": "any"
  }
}
```

### Contains — forbidden content
Ensure outputs don’t contain competitor mentions:
```json
{
  "name": "No competitor mentions",
  "type": "contains",
  "config": {
    "values": ["CompetitorA", "CompetitorB"],
    "mode": "none"
  }
}
```

### LLM judge — faithfulness
Use an LLM to evaluate whether the output is faithful to source context:
```json
{
  "name": "RAG faithfulness judge",
  "type": "llm_judge",
  "config": {
    "rubric": "Evaluate whether the output is faithful to the provided context. Score 1.0 if all claims are supported by the context. Score 0.0 if any claims are unsupported or contradicted by the context. Score 0.5 for partial faithfulness.",
    "model": "gpt-4o",
    "provider": "openai"
  }
}
```

LLM judge scorers require a BYOK key for the specified provider.
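The judging internals aren't documented here, but conceptually an `llm_judge` scorer sends the rubric and the model output to the configured provider, then parses a numeric score from the reply. A sketch of that parsing step, with the helper name and reply format being assumptions rather than part of the API:

```python
import re

def parse_judge_score(reply: str) -> float:
    """Extract the first number from a judge reply and clamp it to [0, 1].

    Hypothetical helper: the real service's reply format is not documented.
    """
    match = re.search(r"\d+(?:\.\d+)?", reply)
    if match is None:
        raise ValueError("judge reply contained no numeric score")
    return min(1.0, max(0.0, float(match.group())))
```

Clamping guards against a judge that wanders outside the 0.0–1.0 range the rubric asks for.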
## Dual-sided scoring

For cases where you need to measure both precision (accuracy of what’s said) and recall (completeness of what should be said):
```json
{
  "name": "Comprehensive answer judge",
  "type": "llm_judge",
  "config": {
    "rubric": "Evaluate the answer for accuracy and completeness..."
  },
  "dual_sided": true,
  "precision_threshold": 0.8,
  "recall_threshold": 0.7
}
```

Both thresholds must be met for the case to pass.
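The pass condition can be stated directly. A small sketch, assuming the judge yields separate precision and recall scores in [0, 1]:

```python
def dual_sided_pass(precision: float, recall: float,
                    precision_threshold: float = 0.8,
                    recall_threshold: float = 0.7) -> bool:
    """A case passes only when both scores meet their thresholds."""
    return precision >= precision_threshold and recall >= recall_threshold
```

A high-precision answer that omits required material still fails on recall, and vice versa.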