Running Evals

The run() method is the core of the SDK — it triggers an evaluation and returns the results.

run() never throws on transient network failures. After 3 automatic retries on network, 5xx, or 429 errors, the SDK returns a neutral { status: "skipped", passed: true } result so your application keeps running. You only need try/catch for AuthError (invalid key), PaymentError (usage limit reached), and ValidationError (bad request). See Error Handling for the full matrix.

Method signature


async run(suite: string, options: RunOptions): Promise<RunResult>

Parameters

Parameter	Type	Description
`suite`	`string`	The eval suite slug (e.g., `"rag-faithfulness"`)
`options`	`RunOptions`	Run configuration (see below)

`RunOptions`

Property	Type	Default	Description
`output`	`string`	required	The AI-generated output to evaluate
`input`	`object`	`{}`	Input data passed to scorers (e.g., context, query)
`trigger`	`string`	`"sdk"`	Source identifier: `"sdk"`, `"cli"`, `"github_action"`, `"dashboard"`, `"api"`
`triggerMeta`	`object`	`{}`	Additional metadata (e.g., git SHA, PR number)
`async`	`boolean`	`false`	If `true`, returns immediately with a pending run ID

`RunResult`

Property	Type	Description
`id`	`string`	Unique run identifier
`passed`	`boolean`	`true` if status is `"cleared"`
`status`	`string`	`"cleared"`, `"aborted"`, `"skipped"`, `"error"`, or `"pending"`
`scores`	`Score[]`	All individual case scores
`failures`	`Score[]`	Only the failed case scores
`passRate`	`number`	0.0–1.0 ratio of passed cases
`totalCases`	`number`	Total active cases evaluated
`passedCases`	`number`	Number of passing cases
`failedCases`	`number`	Number of failing cases
`durationMs`	`number`	Total evaluation time in milliseconds

`Score`

Property	Type	Description
`case`	`string`	Case name
`passed`	`boolean`	Whether the case passed its threshold
`score`	`number`	0.0–1.0 score from the scorer
`threshold`	`number`	The case’s pass threshold
`reason`	`string`	Human-readable explanation

Examples

Basic evaluation


const result = await lg.run("content-safety", {
  output: "The capital of France is Paris.",
});
 
console.log(result.status);   // "cleared"
console.log(result.passRate); // 1.0

With input context


const result = await lg.run("rag-faithfulness", {
  input: {
    context: "The Eiffel Tower was completed in 1889 for the World's Fair.",
    query: "When was the Eiffel Tower built?",
  },
  output: "The Eiffel Tower was completed in 1889.",
});

With trigger metadata


const result = await lg.run("my-suite", {
  output: generatedText,
  trigger: "sdk",
  triggerMeta: {
    commitSha: process.env.GIT_SHA,
    branch: process.env.GIT_BRANCH,
    model: "gpt-4o",
    promptVersion: "v2.3",
  },
});

Async mode


const pending = await lg.run("large-suite", {
  output: "...",
  async: true,
});
 
console.log(pending.id);     // "run_abc123"
console.log(pending.status); // "pending"
 
// Poll for results
const completed = await lg.getRun(pending.id);

Additional methods

`getRun(runId)`

Fetch the status and details of a run.


const run = await lg.getRun("run_abc123");

`getRunResults(runId)`

Fetch detailed per-case results for a run.


const results = await lg.getRunResults("run_abc123");

`health()`

Check the LaunchGate API health status.


const status = await lg.health();
// { status: "healthy", version: "0.1.0", timestamp: "...", services: {...} }