Exa Agent is in beta. It may change before launch, and every request must include the header Exa-Beta: agent-2026-05-07.
Agent creates long-running tasks that can search, read, reason, enrich rows, and return answers with source grounding. Use it when a workflow needs more than a single search or contents call: open-ended research, list building, structured extraction, entity enrichment, or follow-up questions over previous results. For implementation examples and workflow guidance, start with the Agent guide.

How it works

  1. Create a run with POST /agent/runs.
  2. The agent queues and starts the run, returning an agent_run object immediately unless you request streaming.
  3. The run searches, reads, reasons, and writes until it completes, fails, is cancelled, or reaches the one-hour timeout.
  4. You poll GET /agent/runs/{id}, stream creation events, or replay stored events with GET /agent/runs/{id}/events.
  5. You can continue from a completed run by passing previousRunId to a new create request.
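The steps above can be sketched as a create-then-poll loop. This is a minimal sketch rather than an official client: the base URL `https://api.exa.ai`, the `x-api-key` auth header, and the helper names `build_request`, `send_json`, and `run_to_completion` are assumptions, and the request body is left to the caller because its fields are not listed on this page. Only the paths, the `Exa-Beta` header, and the status names come from the text above.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.exa.ai"  # assumption: base URL is not stated on this page
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def build_request(method, path, api_key, body=None):
    """Build a urllib request that carries the required beta header."""
    headers = {
        "Exa-Beta": "agent-2026-05-07",  # required on every request during beta
        "x-api-key": api_key,            # assumption: auth header name
        "Content-Type": "application/json",
    }
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def send_json(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def run_to_completion(api_key, body, transport=send_json,
                      poll_seconds=5.0, sleep=time.sleep):
    """Create a run, then poll GET /agent/runs/{id} until it is terminal."""
    run = transport(build_request("POST", "/agent/runs", api_key, body))
    while run["status"] not in TERMINAL_STATUSES:
        sleep(poll_seconds)
        run = transport(build_request("GET", f"/agent/runs/{run['id']}", api_key))
    return run
```

To continue from a completed run, a follow-up body would add `"previousRunId": run["id"]` to the new create request. The `transport` and `sleep` parameters exist only so the loop can be exercised without network access.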

Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | /agent/runs | Create a run. Can return JSON or stream server-sent events. |
| GET | /agent/runs | List runs for your team. |
| GET | /agent/runs/{id} | Get a run by ID. |
| POST | /agent/runs/{id}/cancel | Cancel a queued or running run. |
| DELETE | /agent/runs/{id} | Delete a stored run. |
| GET | /agent/runs/{id}/events | List run events or replay them as server-sent events. |

Run lifecycle

Runs progress through these statuses:
queued -> running -> completed | failed | cancelled
Completed, failed, and cancelled runs are terminal. Running or queued runs have stopReason: null. Terminal runs use one of these stop reasons:
schema_satisfied | budget_reached | error | cancelled
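The lifecycle rules above can be checked on the client side. The following is a minimal sketch, assuming a run is a decoded JSON object with `status` and `stopReason` keys; the helper name `is_terminal` is hypothetical, and because the API is in beta, new statuses or stop reasons may appear that this check would reject.

```python
ACTIVE_STATUSES = frozenset({"queued", "running"})
TERMINAL_STATUSES = frozenset({"completed", "failed", "cancelled"})
STOP_REASONS = frozenset({"schema_satisfied", "budget_reached", "error", "cancelled"})


def is_terminal(run):
    """Return True for a terminal run, False for an active one.

    Raises ValueError when status and stopReason disagree with the
    lifecycle described above.
    """
    status = run["status"]
    stop_reason = run.get("stopReason")
    if status in ACTIVE_STATUSES:
        # Queued and running runs carry stopReason: null.
        if stop_reason is not None:
            raise ValueError(f"active run has stopReason {stop_reason!r}")
        return False
    if status in TERMINAL_STATUSES:
        if stop_reason not in STOP_REASONS:
            raise ValueError(f"terminal run has stopReason {stop_reason!r}")
        return True
    raise ValueError(f"unknown status {status!r}")
```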

Output

Each run returns an output object:
| Field | Description |
| --- | --- |
| output.text | Natural-language answer or summary. |
| output.structured | Validated JSON when you provide outputSchema; otherwise null. |
| output.grounding | Citations for the text answer or structured fields, when emitted. |
outputSchema supports JSON Schema draft-07, 2019-09, and 2020-12 via $schema. Standard formats are supported, plus phone.
To request contact information, include contact fields in outputSchema using standard JSON Schema string formats, for example { "type": "string", "format": "email" }. Bound arrays with maxItems when possible so the maximum contact-enrichment cost is predictable.
Create requests also accept effort, which controls the run’s cost and reasoning effort preference. Supported values are low, medium, high, xhigh, and auto; the default is auto.
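The outputSchema guidance above, bounding arrays with maxItems so contact-enrichment cost stays predictable, can be sketched as follows. The schema contents (`contacts`, `name`, `email`, `phone`) and the helper `max_contact_fields` are hypothetical illustrations; the helper only counts how many email and phone fields a schema could produce in the worst case, and it assumes each such field may trigger one enrichment.

```python
CONTACTS_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "properties": {
        "contacts": {
            "type": "array",
            "maxItems": 10,  # bounds the worst-case number of enrichments
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "email": {"type": "string", "format": "email"},
                    "phone": {"type": "string", "format": "phone"},
                },
            },
        }
    },
}


def max_contact_fields(schema):
    """Return worst-case {"email": n, "phone": n}, or None if unbounded."""
    kind = schema.get("type")
    if kind == "string":
        fmt = schema.get("format")
        return {"email": int(fmt == "email"), "phone": int(fmt == "phone")}
    if kind == "object":
        total = {"email": 0, "phone": 0}
        for child in schema.get("properties", {}).values():
            counts = max_contact_fields(child)
            if counts is None:
                return None
            total["email"] += counts["email"]
            total["phone"] += counts["phone"]
        return total
    if kind == "array":
        counts = max_contact_fields(schema.get("items", {}))
        if counts is None:
            return None
        if counts["email"] == 0 and counts["phone"] == 0:
            return counts  # no contact fields, so the bound does not matter
        if "maxItems" not in schema:
            return None  # contact fields in an unbounded array
        limit = schema["maxItems"]
        return {"email": counts["email"] * limit, "phone": counts["phone"] * limit}
    return {"email": 0, "phone": 0}
```

For `CONTACTS_SCHEMA` the helper returns `{"email": 10, "phone": 10}`; removing `maxItems` from the array makes it return `None`, which signals that the maximum cost cannot be computed up front.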

Events and streaming

Set Accept: text/event-stream when you create a run to stream lifecycle events as they happen. You can also replay stored events later with GET /agent/runs/{id}/events. Events use standard SSE framing:
id: 1
event: agent_run.created
data: {"id":"agent_run_01j...","status":"queued","createdAt":"2026-05-07T21:21:52.051Z"}
Terminal event names are agent_run.completed, agent_run.failed, and agent_run.cancelled.
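The SSE framing shown above can be parsed with a few lines of code. This is a minimal sketch of a parser for the `id`, `event`, and `data` fields, assuming each event's data is a JSON document; the function name `parse_sse` is hypothetical, and a production client would more likely use an existing SSE library.

```python
import json

TERMINAL_EVENTS = {"agent_run.completed", "agent_run.failed", "agent_run.cancelled"}


def parse_sse(lines):
    """Yield (event_id, event_name, data) tuples from an iterable of text lines."""
    event_id, name, data_lines = None, "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_lines:
                yield event_id, name, json.loads("\n".join(data_lines))
            name, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # SSE strips one leading space from the value
        if field == "id":
            event_id = value
        elif field == "event":
            name = value
        elif field == "data":
            data_lines.append(value)
```

A consumer can iterate over the parsed events and stop reading once an event name is in `TERMINAL_EVENTS`.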

Limits and pricing

Your Agent concurrency limit is one fifth of your account QPS. For pay-as-you-go accounts with default QPS, this means two active Agent runs at a time.
Agent pricing is beta pricing and may change before launch.
| Component | Price |
| --- | --- |
| Agent Compute Units | 1 ACU = $0.0001 |
| Search tool calls | $5 / 1,000 searches |
Contact enrichment is separate from the core pricing components above: email contact enrichment is $0.02 / email, and phone number contact enrichment is $0.07 / phone number.
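The beta prices above combine into a simple per-run estimate. This is a sketch of the arithmetic only: the function name `estimate_cost` and the usage figures in the example are hypothetical, and this page does not say how ACU or search counts are reported for a run.

```python
from decimal import Decimal

ACU_PRICE = Decimal("0.0001")                # 1 ACU = $0.0001
SEARCH_PRICE = Decimal("5") / Decimal(1000)  # $5 per 1,000 searches
EMAIL_PRICE = Decimal("0.02")                # per enriched email
PHONE_PRICE = Decimal("0.07")                # per enriched phone number


def estimate_cost(acus, searches, emails=0, phones=0):
    """Return the estimated cost in dollars at beta pricing."""
    return (
        acus * ACU_PRICE
        + searches * SEARCH_PRICE
        + emails * EMAIL_PRICE
        + phones * PHONE_PRICE
    )


# Hypothetical run: 20,000 ACUs, 40 searches, 10 emails, 5 phone numbers.
# 2.00 + 0.20 + 0.20 + 0.35 = 2.75
example = estimate_cost(acus=20_000, searches=40, emails=10, phones=5)
```

Using `Decimal` keeps the small per-unit prices exact instead of accumulating floating-point error.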

Effort

Use effort to set a cost and reasoning effort preference for a run. Supported values are low, medium, high, xhigh, and auto; the default is auto. If an effort is set, each run is capped at the following costs:
| Effort | Price* |
| --- | --- |
| low | $25 / 1,000 searches |
| medium | $100 / 1,000 searches |
| high | $500 / 1,000 searches |
| xhigh | $2000 / 1,000 searches |
*Email and phone enrichment is additional and is not included in fixed effort pricing.
While Exa Agent is in beta, it does not support zero data retention (ZDR). If you require ZDR, reach out to us.

Next steps