Overview
Base URL: https://api.exa.ai/websets/v0
Auth: Pass your API key via the x-api-key header. Get one at https://dashboard.exa.ai/api-keys
Websets is an asynchronous search system. You define a query, criteria for verification, and optional enrichments. The system searches, verifies each result, and returns structured items over time. Results are available via polling or webhooks.
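As a minimal sketch of the base URL and header described above, the helper below builds an authenticated request using only Python's standard library. The name `build_request`, the placeholder webset ID, and the placeholder key are illustrative assumptions and not part of the API or the SDKs. Sending the body as JSON with a `Content-Type` header is likewise an assumption.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://api.exa.ai/websets/v0"


def build_request(method, path, api_key, params=None, body=None):
    """Build (but do not send) an authenticated Websets API request."""
    url = BASE_URL + path
    if params:
        url += "?" + urlencode(params)
    # Encode the body as JSON only when one is supplied.
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"x-api-key": api_key, "Accept": "application/json"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Example: a GET for one webset, expanded with its items.
req = build_request("GET", "/websets/my-webset-id", "YOUR_API_KEY", params={"expand": "items"})
print(req.get_method(), req.full_url)
```

To actually send the request, pass it to `urllib.request.urlopen(req)` and decode the JSON response.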
Installation
Minimal Working Example
SDK Sub-Client Reference
The SDKs provide sub-clients for all API resources. Here are the key operations beyond the minimal example above.
Python SDK note: All response attributes use snake_case. JSON fields map as follows: hasMore → has_more, nextCursor → next_cursor, createdAt → created_at, externalId → external_id, websetId → webset_id.
How Websets Work
Lifecycle
- Create: You POST a search config (query, count, optional criteria/enrichments/entity type). A webset is created with status `running`.
- Search: The system searches and verifies each result against your criteria. Matching items are added to the webset. Each item triggers a `webset.item.created` event.
- Enrichment: If enrichments are configured, each item is processed. `webset.item.enriched` events fire as enrichment results arrive.
- Idle: When all searches and enrichments complete, the webset status becomes `idle` and a `webset.idle` event fires.
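The lifecycle above implies a simple polling loop when webhooks are not in use. The sketch below assumes a caller-supplied `fetch_webset` function (hypothetical) that performs `GET /websets/{id}` and returns the Webset JSON as a dict. The default timeout and poll interval mirror the SDK defaults quoted later in this document, but `poll_until_idle` itself is not an SDK function.

```python
import time


def poll_until_idle(fetch_webset, webset_id, timeout=3600.0, poll_interval=5.0, sleep=time.sleep):
    """Call fetch_webset(webset_id) repeatedly until its status is "idle".

    Raises TimeoutError if the webset is not idle once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        webset = fetch_webset(webset_id)
        if webset["status"] == "idle":
            return webset
        if time.monotonic() >= deadline:
            raise TimeoutError(f"webset {webset_id} is still {webset['status']!r} after {timeout}s")
        sleep(poll_interval)


# Demo with a stand-in fetcher that reports "running" twice, then "idle".
_states = iter(["running", "running", "idle"])
result = poll_until_idle(
    lambda wid: {"id": wid, "status": next(_states)},
    "ws_demo",
    sleep=lambda seconds: None,  # skip real sleeping in the demo
)
print(result["status"])
```

Injecting `sleep` keeps the loop easy to exercise without waiting; in real use, leave it at the default.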
Key Concepts
- Search: Defines what to look for (query + count). Multiple searches can be added to one webset.
- Criteria: Verification rules. Each result is checked against criteria before becoming an item. Max 5 criteria per search.
- Entity: Optional type hint (e.g. `"company"`, `"person"`, `"article"`, `"research_paper"`, `"custom"`) that shapes how results are found and verified. Auto-detected if not specified.
- Enrichments: Additional data extraction applied to each item (e.g. “Find the CEO name”). Max 10 per webset.
- Monitors: Scheduled re-runs that keep websets updated. Supports cron expressions.
- Webhooks: Real-time HTTP callbacks for events.
- Imports: Bring your own URLs and run enrichments on them.
- Exports: Bulk download of webset items as CSV/JSON.
API Endpoints — Full Reference
Websets
POST /websets/ — Create a Webset
Request body:
| Field | Type | Required | Description |
|---|---|---|---|
| search.query | string (min 1 char) | Yes | Natural language query. Any URL in the query will be crawled and used as context. |
| search.count | number (>= 1) | No (default 10) | Target number of items to find. Actual results may be fewer depending on query complexity. |
| search.criteria | array (1–5 items) | No | Verification rules. Auto-detected from query if omitted. Each has a description string. |
| search.entity | object | No | One of: {"type": "company"}, {"type": "person"}, {"type": "article"}, {"type": "research_paper"}, {"type": "custom", "description": "Job Postings"}. Auto-detected if omitted. |
| search.behaviour | string | No (default "override") | "override": reuses existing items, re-evaluates against new criteria, discards non-matching. |
| search.metadata | object | No | Arbitrary key-value pairs for the search. |
| enrichments | array (max 10) | No | Each enrichment has description (required), format, options, metadata. |
| enrichments[].description | string (min 1 char) | Yes | What data to extract. |
| enrichments[].format | string | No | One of: text, number, date, url, email, phone, options. Auto-detected if omitted. |
| enrichments[].options | array (1–20 items) | Conditional | Required when format is options. Each has a label string. |
| enrichments[].metadata | object | No | Arbitrary key-value pairs. |
| externalId | string | No | Your own identifier. Can be used in place of webset ID in all GET/PATCH/DELETE calls. Returns 409 if duplicate. |
| metadata | object | No | Arbitrary key-value pairs for the webset. |
Response: Webset object (see Object Schemas below).
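To make the request-body constraints in the table concrete, here is a hedged sketch of a payload builder that enforces the documented limits before anything is sent. The function name `build_create_webset_body` and the sample query are illustrative assumptions. The nesting (`search.query`, `enrichments[].description`, and so on) is taken from the dotted field names in the table.

```python
import json


def build_create_webset_body(query, count=10, criteria=None, entity=None,
                             enrichments=None, external_id=None):
    """Assemble a create-webset body, checking the documented limits."""
    if not query:
        raise ValueError("search.query must be at least 1 character")
    if count < 1:
        raise ValueError("search.count must be >= 1")
    search = {"query": query, "count": count}
    if criteria is not None:
        if not 1 <= len(criteria) <= 5:
            raise ValueError("search.criteria must have 1-5 items")
        search["criteria"] = [{"description": text} for text in criteria]
    if entity is not None:
        search["entity"] = entity
    body = {"search": search}
    if enrichments is not None:
        if len(enrichments) > 10:
            raise ValueError("at most 10 enrichments are allowed")
        for enrichment in enrichments:
            if not enrichment.get("description"):
                raise ValueError("each enrichment needs a description")
            options = enrichment.get("options", [])
            if enrichment.get("format") == "options" and not 1 <= len(options) <= 20:
                raise ValueError("format 'options' requires 1-20 options")
        body["enrichments"] = list(enrichments)
    if external_id is not None:
        body["externalId"] = external_id
    return body


body = build_create_webset_body(
    "AI startups in Europe that raised a Series A",  # made-up sample query
    count=5,
    criteria=["Headquartered in Europe"],
    entity={"type": "company"},
    enrichments=[{"description": "Find the CEO name", "format": "text"}],
    external_id="demo-001",
)
print(json.dumps(body, indent=2))
```

Validating locally is optional, since the API presumably rejects invalid bodies itself, but it surfaces mistakes before a request is spent.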
GET /websets/{id} — Get a Webset
- `{id}` can be the webset ID or `externalId`.
- Query param `?expand=items` includes up to 100 items in the response.

Response: Webset object. When expanded, includes an items array of WebsetItem objects.
GET /websets/ — List All Websets
Query params:
| Param | Type | Description |
|---|---|---|
| cursor | string | Pagination cursor from previous response’s nextCursor. |
| limit | number | Results per page (max 200). |
Response: { "data": [Webset, ...], "hasMore": boolean, "nextCursor": string | null }
POST /websets/{id} — Update a Webset
Request body: metadata can be updated.
Response: Updated Webset object.
DELETE /websets/{id} — Delete a Webset
Deletes the webset and all associated items, searches, and enrichments. Response: The deleted Webset object.
POST /websets/{id}/cancel — Cancel Running Operations
Cancels all running searches and enrichments on the webset. Response: The Webset object with updated status.
POST /websets/preview — Preview Search Results
Runs a search without creating a webset. Same request body as create. Useful for testing queries before committing.
Items
GET /websets/{websetId}/items — List Items
Query params:
| Param | Type | Description |
|---|---|---|
| cursor | string | Pagination cursor. |
| limit | number | Results per page. |
Response: { "data": [WebsetItem, ...], "hasMore": boolean, "nextCursor": string | null }
GET /websets/{websetId}/items/{itemId} — Get a Single Item
Response: A WebsetItem object.
DELETE /websets/{websetId}/items/{itemId} — Delete an Item
Response: The deleted WebsetItem object.
Searches
POST /websets/{websetId}/searches — Add a Search
Add a new search to an existing webset. The request body has the same shape as search in the create webset request.
Response: WebsetSearch object.
GET /websets/{websetId}/searches/{searchId} — Get Search Status
Response: A WebsetSearch object with a progress field showing the found count and completion percentage (0–100).
POST /websets/{websetId}/searches/{searchId}/cancel — Cancel a Search
Response: The canceled WebsetSearch object.
Enrichments
POST /websets/{websetId}/enrichments — Add an Enrichment
Response: WebsetEnrichment object.
GET /websets/{websetId}/enrichments/{enrichmentId} — Get Enrichment Status
Response: A WebsetEnrichment object.
PATCH /websets/{websetId}/enrichments/{enrichmentId} — Update an Enrichment
Response: Updated WebsetEnrichment object.
DELETE /websets/{websetId}/enrichments/{enrichmentId} — Delete an Enrichment
Response: The deleted WebsetEnrichment object.
POST /websets/{websetId}/enrichments/{enrichmentId}/cancel — Cancel a Running Enrichment
Response: The canceled WebsetEnrichment object.
Exports
POST /websets/{websetId}/exports — Schedule an Export
Generates a downloadable file of all items. Request body:
Response: An export object with id, status (pending → completed), and downloadUrl (available when completed).
GET /websets/{websetId}/exports/{exportId} — Get Export Status
Poll until status is completed, then use the downloadUrl.
Imports
Imports let you bring your own URLs (e.g. from a CSV) and run enrichments on them.
POST /imports — Create an Import
Response: An Import object with id and status.
GET /imports/{importId} — Get Import Details
Response: Import object with status and progress.
GET /imports — List All Imports
Query params: cursor, limit (same pagination pattern).
Response: { "data": [Import, ...], "hasMore": boolean, "nextCursor": string | null }
PATCH /imports/{importId} — Update an Import
DELETE /imports/{importId} — Delete an Import
Monitors
Monitors run searches on a schedule to keep websets updated.
POST /monitors — Create a Monitor
| Field | Type | Required | Description |
|---|---|---|---|
| websetId | string | Yes | The webset to attach the monitor to. |
| cadence.cron | string | Yes | Standard 5-field Unix cron expression. Triggers at most once per day. |
| cadence.timezone | string | No (default "Etc/UTC") | IANA timezone string. |
| behavior.type | string | Yes | "search" (find new items) or "refresh" (re-process existing items). |
| behavior.config.parameters | object | Yes for search | Same shape as the search object in create webset. |
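A possible way to assemble the monitor request body is sketched below. The nesting is inferred from the dotted field names in the table, and `build_monitor_body` is a hypothetical helper, not an SDK function. The cron check only counts fields. It does not verify that the schedule fires at most once per day.

```python
import json


def build_monitor_body(webset_id, cron, behavior_type, search_parameters=None, timezone="Etc/UTC"):
    """Assemble a create-monitor body from the fields in the table above."""
    if len(cron.split()) != 5:
        raise ValueError("cadence.cron must be a 5-field cron expression")
    if behavior_type not in ("search", "refresh"):
        raise ValueError('behavior.type must be "search" or "refresh"')
    behavior = {"type": behavior_type}
    if behavior_type == "search":
        # The table marks behavior.config.parameters as required for "search".
        if search_parameters is None:
            raise ValueError('behavior.config.parameters is required for "search"')
        behavior["config"] = {"parameters": search_parameters}
    return {
        "websetId": webset_id,
        "cadence": {"cron": cron, "timezone": timezone},
        "behavior": behavior,
    }


# Example: look for new items every Monday at 09:00 UTC (IDs and query are made up).
monitor_body = build_monitor_body(
    "ws_demo",
    "0 9 * * 1",
    "search",
    search_parameters={"query": "New AI startups", "count": 10},
)
print(json.dumps(monitor_body, indent=2))
```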
GET /monitors/{monitorId} — Get Monitor Details
PATCH /monitors/{monitorId} — Update a Monitor
Update cadence, behavior, or metadata.
DELETE /monitors/{monitorId} — Delete a Monitor
GET /monitors — List All Monitors
Query params: cursor, limit.
Response: { "data": [Monitor, ...], "hasMore": boolean, "nextCursor": string | null }
GET /monitors/{monitorId}/runs — List Monitor Runs
Returns the history of executions for this monitor.
GET /monitors/{monitorId}/runs/{runId} — Get a Monitor Run
Webhooks
POST /webhooks — Create a Webhook
| Field | Type | Required | Description |
|---|---|---|---|
| url | string (URL) | Yes | Endpoint to receive webhook POST requests. |
| events | array (1–12 items) | Yes | Event types to subscribe to (see Event Types below). |
| metadata | object | No | Arbitrary key-value pairs. |
Response: Webhook object. Important: The secret field is only returned on creation. Store it securely for signature verification.
GET /webhooks/{webhookId} — Get Webhook Details
PATCH /webhooks/{webhookId} — Update a Webhook
Update url, events, or metadata.
DELETE /webhooks/{webhookId} — Delete a Webhook
GET /webhooks — List All Webhooks
Query params: cursor, limit.
GET /webhooks/{webhookId}/attempts — List Delivery Attempts
Returns the history of delivery attempts for this webhook, including response status codes and bodies.
Response: { "data": [WebhookAttempt, ...], "hasMore": boolean, "nextCursor": string | null }
Webhook Signature Verification
Webhooks are signed with HMAC SHA256. The signature is sent in the `Exa-Signature` header. To verify it:
- Parse the header to extract `t` (timestamp) and `v1` (signature).
- Construct the signed payload: `{timestamp}.{raw_request_body}`.
- Compute HMAC SHA256 using the `secret` from webhook creation.
- Compare your computed signature with `v1`.
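The verification steps above can be sketched with the standard library as follows. Several details are assumptions not stated in this document: that the header is a comma-separated list such as `t=<timestamp>,v1=<signature>`, that the signature is hex-encoded, and that the secret is used as UTF-8 bytes. Check these against a real delivery before relying on the code.

```python
import hashlib
import hmac


def verify_exa_signature(header, raw_body, secret):
    """Return True if the Exa-Signature header matches the raw request body.

    Assumes header format "t=<timestamp>,v1=<hex signature>" (not confirmed here).
    """
    parts = dict(p.strip().split("=", 1) for p in header.split(",") if "=" in p)
    timestamp, expected = parts.get("t"), parts.get("v1")
    if not timestamp or not expected:
        return False
    # Signed payload is "{timestamp}.{raw_request_body}".
    signed_payload = timestamp.encode("utf-8") + b"." + raw_body
    computed = hmac.new(secret.encode("utf-8"), signed_payload, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(computed.encode("utf-8"), expected.encode("utf-8"))


# Demo with a made-up secret and body: sign locally, then verify.
demo_secret = "demo-secret"
demo_body = b'{"type": "webset.idle"}'
demo_ts = "1700000000"
demo_sig = hmac.new(demo_secret.encode(), demo_ts.encode() + b"." + demo_body, hashlib.sha256).hexdigest()
print(verify_exa_signature(f"t={demo_ts},v1={demo_sig}", demo_body, demo_secret))
```

Verify against the raw bytes of the request body, before any JSON parsing, since re-serializing the payload can change whitespace and break the signature.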
Events
Events track state changes across the system. Retained for 60 days.
GET /events — List All Events
Query params: cursor, limit.
Response: { "data": [Event, ...], "hasMore": boolean, "nextCursor": string | null }
GET /events/{eventId} — Get a Single Event
Response: An event object with id, object ("event"), type, data, and createdAt.
Teams
GET /teams/me — Get Team Info
Returns your team’s concurrency usage and limits.
Object Schemas
Webset
Status values: running, idle, paused
WebsetSearch
Status values: created, running, completed, canceled
Canceled reasons: webset_deleted, webset_canceled
Progress: found = number of items discovered so far. completion = percentage (0–100).
WebsetItem
Item Properties by Entity Type
Company (properties.type = "company"):
| Field | Type | Description |
|---|---|---|
| url | string | Company website URL |
| description | string | Short description of relevance |
| content | string? | Full text content of the company website |
| company.name | string | Company name |
| company.location | string? | Main location |
| company.employees | number? | Employee count |
| company.industry | string? | Industry |
| company.about | string? | Short description |
| company.logoUrl | string? | Logo URL |
Person (properties.type = "person"):
| Field | Type | Description |
|---|---|---|
| url | string | Profile URL |
| description | string | Short description of relevance |
| person.name | string | Full name |
| person.location | string? | Location |
| person.position | string? | Current work position |
| person.pictureUrl | string? | Profile image URL |
Article (properties.type = "article"):
| Field | Type | Description |
|---|---|---|
| url | string | Article URL |
| description | string | Short description of relevance |
| content | string? | Full text content |
| article.author | string? | Author(s) |
| article.publishedAt | string? | Publication date |
Research paper (properties.type = "research_paper"):
| Field | Type | Description |
|---|---|---|
| url | string | Paper URL |
| description | string | Short description of relevance |
| content | string? | Full text content |
| researchPaper.author | string? | Author(s) |
| researchPaper.publishedAt | string? | Publication date |
Custom (properties.type = "custom"):
| Field | Type | Description |
|---|---|---|
| url | string | Item URL |
| description | string | Short description |
| content | string? | Full text content |
| custom.author | string? | Author(s) |
| custom.publishedAt | string? | Publication date |
WebsetEnrichment
Status values: pending, completed, canceled
Format values: text, number, date, url, email, phone, options
When format is options, the options array contains objects with a label field (max 20 options).
EnrichmentResult (on each item)
result is an array of strings (numbers and dates are stringified), or null if the enrichment couldn’t find the data.
Evaluation (on each item)
satisfied values: yes, no, unclear
Webhook
Status values: active, inactive
secret is only returned on creation. Store it immediately for signature verification.
WebhookAttempt
Event Types
| Event | When | Data |
|---|---|---|
| webset.created | Webset is created | Webset |
| webset.deleted | Webset is deleted | Webset |
| webset.paused | Webset is paused | Webset |
| webset.idle | All operations complete | Webset |
| webset.search.created | A search starts | WebsetSearch |
| webset.search.updated | Search progress updates | WebsetSearch |
| webset.search.completed | A search finishes | WebsetSearch |
| webset.search.canceled | A search is canceled | WebsetSearch |
| webset.item.created | A new item is added (passed verification) | WebsetItem |
| webset.item.enriched | An enrichment result is added to an item | WebsetItem |
| webset.export.created | An export is scheduled | Export |
| webset.export.completed | An export is ready to download | Export |
| import.created | An import starts | Import |
| import.completed | An import finishes | Import |
| monitor.created | A monitor is created | Monitor |
| monitor.updated | A monitor’s configuration is updated | Monitor |
| monitor.deleted | A monitor is deleted | Monitor |
| monitor.run.created | A monitor run starts | MonitorRun |
| monitor.run.completed | A monitor run finishes | MonitorRun |
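One way to consume these events is a small dispatch table keyed by event type, sketched below. It assumes each delivered payload is an event object with `type` and `data` fields, as described for `GET /events/{eventId}`. The helper name `dispatch_event` and the sample IDs and timestamp are made up for illustration.

```python
def dispatch_event(event, handlers):
    """Route an event dict to the handler registered for its type.

    Returns the handler's result, or None when no handler is registered.
    """
    handler = handlers.get(event["type"])
    return handler(event["data"]) if handler else None


seen = []
handlers = {
    "webset.item.created": lambda item: seen.append(("item", item["id"])),
    "webset.idle": lambda webset: seen.append(("idle", webset["id"])),
}

# Made-up sample events shaped like the event object (id, object, type, data, createdAt).
sample_events = [
    {"id": "evt_1", "object": "event", "type": "webset.item.created",
     "data": {"id": "item_1"}, "createdAt": "2025-01-01T00:00:00Z"},
    {"id": "evt_2", "object": "event", "type": "webset.search.updated",
     "data": {"id": "search_1"}, "createdAt": "2025-01-01T00:00:01Z"},
    {"id": "evt_3", "object": "event", "type": "webset.idle",
     "data": {"id": "ws_1"}, "createdAt": "2025-01-01T00:00:02Z"},
]
for event in sample_events:
    dispatch_event(event, handlers)
print(seen)
```

Unregistered event types are ignored, so a handler can subscribe to a broad set of events and act on only a few.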
Pagination
All list endpoints use cursor-based pagination: pass nextCursor as the cursor query parameter in the next request. Continue until hasMore is false.
Python SDK: Use page.has_more and page.next_cursor (snake_case attributes). JavaScript SDK: Use page.hasMore and page.nextCursor.
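As a rough sketch of the cursor loop against the raw JSON responses, the generator below walks every page of any list endpoint. It assumes a caller-supplied `fetch_page(cursor, limit)` function (hypothetical) that performs the HTTP call and returns the decoded response as a dict. The demo uses an in-memory stand-in instead of the network.

```python
def iter_all(fetch_page, limit=100):
    """Yield every element of a paginated list endpoint, page by page."""
    cursor = None
    while True:
        page = fetch_page(cursor=cursor, limit=limit)
        yield from page["data"]
        if not page["hasMore"]:
            break
        # Feed nextCursor back as the cursor of the next request.
        cursor = page["nextCursor"]


# Stand-in for the API: two pages keyed by the cursor that requests them.
_pages = {
    None: {"data": ["a", "b"], "hasMore": True, "nextCursor": "c1"},
    "c1": {"data": ["c"], "hasMore": False, "nextCursor": None},
}


def fake_fetch_page(cursor=None, limit=100):
    return _pages[cursor]


print(list(iter_all(fake_fetch_page)))
```

Because it is a generator, callers can stop early without fetching the remaining pages.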
Patterns and Best Practices
- Websets are async. After creating, poll with GET or use webhooks. Don’t expect results in the create response.
- Use `wait_until_idle` in SDKs to block until processing completes. Default timeout is 3600s (1 hour), poll interval 5s.
- Multiple searches can run on one webset. Use `POST /websets/{id}/searches` to add more. Searches run sequentially with each other but in parallel with enrichments.
- Items are available immediately. You can list items while the webset is still `running`.
- Enrichment format controls output type. Use `text`, `number`, `date`, `url`, `email`, `phone`, or `options`.
- `options` format requires an `options` array with 1–20 items, each having a `label` string.
- Monitor cron triggers at most once per day. This is a system constraint.
- Use `expand=items` for convenience. `GET /websets/{id}?expand=items` returns the webset and its latest 100 items in one call.
- Use `externalId` for idempotency. Set `externalId` on creation to prevent duplicate websets. Returns 409 if the ID already exists. You can then use `externalId` in place of `id` for all subsequent API calls.
- Webhook secrets are shown once. The `secret` field is only returned in the create webhook response. Store it immediately.
- Enrichment results are arrays. Even for single values, `result` is always `["value"]` or `null` if not found.
- Criteria `successRate` on search responses shows what percentage (0–100) of evaluated items matched that criterion.
- Entity type auto-detection works well. Only specify `entity` when you need fine control. For non-standard entities, use `{"type": "custom", "description": "Your entity type"}`.
- Item data is nested under `properties`. Access `item.properties.url`, `item.properties.company.name`, etc., not `item.url`. Enrichment results are at `item.enrichments[].result` (always a `list[str]` or `null`).
- Enrichment results have `enrichmentId`, not `description`. To get the human-readable description, build a map from `webset.enrichments`: `{e.id: e.description for e in webset.enrichments}`, then look up `enr.enrichment_id`.
- Initial search is on the webset object. After `create()`, the search is at `webset.searches[0]`; no separate list call is needed. Poll progress via `searches.get(webset_id, search_id)`.
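The last few points about nested properties and enrichment IDs can be illustrated on raw JSON dicts, which use the camelCase field names rather than the Python SDK's snake_case attributes. This is a sketch only: `label_enrichments` is a hypothetical helper, and the sample webset and item are made-up data shaped after the schemas in this document.

```python
def label_enrichments(webset, item):
    """Map each enrichment result on an item to its human-readable description.

    Falls back to the raw enrichmentId when the webset has no matching enrichment.
    """
    descriptions = {e["id"]: e["description"] for e in webset["enrichments"]}
    labeled = {}
    for enr in item.get("enrichments") or []:
        key = descriptions.get(enr["enrichmentId"], enr["enrichmentId"])
        labeled[key] = enr.get("result")  # list of strings, or None if not found
    return labeled


# Made-up sample data.
sample_webset = {"enrichments": [{"id": "enr_1", "description": "Find the CEO name"}]}
sample_item = {
    "properties": {
        "type": "company",
        "url": "https://example.com",
        "company": {"name": "Example Co"},
    },
    "enrichments": [
        {"enrichmentId": "enr_1", "result": ["Jane Doe"]},
        {"enrichmentId": "enr_2", "result": None},
    ],
}

# Item data lives under "properties", not at the top level.
print(sample_item["properties"]["company"]["name"])
print(label_enrichments(sample_webset, sample_item))
```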

