Experimental feature

Embeddings Index HTTP API reference

Reference documentation of the Sanity Embeddings Index HTTP API to create and manage embeddings indexes for your content.

Using this feature requires Sanity to send data to OpenAI and Pinecone to store vector interpretations of documents.

Gotcha

Embeddings Index API is currently in beta. Features and behavior may change without notice.

Embeddings Index API is available to users on the Team plan and above.

Embeddings Index API functionality is available through the Embeddings Index CLI, the Embeddings Index UI for Sanity Studio, and the Embeddings Index HTTP API.

The Sanity Embeddings Index HTTP API endpoints expose functionality to create, delete, fetch, and query embeddings indexes in a Sanity project.

Base URL and HTTP API version

  • REQUIRED: https://<projectId>.api.sanity.io (Base URL)

    In the base URL, replace the <projectId> placeholder with the ID of the project that contains the database you want to create an embeddings index for.

    For more information about retrieving a project ID, see URL Format.

  • REQUIRED: /vX (HTTP API version)

    The Embeddings Index HTTP API version that exposes the endpoints to perform CRUD operations on embeddings indexes.
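
As a rough illustration of how the base URL, the version segment, and the endpoint paths used in the request examples on this page fit together, here is a minimal Python sketch. The helper name index_url is hypothetical, and "vX" is kept as the placeholder this reference uses rather than a real version string.

```python
from typing import Optional
from urllib.parse import quote

# "vX" is the version placeholder used throughout this reference;
# substitute the actual API version before sending requests.
API_VERSION = "vX"


def index_url(project_id: str, dataset: str,
              index_name: Optional[str] = None, query: bool = False) -> str:
    """Build an Embeddings Index endpoint URL (hypothetical helper)."""
    base = f"https://{project_id}.api.sanity.io/{API_VERSION}/embeddings-index"
    if query:
        if index_name is None:
            raise ValueError("a query URL needs an index name")
        base += "/query"
    # Percent-encode path segments so unusual names cannot break the URL.
    url = f"{base}/{quote(dataset, safe='')}"
    if index_name is not None:
        url += f"/{quote(index_name, safe='')}"
    return url


print(index_url("ab2cdefg", "my-movies"))
print(index_url("ab2cdefg", "my-movies", "embeddings-index-movies", query=True))
```

The same function covers the dataset-level endpoints (create, list), the index-level endpoints (get, delete), and the query endpoint.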

HTTP headers

  • REQUIRED: Authorization (Bearer token)

    With each request, pass a bearer token in the Authorization HTTP header.

    For more information about minting a valid bearer token, see the Bearer token section below.

  • REQUIRED: Content-Type (application/json)

    With each request, pass the application/json MIME content type to notify the server that the request payload, if included, is JSON.

    If the request includes a body, the format must be valid JSON.

  • REQUIRED: Accept (application/json)

    With each request, pass the application/json MIME content type to notify the server that the client can understand a response body in JSON format.
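
The three required headers can be assembled once and reused for every request. The following is a minimal Python sketch; request_headers is a hypothetical helper name, not part of any Sanity client library.

```python
def request_headers(token: str) -> dict:
    """Return the three headers this reference marks as required."""
    token = token.strip()
    if not token:
        raise ValueError("a bearer token is required")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "Accept": "application/json",
    }


print(sorted(request_headers("<bearer-token>")))
```

In practice the token would typically be read from an environment variable or a secret store rather than written into source code.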

Bearer token

To consume the HTTP API endpoints, you need to include a bearer token with each request.

The bearer token needs the following permissions:

  • Create dataset
  • Create token
  • Create webhook

To mint a bearer token:

  1. Log in to manage.sanity.io.
  2. Select an organization, if applicable, and then a project.
  3. Go to API, and then from the left-side navigation, select Tokens.
  4. In the Token section, click + Add API token.
  5. In the Name field, give the token a descriptive name.
    Example: Embeddings index for movie database
  6. Under Permissions, select Developer, and then click Save.
  7. Copy the generated token, and store it securely.
    Don't share or expose it as plain text.

Pass this token as the bearer token with each request.

Create an embeddings index

Create a new embeddings index for an existing database in a Sanity project.

Request body

The request body must be valid JSON with the following schema:

{
  indexName: string
  filter: string 
  projection: string 
}
  • REQUIRED: indexName (string)

    The name of the embeddings index.
    It must be unique per dataset.

  • REQUIRED: filter (string)

    Specify the filtering criteria so that the index includes only the selected subset of documents from the database.

    The filter must be a valid GROQ filter, without the square brackets that normally enclose a filter expression.

    Example:

    "_type=='director'"

  • REQUIRED: projection (string)

    Specify the projection criteria to include in the index only the selected subset of properties from the filtered documents.

    The projection must be a valid GROQ projection, including curly brackets.

    Example:

    "{title, director}"

Request example

The example uses cURL.

curl --request POST 'https://ab2cdefg.api.sanity.io/vX/embeddings-index/my-movies' \
     --header 'Authorization: Bearer <bearer-token>' \
     --header 'Content-Type: application/json' \
     --header 'Accept: application/json' \
     --data '{
               "indexName": "embeddings-index-movies",
               "filter": "_type=='\''movie'\''",
               "projection": "{...}"
            }'
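
For comparison, here is a minimal Python sketch of the same create request using only the standard library. It builds the request object without sending it. The function name build_create_request is hypothetical, and "vX" is the version placeholder from this reference. Serializing the body with json.dumps sidesteps shell quoting problems when the GROQ filter itself contains quotes.

```python
import json
import urllib.request


def build_create_request(project_id: str, dataset: str, token: str,
                         index_name: str, groq_filter: str,
                         projection: str) -> urllib.request.Request:
    """Prepare (but do not send) a create-index request."""
    url = f"https://{project_id}.api.sanity.io/vX/embeddings-index/{dataset}"
    body = {
        "indexName": index_name,
        "filter": groq_filter,      # GROQ filter without square brackets
        "projection": projection,   # GROQ projection with curly brackets
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


req = build_create_request("ab2cdefg", "my-movies", "<bearer-token>",
                           "embeddings-index-movies", "_type=='movie'", "{...}")
print(req.get_method(), req.full_url)
```

Passing the prepared request to urllib.request.urlopen would send it; that step is left out here so the sketch runs without network access or a valid token.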

Response example

# Status code on successful creation
200 OK
{
    "status": "ok",
    "message": "Index created. Documents enqueued for indexing.",
    "index": {
        "status": "pending",
        "indexName": "embeddings-index-movies",
        "projectId": "ab2cdefg",
        "dataset": "my-movies",
        "projection": "{...}",
        "filter": "_type=='movie'",
        "createdAt": "2023-09-15T12:27:55Z",
        "updatedAt": "2023-09-15T12:27:55Z",
        "failedDocumentCount": 0,
        "startDocumentCount": 19,
        "remainingDocumentCount": 19,
        "webhookId": "abcdefghI7jKKl1P"
    }
}
  • When the index is created, all published document IDs that match the filter are enqueued for indexing.
  • The initial status of a newly created index is pending. It changes to indexing while documents are being processed, and stays that way as long as any documents remain to be indexed. After all matching documents are indexed, the index status changes to active. An embeddings index with an active status can only receive updates via webhook.
  • startDocumentCount records the number of documents that match the index filter upon index creation.
  • Draft documents aren't indexed.
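
The status and counter fields can be combined into a simple progress readout. The sketch below is a minimal Python illustration that assumes remainingDocumentCount counts down from startDocumentCount during the initial indexing run, which matches the response examples on this page but is not stated explicitly. The function names are hypothetical.

```python
def indexing_progress(index: dict) -> float:
    """Fraction of the initially matched documents already processed (0.0 to 1.0)."""
    start = index["startDocumentCount"]
    remaining = index["remainingDocumentCount"]
    if start <= 0:
        return 1.0  # nothing matched the filter, so nothing is left to index
    return (start - remaining) / start


def is_ready(index: dict) -> bool:
    """An index is ready once its status is active."""
    return index["status"] == "active"


# Values taken from the create response example above.
new_index = {"status": "pending", "startDocumentCount": 19, "remainingDocumentCount": 19}
print(indexing_progress(new_index), is_ready(new_index))
```

A client could fetch the index periodically and stop waiting once is_ready returns True. How failedDocumentCount relates to the remaining count is not documented here, so the sketch does not use it.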

Error handling

The request can return 40x HTTP status codes in the following cases:

  • The bearer token isn't valid (auth error).
    • Make sure you minted a Developer token (see Bearer token above).
  • The GROQ filter and/or the GROQ projection aren't valid.
    • Review the input GROQ queries to make sure they are valid.
  • You're trying to create an embeddings index with the same name as an existing embeddings index.
    • Make sure the embeddings index names are unique for each database in a Sanity project.
  • You're trying to update the filter or the projection of an existing embeddings index.
    • Currently, it's not possible to update existing embeddings indexes.
      To update an index, delete the current one, and then create a new index with the updated filter or projection values.
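
Because an index cannot be updated in place, changing a filter or projection amounts to a delete followed by a create. The minimal Python sketch below only plans the two requests and sends nothing. recreate_plan is a hypothetical helper, and "vX" is the version placeholder from this reference.

```python
import json


def recreate_plan(project_id: str, dataset: str, index_name: str,
                  groq_filter: str, projection: str) -> list:
    """Return the (method, url, body) steps needed to replace an index."""
    base = f"https://{project_id}.api.sanity.io/vX/embeddings-index/{dataset}"
    new_body = json.dumps({
        "indexName": index_name,
        "filter": groq_filter,
        "projection": projection,
    })
    return [
        ("DELETE", f"{base}/{index_name}", None),  # step 1: remove the old index
        ("POST", base, new_body),                  # step 2: create it with new settings
    ]


for method, url, body in recreate_plan("ab2cdefg", "my-movies",
                                       "embeddings-index-movies",
                                       "_type=='movie'", "{title, director}"):
    print(method, url)
```

The order matters: creating first would fail because index names must be unique, so the delete has to succeed before the create is attempted.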

Get all embeddings indexes

Retrieve a list of all existing embeddings indexes for an existing database in a Sanity project.

Request body

The request has no body.

Request example

The example uses cURL.

curl --request GET 'https://ab2cdefg.api.sanity.io/vX/embeddings-index/my-movies' \
     --header 'Authorization: Bearer <bearer-token>' \
     --header 'Content-Type: application/json' \
     --header 'Accept: application/json' \
     --data ''

Response example

An array of JSON objects. Each object represents an embeddings index related to the specified database.

If the request retrieves no embeddings indexes, the response returns an empty array: []

# Status code on success
200 OK
[
    {
        "status": "active",
        "indexName": "embeddings-index-movies",
        "projectId": "ab2cdefg",
        "dataset": "my-movies",
        "projection": "{...}",
        "filter": "_type=='movie'",
        "createdAt": "2023-09-15T12:27:55Z",
        "updatedAt": "2023-09-15T12:27:55Z",
        "failedDocumentCount": 0,
        "startDocumentCount": 19,
        "remainingDocumentCount": 0,
        "webhookId": "abcdefghI7jKKl1P"
    }
]
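
Since the response is a plain JSON array, it is straightforward to post-process. The following minimal Python sketch groups index names by status. names_by_status is a hypothetical helper, and the sample text reuses fields from the response example above.

```python
import json
from collections import defaultdict


def names_by_status(response_text: str) -> dict:
    """Group index names by their status field."""
    grouped = defaultdict(list)
    for index in json.loads(response_text):
        grouped[index["status"]].append(index["indexName"])
    return dict(grouped)


sample = '[{"status": "active", "indexName": "embeddings-index-movies"}]'
print(names_by_status(sample))
print(names_by_status("[]"))  # an empty array yields an empty dict
```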

Error handling

The request can return 40x HTTP status codes in the following cases:

  • The target dataset doesn't exist.

Get an embeddings index

Retrieve a specific embeddings index for an existing database in a Sanity project.

Request body

The request has no body.

Request example

The example uses cURL.

curl --request GET 'https://ab2cdefg.api.sanity.io/vX/embeddings-index/my-movies/embeddings-index-movies' \
     --header 'Authorization: Bearer <bearer-token>' \
     --header 'Content-Type: application/json' \
     --header 'Accept: application/json' \
     --data ''

Response example

A JSON object representing the specified embeddings index.

# Status code on success
200 OK
{
	"status": "active",
	"indexName": "embeddings-index-movies",
	"projectId": "ab2cdefg",
	"dataset": "my-movies",
	"projection": "{...}",
	"filter": "_type=='movie'",
	"createdAt": "2023-09-15T12:27:55Z",
	"updatedAt": "2023-09-15T12:28:21Z",
	"failedDocumentCount": 0,
	"startDocumentCount": 19,
	"remainingDocumentCount": 0,
	"webhookId": "abcdefghI7jKKl1P"
}

Error handling

The request can return 40x HTTP status codes in the following cases:

  • The target dataset doesn't exist.
  • The requested embeddings index doesn't exist.

Query an embeddings index

Query an embeddings index to retrieve documents that are closely related to the input query in the request.

It returns an array of matching document IDs with their relevance scores, based on the input query in the request.

Request body

The request body must be valid JSON with the following schema:

{
    query: string
    maxResults?: number
    filter?: {
        type?: string | string[]
    }
}
  • REQUIREDquerystring

    The text string used to query the embeddings index.

    Example:

    "sci-fi adventure with cowboys and aliens"

  • maxResults (number)

    Maximum number of results to return for each request.

    Default: 10

  • filter (object)

    Optional filter to restrict the results to specific document types.
    Set its type property to a comma-separated list of the document types to include in the results.

    Example:

    { "type": "summary,synopsis,userReview" }

Request example

The example uses cURL.

curl --request POST 'https://ab2cdefg.api.sanity.io/vX/embeddings-index/query/my-movies/embeddings-index-movies' \
     --header 'Authorization: Bearer <bearer-token>' \
     --header 'Content-Type: application/json' \
     --header 'Accept: application/json' \
     --data '{  
                "query": "sci-fi adventure with cowboys and aliens",
                "maxResults": 10,
                "filter": {
                  "type": "summary,synopsis,userReview"
                }
             }'

Response example

An array of JSON objects. Each object represents a matching document.

# Status code on success
200 OK
[
    {
        "score": 1,
        "value": {
            "documentId": "85803a29-fe3f-44ba-9c0f-9a40239fd735",
            "type": "synopsis"
        }
    }
]
  • score can be between 0 and 1.
    The higher the score, the more relevant the matching document.
  • documentId is the UUID of the matching document.
    For more information about document IDs, see IDs and Paths.
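
To show how a client might consume this response, here is a minimal Python sketch that ranks results by score and extracts the document IDs. top_document_ids and its min_score threshold are hypothetical. The second sample entry is made up for illustration.

```python
def top_document_ids(results: list, min_score: float = 0.0) -> list:
    """Return document IDs ordered from most to least relevant."""
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    return [r["value"]["documentId"] for r in ranked if r["score"] >= min_score]


results = [
    # Illustrative entry, not from a real response.
    {"score": 0.42, "value": {"documentId": "example-id", "type": "summary"}},
    # Entry from the response example above.
    {"score": 1, "value": {"documentId": "85803a29-fe3f-44ba-9c0f-9a40239fd735",
                           "type": "synopsis"}},
]
print(top_document_ids(results, min_score=0.5))
```

The returned IDs can then be used to fetch the full documents from the dataset.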

Error handling

The request can return 40x HTTP status codes in the following cases:

  • The target dataset doesn't exist.
  • The queried embeddings index doesn't exist.
  • The query returns no relevant matches at all.

Delete an embeddings index

Delete a specific embeddings index for an existing database in a Sanity project.

Request body

The request has no body.

Request example

The example uses cURL.

curl --request DELETE 'https://ab2cdefg.api.sanity.io/vX/embeddings-index/my-movies/embeddings-index-movies' \
     --header 'Authorization: Bearer <bearer-token>' \
     --header 'Content-Type: application/json' \
     --header 'Accept: application/json' \
     --data ''

Response example

A JSON object representing the embeddings index at the time of its deletion.

# Status code on successful deletion
200 OK
{
	"status": "active",
	"indexName": "embeddings-index-movies",
	"projectId": "ab2cdefg",
	"dataset": "my-movies",
	"projection": "{...}",
	"filter": "_type=='movie'",
	"createdAt": "2023-09-15T12:27:55Z",
	"updatedAt": "2023-09-15T12:28:21Z",
	"failedDocumentCount": 0,
	"startDocumentCount": 19,
	"remainingDocumentCount": 0,
	"webhookId": "abcdefghI7jKKl1P"
}

Error handling

The request can return 40x HTTP status codes in the following cases:

  • The target dataset doesn't exist.
  • The requested embeddings index doesn't exist.

Known limitations

Creating an embeddings index for very large datasets can be slow. The Embeddings Index HTTP API rate limit depends on the OpenAI rate limit, which sets a cap for the HTTP API at about 8,000 tokens per minute.
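
For a rough sense of scale, the quoted cap can be turned into a back-of-the-envelope estimate. The Python sketch below is only arithmetic under stated assumptions: the average token count per document is a value you would have to measure yourself, and the function name is hypothetical.

```python
import math

# Approximate cap quoted above; treat it as a rough figure, not a guarantee.
TOKENS_PER_MINUTE = 8000


def estimated_indexing_minutes(document_count: int,
                               avg_tokens_per_document: int,
                               tokens_per_minute: int = TOKENS_PER_MINUTE) -> int:
    """Lower-bound estimate of minutes needed to index a dataset."""
    if document_count < 0 or avg_tokens_per_document < 0 or tokens_per_minute <= 0:
        raise ValueError("counts must be non-negative and the rate positive")
    total_tokens = document_count * avg_tokens_per_document
    return math.ceil(total_tokens / tokens_per_minute)


# 10,000 documents at an assumed 400 tokens each.
print(estimated_indexing_minutes(10_000, 400))
```

The result ignores retries, queueing, and any other overhead, so actual indexing times may be longer.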

Further reading

Set up an embeddings index and learn more about what embeddings can do for your content
