API CDN

The API CDN is the CDN-distributed, cached version of the Sanity API.

When querying content for your frontend, choosing between Sanity's two content delivery APIs affects response speed, freshness, and rate limits.

  • api.sanity.io: the uncached API. This is the default and always returns the freshest data, but responses are slower because every request must reach the backend. Requests are also more costly because they trigger more computation on the servers.
  • apicdn.sanity.io: the CDN-distributed, cached API. This opt-in feature provides fast responses for cached requests. Use the API CDN for frontends that serve end users. For static builds, the live uncached API is a better fit to ensure you get the latest content.

To use the API CDN, use apicdn.sanity.io instead of api.sanity.io. Most clients provide a useCdn option that makes this switch seamless.
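
As a minimal sketch of what that switch amounts to, the helper below builds a query URL for either host. The function name queryUrl, the QueryTarget type, the URL layout (project ID as a subdomain, followed by /v<version>/data/query/<dataset>), and the sample project values are assumptions for illustration; check them against your own project before relying on them.

```typescript
// Hypothetical helper, not part of any official client.
type QueryTarget = {
  projectId: string;  // assumed sample value below
  dataset: string;
  apiVersion: string; // a dated version string, e.g. "2021-10-21"
  useCdn: boolean;    // true selects the cached API CDN host
};

function queryUrl(target: QueryTarget, groq: string): string {
  // The only difference between the two APIs is the hostname.
  const host = target.useCdn ? "apicdn.sanity.io" : "api.sanity.io";
  // Assumed URL layout: https://<projectId>.<host>/v<version>/data/query/<dataset>
  const base = `https://${target.projectId}.${host}/v${target.apiVersion}/data/query/${target.dataset}`;
  return `${base}?query=${encodeURIComponent(groq)}`;
}

const exampleTarget: QueryTarget = {
  projectId: "abc123",
  dataset: "production",
  apiVersion: "2021-10-21",
  useCdn: true,
};
console.log(queryUrl(exampleTarget, '*[_type == "post"]'));
```

Flipping useCdn to false yields the same URL on api.sanity.io, which is why a single client option is enough to move between the two.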

Cache policy

The API CDN is primarily meant to cache query results for end users:

  • GET, HEAD, and OPTIONS requests are cached.
  • POST requests to /graphql and /data/query are also cached, as these endpoints are read-only.
  • Maximum HTTP POST size is 300 KB.
  • All other POST requests are rejected since they can contain mutations.
  • Responses larger than 10 MB are not cached.
  • Non-200 responses are not cached.
  • Cookies are ignored when identifying cache hits.
  • Authenticated requests are cached. Caching is segmented for each unique authentication token.
  • Listeners, including the Live Content API, are redirected to the API and do not query cached content.
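
The request and response rules above can be modeled as two small predicates. This is an illustrative sketch, not Sanity's implementation: the names classifyRequest and shouldStoreResponse are hypothetical, the byte interpretation of "300 KB" and "10 MB" is an assumption, and so is treating an oversized POST as rejected. Methods the list does not mention are reported as "unspecified" rather than guessed.

```typescript
type CdnRequest = {
  method: string;     // HTTP method, e.g. "GET"
  path: string;       // path after the API version, e.g. "/data/query/production"
  bodyBytes?: number; // request body size, relevant for POST
};

type CdnDecision = "cacheable" | "rejected" | "unspecified";

const MAX_POST_BYTES = 300 * 1024;                  // assumes 1 KB = 1024 bytes
const MAX_CACHED_RESPONSE_BYTES = 10 * 1024 * 1024; // assumes 1 MB = 1024 * 1024 bytes

function classifyRequest(req: CdnRequest): CdnDecision {
  const method = req.method.toUpperCase();
  if (method === "GET" || method === "HEAD" || method === "OPTIONS") {
    return "cacheable";
  }
  if (method === "POST") {
    // Only the read-only POST endpoints are accepted.
    const readOnly = req.path.startsWith("/graphql") || req.path.startsWith("/data/query");
    // Assumption: a POST body over the size limit is rejected.
    const withinLimit = (req.bodyBytes ?? 0) <= MAX_POST_BYTES;
    return readOnly && withinLimit ? "cacheable" : "rejected";
  }
  return "unspecified"; // the policy list does not cover other methods
}

function shouldStoreResponse(status: number, responseBytes: number): boolean {
  // Only 200 responses of at most 10 MB are kept in the cache.
  return status === 200 && responseBytes <= MAX_CACHED_RESPONSE_BYTES;
}
```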

During periods of high content traffic (mutations or requests), we prioritize the cache invalidation queue to ensure consistent caching windows for customers with our High Frequency CDN.

If Sanity's Content Lake is unavailable, the API CDN will return the last cached content for up to two hours.

All official clients will automatically fall back to using the live API where appropriate.

Locations

Sanity currently has CDNs for the API in these locations:

  • Asia
    • Mumbai, India
  • Oceania
    • Sydney, Australia
  • Europe
    • Saint-Ghislain, Belgium
  • South America
    • São Paulo, Brazil
  • North America
    • Oregon, United States
    • Iowa, United States
    • Northern Virginia, United States

A short-lived global CDN also sits in front of these locations, with points of presence on all continents. This global CDN does not cache private datasets or POST queries. Using the API CDN is still recommended for both public and private datasets.

IP addresses in use

We maintain a unified list of all IP addresses that you may need to permit if you have egress filtering enabled. See the IP addresses used by Sanity document for details.

Rate limiting and concurrency

Cached responses from the API CDN are not rate limited. However, requests that result in a cache miss are forwarded to the direct API, which enforces rate limits and concurrency limits per dataset.

Concurrency limits

Concurrency limits restrict how many requests can be in-flight at the same time for a single dataset:

  • Queries: 500 concurrent requests per dataset
  • Mutations: 100 concurrent requests per dataset
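
One way to stay under these limits is to cap how many requests your own code has in flight. The sketch below is a minimal worker-pool limiter; runLimited is a hypothetical helper written for this example, and the cap you pass in is your own choice, not a value prescribed by the API.

```typescript
// Runs the given tasks with at most `limit` of them in flight at once.
// Results are returned in the same order as the input tasks.
async function runLimited<T>(
  tasks: Array<() => Promise<T>>,
  limit: number,
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0; // index of the next task that has not started yet

  // Each worker repeatedly claims the next unstarted task until none remain.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  // Never start more workers than tasks, and always at least one.
  const workerCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}

// Hypothetical usage in a static build, where fetchPage and slugs are your own:
// const pages = await runLimited(slugs.map((slug) => () => fetchPage(slug)), 50);
```

If any task rejects, the returned promise rejects as well; tasks already in flight are not cancelled.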

Per-IP rate limits

The direct API also enforces per-IP rate limits:

  • API calls: 500 requests per second per IP
  • Mutations: 25 requests per second per IP
  • Uploads: 25 requests per second per IP

When you exceed a limit, the API returns an HTTP 429 Too Many Requests response. Implement exponential backoff or wait briefly before retrying.
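
A retry loop with exponential backoff might look like the sketch below. The helper names (backoffDelayMs, withRetryOn429), the 250 ms base delay, the 8 second cap, and the default of five retries are all assumptions chosen for illustration. The request itself is passed in as a function, so the sketch does not depend on any particular HTTP client.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 250, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

type StatusResponse = { status: number };

// Calls `send` and retries while it returns HTTP 429, up to `maxRetries` retries.
// The final response is returned even if it is still a 429.
async function withRetryOn429<R extends StatusResponse>(
  send: () => Promise<R>,
  maxRetries = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<R> {
  for (let attempt = 0; ; attempt++) {
    const response = await send();
    if (response.status !== 429 || attempt >= maxRetries) {
      return response;
    }
    await sleep(backoffDelayMs(attempt));
  }
}

// Hypothetical usage with the standard fetch API, where url is your own:
// const response = await withRetryOn429(() => fetch(url));
```

With the defaults, the waits grow as 250 ms, 500 ms, 1 s, 2 s, and 4 s. Adding random jitter to each delay is a common refinement when many clients retry at once.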

Best practices

  • Use the API CDN for reads: set useCdn: true in your client configuration to serve cached responses and avoid hitting direct API limits.
  • Limit concurrency in static builds: static site generators that fetch many pages at build time can exceed the limit of 500 concurrent queries. Use a concurrency limiter, such as the p-limit library, to cap parallel requests. The Importing data guide covers similar patterns for managing request throughput.
  • Implement retry logic: when you receive a 429 response, wait with exponential backoff before retrying.
  • Batch mutations: combine multiple mutations into a single transaction instead of sending them individually.
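
To illustrate the batching advice, the sketch below groups a list of mutations into transaction payloads so that each payload is sent as one request. The toTransactions helper, the { mutations: [...] } body shape, and the default batch size of 100 are assumptions for this example; pick a batch size that keeps each request comfortably small for your documents.

```typescript
// A single mutation, e.g. { createOrReplace: { _id: "a", _type: "post" } }.
type Mutation = Record<string, unknown>;

// Assumed request body shape for one transaction.
type TransactionBody = { mutations: Mutation[] };

// Splits `mutations` into consecutive groups of at most `maxPerTransaction`.
function toTransactions(
  mutations: Mutation[],
  maxPerTransaction = 100,
): TransactionBody[] {
  if (maxPerTransaction < 1) {
    throw new Error("maxPerTransaction must be at least 1");
  }
  const bodies: TransactionBody[] = [];
  for (let start = 0; start < mutations.length; start += maxPerTransaction) {
    bodies.push({ mutations: mutations.slice(start, start + maxPerTransaction) });
  }
  return bodies;
}
```

Sending 250 individual mutations this way takes three requests instead of 250, which keeps you further from the per-IP mutation rate limit.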
