API CDN
Description of the CDN-distributed, cached version of the Sanity API.
When querying content for your frontend, choosing between Sanity's two content delivery APIs affects response speed, freshness, and rate limits.
- api.sanity.io: the uncached, live API. This is the default and always returns the freshest data, but requests are slower because every request must reach the backend. Requests are also more costly because they trigger more computation on the servers.
- apicdn.sanity.io: the CDN-distributed, cached API. This opt-in feature provides fast responses for cached requests. Use the API CDN for frontends that serve end users. For static builds, the live uncached API is a better fit to ensure you get the latest content.
To use the API CDN, use apicdn.sanity.io instead of api.sanity.io. Most clients provide a useCdn option that makes this switch seamless.
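The host swap can be sketched as a small helper, mirroring what official clients do internally when you toggle `useCdn`. The project ID here is a hypothetical example:

```javascript
// Pick the API host based on a useCdn flag. Sanity project hosts follow the
// pattern https://<projectId>.<domain>; "myproject" is an example value.
function apiHost(projectId, useCdn) {
  const domain = useCdn ? "apicdn.sanity.io" : "api.sanity.io";
  return `https://${projectId}.${domain}`;
}

console.log(apiHost("myproject", true));  // https://myproject.apicdn.sanity.io
console.log(apiHost("myproject", false)); // https://myproject.api.sanity.io
```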
Supported endpoints
The API CDN supports /<version>/data/query for GROQ queries and /<version>/graphql for GraphQL queries.
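A query URL for the `/<version>/data/query` endpoint can be assembled like this. The project ID, dataset name, API version, and GROQ query are illustrative values, not defaults:

```javascript
// Build a GROQ query URL for the /<version>/data/query endpoint.
// All concrete values below ("myproject", "production", the query) are examples.
function queryUrl(projectId, dataset, groqQuery, { useCdn = true, apiVersion = "v2021-10-21" } = {}) {
  const host = useCdn ? "apicdn.sanity.io" : "api.sanity.io";
  const params = new URLSearchParams({ query: groqQuery });
  return `https://${projectId}.${host}/${apiVersion}/data/query/${dataset}?${params}`;
}

const url = queryUrl("myproject", "production", '*[_type == "post"][0..9]');
console.log(url);
```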
Choosing the right API
Make sure to pick the right tool for your workload.
If you are going to fetch content from a browser, we recommend the API CDN so your requests can scale.
When building integrations with Sanity or responding to webhooks, we recommend using the live API (api.sanity.io) to capture the latest saved content.
Cache policy
The API CDN is primarily meant to cache query results for end users:
- GET, HEAD, and OPTIONS requests are cached.
- POST requests to /graphql and /data/query are also cached, as these endpoints are read-only.
- Maximum HTTP POST size is 300 KB.
- All other POST requests are rejected since they can contain mutations.
- Responses larger than 10 MB are not cached.
- Non-200 responses are not cached.
- Cookies are ignored when identifying cache hits.
- Authenticated requests are cached. Caching is segmented for each unique authentication token.
- Listeners, including the Live Content API, are redirected to the API and do not query cached content.
During periods of high content traffic (mutations or requests), we prioritize the cache invalidation queue to ensure consistent caching windows for customers with our High Frequency CDN.
If Sanity's Content Lake is unavailable, the API CDN will return the last cached content for up to two hours.
All official clients will automatically fall back to using the live API where appropriate.
Cache keys are based on the full request URL, including query parameters and other URL fragments. To benefit from caching, ensure that identical queries produce identical request URLs across your traffic.
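Because the cache key is the full URL, logically identical requests with parameters in a different order miss the cache. A minimal sketch of client-side URL normalization, using the standard `URLSearchParams.sort()`:

```javascript
// Normalize a request URL so logically equal queries become byte-equal,
// letting them share a single cache entry. Host and params are example values.
function normalizeUrl(rawUrl) {
  const url = new URL(rawUrl);
  url.searchParams.sort(); // stable sort of parameters by name
  return url.toString();
}

const a = normalizeUrl("https://myproject.apicdn.sanity.io/v2021-10-21/data/query/production?query=*&b=2&a=1");
const b = normalizeUrl("https://myproject.apicdn.sanity.io/v2021-10-21/data/query/production?a=1&query=*&b=2");
console.log(a === b); // true
```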
Locations
Sanity currently has CDNs for the API in these locations:
- Asia
- Mumbai, India
- Oceania
- Sydney, Australia
- Europe
- Saint-Ghislain, Belgium
- South America
- São Paulo, Brazil
- North America
- Oregon, United States
- Iowa, United States
- Northern Virginia, United States
A global CDN with short-lived caches also sits in front of these locations, with points of presence on all continents. This global CDN does not cache private datasets or POST queries. Using the API CDN is still recommended for both public and private datasets.
IP addresses in use
We maintain a unified list of all IP addresses you may need to allow if you have egress filtering enabled. See the IP addresses used by Sanity document for details.
Add live content to your application
Learn to use the Live Content API with Next.js or your own integration for real-time content updates in your app.
Technical limits
A list of data store limits.
Getting started with @sanity/client
Learn how to install and configure the official Sanity JavaScript client for querying and mutating content across different environments.
Rate limiting and concurrency
Cached responses from the API CDN are not rate limited. However, requests that result in a cache miss are forwarded to the direct API, which enforces rate limits and concurrency limits per dataset.
Concurrency limits
Concurrency limits restrict how many requests can be in-flight at the same time for a single dataset:
- Queries: 500 concurrent requests per dataset
- Mutations: 100 concurrent requests per dataset
Per-IP rate limits
The direct API also enforces per-IP rate limits:
- API calls: 500 requests per second per IP
- Mutations: 25 requests per second per IP
- Uploads: 25 requests per second per IP
When you exceed a limit, the API returns an HTTP 429 Too Many Requests response. Implement exponential backoff or wait briefly before retrying.
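A minimal retry-with-exponential-backoff sketch for 429 responses. `doRequest` stands in for any function returning a fetch-like response object; the delays (250 ms base, doubling each attempt, five attempts max) are illustrative, not official guidance:

```javascript
// Retry a request with exponential backoff while the API returns HTTP 429.
async function fetchWithBackoff(doRequest, maxAttempts = 5, baseDelayMs = 250) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await doRequest();
    if (res.status !== 429) return res;
    const delay = baseDelayMs * 2 ** attempt; // 250, 500, 1000, ...
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
  throw new Error(`Still rate limited after ${maxAttempts} attempts`);
}
```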
Best practices
- Use the API CDN for reads: set useCdn: true in your client configuration to serve cached responses and avoid hitting direct API limits.
- Limit concurrency in static builds: static site generators that fetch many pages at build time can exceed the 500 concurrent query limit. Use a concurrency limiter to cap parallel requests. The Importing data guide covers similar patterns for managing request throughput.
- Implement retry logic: when you receive a 429 response, use exponential backoff before retrying. Libraries like p-limit can help manage concurrency in your application.
- Batch mutations: combine multiple mutations into a single transaction instead of sending them individually.
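The concurrency-limiting practice above can be sketched as a small limiter, similar in spirit to p-limit. It caps how many tasks run at once so a build stays under the 500-concurrent-query dataset limit (the cap value here is arbitrary):

```javascript
// Minimal concurrency limiter: at most maxConcurrent tasks run at a time;
// the rest queue up and start as slots free.
function createLimiter(maxConcurrent) {
  let active = 0;
  const queue = [];
  const next = () => {
    if (active >= maxConcurrent || queue.length === 0) return;
    active++;
    const { task, resolve, reject } = queue.shift();
    task().then(resolve, reject).finally(() => {
      active--;
      next();
    });
  };
  return (task) =>
    new Promise((resolve, reject) => {
      queue.push({ task, resolve, reject });
      next();
    });
}
```

Usage during a build might look like `const limit = createLimiter(25); await Promise.all(pages.map((p) => limit(() => fetchPage(p))));`, where `fetchPage` is a hypothetical per-page query function.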
For the complete list of API limits, including document size, query result size, and asset limits, see Technical limits.