Bot/Crawler Traffic Causing Excessive Sanity API Requests Despite Caching

18 replies
Last updated: Nov 30, 2025
I've already received great, personal assistance from Sanity regarding my bandwidth/request concerns, but has anyone in the community dealt with what appear to be unprompted requests (or rather, requests not prompted by humans) running up the tally?
I am seeing a few hundred visitors generate gigabytes of bandwidth and tens of thousands of requests despite using the API CDN and caching the rendered content server-side through PHP. Today, with fewer visits across all pages and with more aggressive caching, the same queries produced more requests than on days with heavier traffic.

To be clear: Sanity is running fine and honoring every request; it isn't a Sanity issue -- but more requests are being made of it than I can account for, given that the code is just looping through $client->fetch queries.
AI Update

I feel you on this one - it's frustrating when request counts don't match what you'd expect from actual human traffic. Based on your description and what I know about Sanity's API CDN, here are some likely culprits:

Common causes of unexpected requests

1. The token bypass issue

This is probably the biggest one: if you're passing an authorization token (even a read token) to your Sanity client, you're completely bypassing the API CDN. Every single request hits the origin API directly, no matter what your caching setup looks like.

In your PHP client config, make sure you're doing:

use Sanity\Client as SanityClient;

$client = new SanityClient([
    'projectId' => 'your-project',
    'dataset' => 'production',
    'useCdn' => true,  // This alone isn't enough!
    // Don't pass a 'token' for public reads -- any token forces
    // every request past the CDN to the origin API.
    'apiVersion' => '2025-02-06'
]);

The API CDN provides unlimited rate for cached content, but any request with an authorization header bypasses it entirely and counts against your plan limits.

2. GROQ queries with dynamic functions

If your queries use functions like now(), dateTime(), or reference _updatedAt in ways that change constantly, the API CDN can't cache them effectively. Each "unique" query (even with the same structure but different timestamps) counts as a separate request.
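One workaround is to move the timestamp out of the query text and into a parameter that you round down, so repeated requests stay byte-identical and cacheable. A sketch (the post schema and field names are hypothetical; `$client` is the CDN-enabled client from above):

```php
<?php
// Instead of now() inside the query (never effectively cacheable),
// pass a cutoff rounded down to the hour. Every request in the same
// hour then sends an identical query + params, which the CDN can cache.
$cutoff = gmdate('Y-m-d\TH:00:00\Z');

$posts = $client->fetch(
    '*[_type == "post" && publishedAt <= $cutoff] | order(publishedAt desc)',
    ['cutoff' => $cutoff]
);
```

The trade-off is that scheduled content appears up to an hour late; round to the minute if you need tighter freshness, at the cost of more distinct cached queries.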

3. Bot and crawler traffic

Even with server-side caching, if bots are hitting different pages/URLs on your site, each unique URL might trigger its own set of Sanity queries. SEO crawlers, monitoring services, and malicious bots can generate surprising request volumes. They might be:

  • Ignoring your robots.txt
  • Crawling pages faster than your cache can keep up
  • Hitting URLs with different query parameters that bypass your cache
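The last point is often fixable at the cache-key level: normalize the URL before your page cache looks it up, so tracking-parameter variants map to one entry instead of each one falling through to fresh Sanity queries. A minimal sketch (the whitelist is an assumption -- adjust it to the parameters your pages actually render differently on):

```php
<?php
// Whitelist of query parameters that actually affect rendered content.
// Everything else (utm_*, fbclid, random bot noise) is dropped so all
// variants of a URL share one cache entry.
function normalizedCacheKey(string $path, array $query): string
{
    $allowed = ['page', 'category'];  // hypothetical whitelist
    $kept = array_intersect_key($query, array_flip($allowed));
    ksort($kept);                     // stable parameter ordering
    return $path . '?' . http_build_query($kept);
}

// e.g. /blog?utm_source=x&page=2 and /blog?page=2&fbclid=y
// both normalize to "/blog?page=2".
```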

4. Draft/preview mode leakage

If you have any preview functionality that queries draft content, those requests will bypass the CDN entirely, since draft content can't be cached. Check whether you accidentally have 'useCdn' => false in some client instances.
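One way to keep that from happening silently is to construct every client through a single factory, so only explicit preview mode ever gets a token or turns the CDN off. A sketch, assuming the sanity-io/sanity-php client and a hypothetical SANITY_PREVIEW_TOKEN environment variable:

```php
<?php
use Sanity\Client as SanityClient;

// One factory for every client instance makes it hard for a stray
// 'useCdn' => false to hide in a corner of the codebase: only explicit
// preview mode gets a token and bypasses the CDN.
function makeSanityClient(bool $preview = false): SanityClient
{
    $config = [
        'projectId'  => 'your-project',
        'dataset'    => 'production',
        'useCdn'     => !$preview,
        'apiVersion' => '2025-02-06',
    ];
    if ($preview) {
        // Hypothetical env var holding a read token for drafts.
        $config['token'] = getenv('SANITY_PREVIEW_TOKEN');
    }
    return new SanityClient($config);
}
```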

Debugging strategies

Since Sanity's already helped you investigate, here are some additional things to check:

  • Check your server logs: Cross-reference your PHP application logs with Sanity request timing - are there background jobs or cron tasks making requests?
  • Monitor for retry logic: If you have any error handling that retries failed requests, a temporary issue could cascade into thousands of retries
  • Verify CDN headers: Test your queries directly and check the response headers to confirm CDN hits vs misses
  • Look for loops: Sometimes a foreach over query results accidentally triggers nested queries, multiplying requests exponentially
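For the last two points, a thin counting wrapper around the client makes both visible: it tallies every fetch per page render, so a loop that multiplies queries shows up immediately. A sketch (wire it in wherever you currently construct the client):

```php
<?php
// Thin wrapper around the Sanity client that tallies every fetch, so a
// single page render can report exactly how many queries it triggered
// and which ones repeat.
class CountingClient
{
    private array $counts = [];

    public function __construct(private \Sanity\Client $client) {}

    public function fetch(string $query, array $params = [])
    {
        $key = sha1($query . '|' . json_encode($params));
        $this->counts[$key] = ($this->counts[$key] ?? 0) + 1;
        return $this->client->fetch($query, $params);
    }

    public function report(): void
    {
        // A page that should run ~3 queries but logs 300 points
        // straight at a loop multiplying fetches.
        error_log('Sanity fetches this request: ' . array_sum($this->counts)
            . ' (' . count($this->counts) . ' distinct)');
    }
}
```

Call report() at the end of the request (e.g. from a shutdown function) and compare the tally against the number of queries the page should logically need.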

The aggressive caching paradox

Your observation about more aggressive caching leading to more requests is particularly interesting. This could indicate:

  • Cache invalidation is happening too frequently, causing cache misses
  • Your caching layer is actually making more "check if fresh" requests
  • There's a race condition where multiple processes are checking cache status simultaneously
  • Your cache key generation might not be stable, causing the same query to be treated as different

The fact that this is happening with fewer actual visits but more requests strongly suggests either bot traffic or some kind of automated process that's not directly tied to page views.
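The last two bullets above -- races and unstable keys -- can be addressed together in the caching layer: derive the key deterministically from the query plus sorted parameters, and let only one process refresh an expired entry while the rest serve the stale copy. A file-based sketch (swap in your real cache layer's own locking if it has one):

```php
<?php
// Stable cache key: sort parameters by name before hashing so the same
// logical query always maps to one entry, regardless of the order the
// calling code happened to build its params array in.
function cacheKey(string $query, array $params): string
{
    ksort($params);
    return 'sanity_' . sha1($query . '|' . json_encode($params));
}

// Crude stampede guard: only one process refreshes an expired entry;
// concurrent requests serve the stale copy instead of all hitting
// Sanity at once.
function fetchCached($client, string $query, array $params, int $ttl = 300)
{
    $file = sys_get_temp_dir() . '/' . cacheKey($query, $params);

    if (is_file($file) && time() - filemtime($file) < $ttl) {
        return unserialize(file_get_contents($file));
    }

    $lock = fopen($file . '.lock', 'c');
    if (!flock($lock, LOCK_EX | LOCK_NB)) {
        if (is_file($file)) {
            fclose($lock);
            return unserialize(file_get_contents($file)); // serve stale
        }
        flock($lock, LOCK_EX); // nothing cached yet: wait our turn
    }

    // Re-check: the previous lock holder may have just refreshed it.
    if (!is_file($file) || time() - filemtime($file) >= $ttl) {
        file_put_contents($file, serialize($client->fetch($query, $params)));
    }
    flock($lock, LOCK_UN);
    fclose($lock);
    return unserialize(file_get_contents($file));
}
```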

Quick wins to try

  1. Remove any tokens from public-facing queries - This is the #1 CDN bypass culprit
  2. Add rate limiting at your server level - Even if it's not a Sanity issue, you probably want to throttle suspicious traffic
  3. Check for any 'useCdn' => false configurations - Search your codebase for this
  4. Monitor a single query type - Pick one specific query and track just that to isolate the pattern
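For point 2, even something very simple in front of your page rendering cuts off the worst bot bursts. A per-IP throttle sketch using APCu (an assumption -- this needs the APCu extension with a TTL-capable apcu_inc, i.e. APCu ≥ 5.1.12; use Redis or your reverse proxy's rate limiting for multi-server setups):

```php
<?php
// Minimal per-IP throttle. Call before doing any rendering or
// Sanity fetches.
function throttle(string $ip, int $limit = 60, int $window = 60): void
{
    // One counter per IP per time window; APCu expires it automatically.
    $key  = 'rl_' . $ip . '_' . intdiv(time(), $window);
    $hits = apcu_inc($key, 1, $success, $window);

    if ($hits !== false && $hits > $limit) {
        http_response_code(429);
        header('Retry-After: ' . (string) $window);
        exit;
    }
}

throttle($_SERVER['REMOTE_ADDR'] ?? '0.0.0.0');
```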

Hope this helps narrow it down! If you share more about your PHP caching implementation or can identify which specific queries are generating the most requests, the community might spot something specific to that setup.

