Agents leave receipts. We read 1.46 million of them

We analyzed what agents are actually doing with content. Here is what you can learn from teams successfully running AI content operations today.

Hollie Aghajani
Staff Product Marketing Manager
Knut Melvær
Principal Developer Marketing Manager
J.Requena
Chief Marketing Officer

Published June 15, 2026

Most "state of AI" reports we have seen run on surveys. You ask people how they use AI, they tell you what they hope is true, and you get an optimistic picture of a future that hasn't shown up yet.

We wanted firmer ground, so we did two things a survey doesn't. We measured what teams actually do, from the one place that can't round up: their AI agents' own tool calls. And we interviewed content leaders about why they do it. Quantitative reach, qualitative depth.

What we mean by "AI agent tool calls"

An AI agent (like Claude or ChatGPT) acts on its own to complete a task instead of just answering a question. To get work done, it makes tool calls: individual requests to a system to do one thing. Query a document. Edit a field. Publish a change.

Every tool call in this report is one such request hitting Sanity's MCP server: a single, logged action an agent took against real content. When we say "agent actions," we mean the same thing. One action, one tool call, one row in the data.
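
For readers who haven't seen one, a tool call on the wire is a small JSON-RPC message. The sketch below follows the `tools/call` request shape from the Model Context Protocol. The tool name `patch_document`, its arguments, and the `toLogRow` helper are hypothetical, chosen only for illustration, and are not taken from Sanity's MCP server.

```typescript
// Shape of an MCP tool call request (JSON-RPC 2.0, method "tools/call").
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number | string;
  method: "tools/call";
  params: {
    name: string;
    arguments?: Record<string, unknown>;
  };
}

// Hypothetical example: the tool name and arguments are illustrative only.
const exampleCall: ToolCallRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "patch_document",
    arguments: { documentId: "article-123", set: { title: "New headline" } },
  },
};

// One action, one tool call, one row: flatten the request into a log row.
function toLogRow(call: ToolCallRequest): {
  tool_name: string;
  tool_params: Record<string, unknown>;
} {
  return {
    tool_name: call.params.name,
    tool_params: call.params.arguments ?? {},
  };
}

const row = toLogRow(exampleCall);
```

Each request like this would count as one of the 1.46 million actions in the dataset.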

Between September 2025 and April 2026, AI agent tool calls on Sanity's MCP server grew from 7,400 a month to 521,000. Across that window we logged 1.46 million calls, from 12,300 users across 12,500 organizations. What we see has gone past pilots. It's production work, running daily, and still counting.
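
To put that growth in perspective, here is the back-of-the-envelope arithmetic. It assumes the two figures are monthly totals for September 2025 and April 2026, seven month-over-month steps apart, and it smooths over whatever the actual month-to-month curve looked like.

```typescript
// Monthly totals quoted in the text.
const callsSeptember2025 = 7400;
const callsApril2026 = 521000;

// Assumption: September to April spans seven month-over-month intervals.
const intervals = 7;

// Overall multiple between the first and last month.
const overallMultiple = callsApril2026 / callsSeptember2025;

// Constant monthly factor that would produce the same multiple (geometric mean).
const monthlyFactor = Math.pow(overallMultiple, 1 / intervals);

console.log(`Overall growth: about ${overallMultiple.toFixed(0)}x`);
console.log(
  `Implied average monthly growth: about ${((monthlyFactor - 1) * 100).toFixed(0)}%`,
);
```

That works out to roughly a 70-fold increase, or an average of about 84% growth per month compounded.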

The full report is at research.sanity.io/ai-content-ops, including a prompt that helps you get started with this in your org. This post goes into what's behind the research.

The heaviest AI user we found isn't an engineer

Eight percent of users drive 68% of the activity. That part won't surprise anyone. What stands out is that the most intense user in the entire dataset is a content marketer working through a chat window, not a developer on the command line.

The heavy content work that used to land on an engineer's plate (scripting a migration, wiring up a batch job) is now being done by the people who own the content, in plain language. AI isn't just making developers faster. It's moving the work off them entirely, to the people who know what the content is supposed to say.

What separates the power users isn't better models or more engineers. It's one person who owns the work and a few recurring workflows that compound. Often that person was never going to be an engineer in the first place.

9% of the work replaces weeks of it

Ninety-one percent of activity is daily work: querying, editing, publishing. The leverage is in the other 9%. Migration is 3% of calls, but each project replaces weeks of engineering work. Localization is 2%, but it used to mean an agency contract and weeks of back-and-forth.

None of it happens without structured content underneath. You can't ask an agent to expire a breaking-news flag or translate every product description into Japanese if the content is trapped in a page template. Structure is what makes content legible to an agent. That's why Sanity is the AI Content Operating System, not a place to store pages.

When agents tell you what they were trying to do

Even a couple of years in, there isn't much established practice for how to design and run agentic services at scale. "The agent as a user" is still a fresh field. What's interesting about agents is that they can tell us what they're trying to achieve when they make a tool call.

So we added a way to let them.

When an AI agent calls our MCP server, it can tell us why, in its own words. Every tool carries an optional intent field, and the models fill it in:

“Publish 20 imported blog posts migrated from WordPress.”
“Translating UI strings from English to Urdu (batch 6 of 6, final batch).”
“Re-noindex article with incomplete body update to prevent template content exposure.”

The intent field is defined once and added to every agent-facing tool the server exposes over the Model Context Protocol. The field's description tells the model what to write: the goal, not the operation. Nothing in the tool logic reads it. It rides along as telemetry, logged once per call at the transport layer.

// src/mcp/utils/tools.ts

import { z } from "zod";

// One optional field, defined once, merged into every agent-facing tool.

export const BaseToolParamsSchema = z.object({
  intent: z
    .string()
    .optional()
    .describe(`
      Briefly explain why you are making this tool call, 
      the high level goal, not the specific operation. 
      For example: 
      "migrating content from legacy CMS," 
      "building a product catalog for launch,"
      "cleaning up orphaned references after schema change." 
      This helps us understand usage patterns and improve the tools.
    `),
});

// src/mcp/utils/analyticsTransport.ts

// Emitted once per call, at the transport layer. No tool reads intent.

this.analytics.trackToolUsed({
  tool_name: context.toolName,
  tool_params: context.toolParams, // full arguments, including the intent string
  client_name: context.clientName, // e.g. "claude-code", from the MCP handshake
  user_id: context.userId,
  success,
});
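
One way to picture "no tool reads intent" is a wrapper that peels the field off before the tool logic runs. The sketch below is a plain TypeScript illustration under stated assumptions, not Sanity's implementation: `withIntentTelemetry`, `ToolEvent`, the in-memory `events` array, and the `publish_document` handler are all hypothetical, and the real server logs at the transport layer instead of wrapping each tool.

```typescript
type ToolParams = Record<string, unknown> & { intent?: string };

interface ToolEvent {
  tool_name: string;
  tool_params: ToolParams; // full arguments, including the intent string
  success: boolean;
}

// In-memory stand-in for an analytics sink (hypothetical).
const events: ToolEvent[] = [];

// Wrap a tool handler: log the full params once per call, and hand the
// handler a copy of the params with intent removed.
function withIntentTelemetry<R>(
  toolName: string,
  handler: (params: Record<string, unknown>) => R,
): (params: ToolParams) => R {
  return (params) => {
    const operational: ToolParams = { ...params };
    delete operational.intent;

    let success = false;
    try {
      const result = handler(operational);
      success = true;
      return result;
    } finally {
      // Runs whether the handler returned or threw.
      events.push({ tool_name: toolName, tool_params: params, success });
    }
  };
}

// Hypothetical tool: its logic only ever sees operational arguments.
const publish = withIntentTelemetry("publish_document", (params) => {
  return { published: params.documentId, sawIntent: "intent" in params };
});

const result = publish({
  documentId: "post-42",
  intent: "Publish 20 imported blog posts migrated from WordPress.",
});
```

The point of the design is separation: the intent string reaches the log, while the tool behaves identically whether or not the model filled it in.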

The telemetry gave us 1.46 million logged tool calls across the window; since March, nearly every one of them also tells us why it was called. That's the what, at scale. To understand the why, our user researcher ran in-depth interviews with 12 enterprise content leaders across 8 industries: what they were trying to do, where they stalled, what finally worked. The telemetry shows the pattern. The interviews explain it. We haven't seen the two paired like this before in the content operations space.

Two caveats, in full transparency. The data represents a specific population, teams who already wired AI into a structured content backend, so it's a leading indicator, not the whole market. And because intent is free text the model writes, the figures here are aggregate counts and the example intents are anonymized. We treat that telemetry as sensitive rather than assume it's clean.
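
To illustrate what "aggregate counts" means in practice, here is a minimal sketch that buckets free-text intents into categories and keeps only the totals. The category names and keyword rules are hypothetical and far cruder than the analysis behind the report. They are here only to show raw intent strings going in and counts coming out.

```typescript
type Category = "migration" | "localization" | "daily work";

// Hypothetical keyword rules, checked in order; the first match wins.
const rules: Array<{ category: Category; pattern: RegExp }> = [
  { category: "migration", pattern: /\b(migrat\w*|import\w*)\b/i },
  { category: "localization", pattern: /\b(translat\w*|locali[sz]\w*)\b/i },
];

function categorize(intent: string): Category {
  for (const rule of rules) {
    if (rule.pattern.test(intent)) return rule.category;
  }
  // Anything that matches no rule falls into the default bucket.
  return "daily work";
}

// Reduce raw intents to per-category totals; the strings are not retained.
function countByCategory(intents: string[]): Record<Category, number> {
  const counts: Record<Category, number> = {
    migration: 0,
    localization: 0,
    "daily work": 0,
  };
  for (const intent of intents) counts[categorize(intent)] += 1;
  return counts;
}

// The three example intents quoted earlier in this post.
const sample = [
  "Publish 20 imported blog posts migrated from WordPress.",
  "Translating UI strings from English to Urdu (batch 6 of 6, final batch).",
  "Re-noindex article with incomplete body update to prevent template content exposure.",
];
const counts = countByCategory(sample);
```

With these rules, the three sample intents land in three different buckets, one each. Only the totals would leave the pipeline.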

How we built this report

The report is also an example of the thing it describes.

Our MCP team shipped the intent field on March 2, 2026. Three weeks later we had the first version of this report. Knut pointed Claude Code at our BigQuery warehouse over an MCP connection, asked it to pull and categorize the intent data, and had a working analysis and an interactive draft in about an hour. That first hour got us a draft, no more. The longer work was human: interrogating what the data actually said, throwing out the first reads that didn't hold, design iteration, and the Figma-to-code translation we mention below. But the analysis that used to need a data vendor and a survey panel? An hour.

That hour held up because the hard parts were already done by people. The MCP team had instrumented every call. User research had run the interviews, the depth no BigQuery query returns. What AI compressed was everything after, the analysis and the build. That used to mean a data vendor, a survey panel, and a design agency working over two months. It didn't replace the instrumentation or the conversations. The whole thing rests on those.

Where we stepped in was to find the real story in the numbers and design how to present it. We also took a heavy hand translating the design from Figma to code, despite the many skills, MCPs, and attempts to bridge the gap between rich vector designs and frontend development.

But in practice, this is an AI report, about AI, built with AI, standing on what the team built.

Get the report and your 8-week plan

Read the full report here. You can also point your assistant at the machine-readable version to get started right away.

The part worth your time is the planner at the end. Six questions about your team become a prompt for your own 8-week plan, built from this data in whatever model you run. Let us know how it works for you by tagging Sanity on social or emailing knut@sanity.io.

The agent era already arrived for the teams in this data. They're over eight weeks in. Your catch-up plan is one prompt away.