Learn how to set up Sanity to deliver detailed logs for API requests, enabling insights into content interaction and bandwidth usage.
Sanity can be set up to deliver detailed logs for all API requests related to a project. This allows you to make informed decisions about how content is requested and interacted with in the Content Lake.
You can use these logs to get insights into what’s driving requests and bandwidth usage, where requests come from, and more.
Sanity’s Content Lake supports marking your requests with tags as a simple but powerful way of adding some context to your API activities. Visit the request tags reference article to learn more about this feature.
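As a quick illustration of how tags end up in your logs, a tag can be attached to a query request by adding a `tag` parameter to the request URL. The sketch below builds such a URL in Python; the project ID, dataset, GROQ query, and tag values are placeholders for illustration:

```python
from urllib.parse import urlencode

def tagged_query_url(project_id, dataset, groq, tag, api_version="v2021-10-21"):
    """Build a Content Lake query URL with a request tag attached.

    The tag shows up in the request logs (attributes.sanity.tags), making it
    easier to attribute traffic to a specific feature or page."""
    base = f"https://{project_id}.apicdn.sanity.io/{api_version}/data/query/{dataset}"
    return f"{base}?{urlencode({'query': groq, 'tag': tag})}"

# Hypothetical values for illustration:
url = tagged_query_url("zp7mbokg", "production", "*[_type == 'post'][0..9]", "website.blog-list")
```

Official client libraries also accept a tag option on queries, so you rarely need to build URLs by hand; this sketch only shows where the tag lives in the request.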
You can access request logs on self-serve plans by going to the Usage section of your project settings. At the bottom of this page, you'll find a button to download up to 1GB of log data from the last 7 days up to the day before you download the data. You can request a new export every 24 hours.
The request log export comes as a compressed NDJSON file. You can analyze it with various tools, such as the GROQ CLI or jq, or even convert it to CSV using a package like json2csv:
gunzip --stdout [compressed logfile].ndjson.gz | npx json2csv --ndjson --output [output].csv
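If you prefer a scripted approach, a few lines of Python can answer common questions directly from the NDJSON export. The sketch below sums response sizes per URL using the `body.url` and `body.responseSize` fields described in the request log data reference; it is a minimal example, not a complete analysis tool:

```python
import gzip
import json
from collections import Counter

def top_urls_by_bandwidth(path, n=10):
    """Sum body.responseSize per body.url across an NDJSON log export.

    Accepts either a plain .ndjson file or a gzip-compressed .gz file."""
    totals = Counter()
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines in the export
            entry = json.loads(line)
            body = entry.get("body") or {}
            url = body.get("url")
            size = body.get("responseSize") or 0
            if url:
                totals[url] += size
    return totals.most_common(n)
```

Running `top_urls_by_bandwidth("request-log.ndjson.gz")` returns a list of `(url, total_bytes)` pairs, which is often enough to spot a handful of assets dominating bandwidth.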
Exploring tools like Jupyter Notebook can also be helpful for more extensive analysis.
Another helpful tip is to upload a sample of your log files to AI tools like ChatGPT and ask them to analyze it or provide you with methods for doing so in Python or other programming languages and frameworks.
Visit the request log data reference to learn how the logs are structured and formatted.
For projects on enterprise plans, logs are delivered as compressed NDJSON files to your Google Cloud Storage (GCS) bucket, which then serves as a staging area for ingesting the reports into a data analysis tool of your choice.
This feature is available on certain Enterprise plans. Talk to sales to learn more.
Visit the request log data reference to learn how the logs are structured and formatted.
You can always extract, that is, download, the raw request log file on demand for ad hoc analysis. However, you can save time and make insights more broadly accessible to your team if you load logs into a data lake, such as BigQuery, and set up pre-defined queries for common reports.
The entire process – from enabling the log delivery service to querying your data for insights – requires a few separate steps to set up and follows the Extract, Load, and Transform (ELT) pattern for data integration. You will find an example implementation below, detailing how to implement Sanity request logs with Google Cloud Storage and BigQuery.
In this step, you will enable log delivery in your Sanity project and connect your GCS bucket in the Sanity project management settings. The setup described in this step is the only part of this guide that is required to use the request log feature, while the subsequent steps are provided as an example implementation.
Prerequisites: node, npm, and the gcloud suite of tools. While this guide demonstrates how to achieve the necessary setup in GCP using the command line, the same result can be achieved using the GCP web interface.

Ensure that the Google Cloud CLI (gcloud) is configured to the correct project where you want to store your request logs.
# Replace [PROJECT_ID] with your actual Google Cloud project ID
gcloud config set project [PROJECT_ID]
Create a new GCS bucket to which your log files will be delivered.
# Replace [BUCKET_NAME] with your actual bucket name
gcloud storage buckets create gs://[BUCKET_NAME]
For Sanity to deliver files to your GCS bucket, you must give our service account (serviceAccount:delivery@sanity-log-delivery.iam.gserviceaccount.com) the storage.objectCreator role:
# Replace [BUCKET_NAME] with your actual bucket name
gcloud storage buckets add-iam-policy-binding gs://[BUCKET_NAME] --member=serviceAccount:delivery@sanity-log-delivery.iam.gserviceaccount.com --role=roles/storage.objectCreator
Log Delivery is disabled by default. It must be enabled in the Sanity project settings by a project administrator.
Once the pipeline for log delivery has been configured, it’s time to hook up your preferred data analysis tool. The process will vary somewhat from tool to tool. The following section will show you how to accomplish this task using BigQuery from Google.
You can set up a direct connection between BigQuery and GCS buckets using External Tables. Please read the documentation to understand the costs and limitations.
Sanity has structured the bucket key in a way that allows for partitioning per project, event type, and date.
We key the object using Hive partitioning with the following format:
gs://event-logs/project_id=[string]/kind=[event-type:string]/dt=[date:DATE]/[file-name:string].ndjson.gz

This allows the log data to be loaded into various data platforms with the project ID, data type, and date used as partitioning properties.
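To make the partitioning scheme concrete, the short helper below splits one of these object keys into its Hive-style components. It assumes only the key format shown above:

```python
def parse_log_key(key):
    """Split a Hive-partitioned log object key into its parts.

    Expected shape (per the format above):
      project_id=<id>/kind=<event-type>/dt=<YYYY-MM-DD>/<file-name>.ndjson.gz
    """
    segments = key.split("/")
    parts = {"filename": segments[-1]}
    for seg in segments[:-1]:
        if "=" in seg:
            name, value = seg.split("=", 1)
            parts[name] = value
    return parts

info = parse_log_key("project_id=abc123/kind=request-log/dt=2024-01-15/logs-0001.ndjson.gz")
```

Tools with native Hive-partitioning support (such as BigQuery external tables, configured below) do this parsing for you and expose `project_id`, `kind`, and `dt` as queryable columns.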
Prerequisites: node, npm, and the gcloud command line tools installed.

Create a JSON file locally named schema.json with the nested schema definition for your log data.
If you’re working on a Sanity Studio project, we recommend placing this schema file in its folder (for example, /log-delivery/schema.json) to avoid confusion with the Studio schema for your content model.
{
"sourceFormat": "NEWLINE_DELIMITED_JSON",
"schema": {
"fields": [
{ "name": "timestamp", "type": "TIMESTAMP", "mode": "REQUIRED" },
{ "name": "traceId", "type": "STRING", "mode": "REQUIRED" },
{ "name": "spanId", "type": "STRING", "mode": "REQUIRED" },
{ "name": "severityText", "type": "STRING", "mode": "NULLABLE" },
{ "name": "severityNumber", "type": "INT64", "mode": "REQUIRED" },
{
"name": "body",
"type": "RECORD",
"mode": "REQUIRED",
"fields": [
{ "name": "duration", "type": "DECIMAL", "mode": "NULLABLE" },
{ "name": "insertId", "type": "STRING", "mode": "NULLABLE" },
{ "name": "method", "type": "STRING", "mode": "NULLABLE" },
{ "name": "referer", "type": "STRING", "mode": "NULLABLE" },
{ "name": "remoteIp", "type": "STRING", "mode": "NULLABLE" },
{ "name": "requestSize", "type": "INT64", "mode": "NULLABLE" },
{ "name": "responseSize", "type": "INT64", "mode": "NULLABLE" },
{ "name": "status", "type": "INT64", "mode": "NULLABLE" },
{ "name": "url", "type": "STRING", "mode": "NULLABLE" },
{ "name": "userAgent", "type": "STRING", "mode": "NULLABLE" }
]
},
{
"name": "attributes",
"type": "RECORD",
"mode": "NULLABLE",
"fields": [
{
"name": "sanity",
"type": "RECORD",
"mode": "NULLABLE",
"fields": [
{ "name": "projectId", "type": "STRING", "mode": "REQUIRED" },
{ "name": "dataset", "type": "STRING", "mode": "NULLABLE" },
{ "name": "domain", "type": "STRING", "mode": "NULLABLE" },
{
"name": "groqQueryIdentifier",
"type": "STRING",
"mode": "NULLABLE"
},
{ "name": "apiVersion", "type": "STRING", "mode": "NULLABLE" },
{ "name": "endpoint", "type": "STRING", "mode": "NULLABLE" },
{ "name": "tags", "type": "STRING", "mode": "REPEATED" },
{ "name": "studioRequest", "type": "BOOLEAN", "mode": "NULLABLE" }
]
}
]
},
{
"name": "resource",
"type": "RECORD",
"mode": "REQUIRED",
"fields": [
{
"name": "service",
"type": "RECORD",
"mode": "NULLABLE",
"fields": [{ "name": "name", "type": "STRING", "mode": "NULLABLE" }]
},
{
"name": "sanity",
"type": "RECORD",
"mode": "NULLABLE",
"fields": [
{ "name": "type", "type": "STRING", "mode": "NULLABLE" },
{ "name": "version", "type": "STRING", "mode": "NULLABLE" }
]
}
]
}
]
},
"compression": "GZIP",
"sourceUris": ["gs://[BUCKET_NAME]/[PREFIX]event-logs/*"],
"hivePartitioningOptions": {
"mode": "CUSTOM",
"sourceUriPrefix": "gs://[BUCKET_NAME]/[PREFIX]event-logs/{project_id:STRING}/{kind:STRING}/{dt:DATE}/"
}
}

Make sure to replace [BUCKET_NAME] and [PREFIX] with the appropriate values for your setup.
Run the following command using the bq (BigQuery) CLI tool bundled with the gcloud CLI:
# Replace [DATASET_NAME] and [TABLE_NAME] with your details
bq mk --external_table_definition=schema.json [DATASET_NAME].[TABLE_NAME]
Once the log data is loaded into the table, you can run queries against it to test if the implementation works as expected.
Example: Get data from yesterday.
/* Replace [GCP_PROJECT_NAME], [DATASET_NAME], and [TABLE_NAME] with your details */
SELECT
  *
FROM
  `[GCP_PROJECT_NAME].[DATASET_NAME].[TABLE_NAME]`
WHERE
  project_id = '[SANITY_PROJECT_ID]' AND
  kind = 'request-log' AND
  dt = DATE_ADD(CURRENT_DATE(), INTERVAL -1 DAY)
Your log data is now ready to provide answers and insights into API and CDN usage. The following section will show how to query your logs using BigQuery and SQL.
You can also use AI solutions like ChatGPT to figure out queries for specific questions by giving it the log table schema and specifying that you are working with BigQuery.
At this point, you should have log delivery enabled for your Sanity project, with logs arriving in your GCS bucket and an external BigQuery table connected to that bucket.
You will also need the appropriate user privileges to query BigQuery in the Google Cloud Platform.
Caution: BigQuery can get expensive when querying large datasets as they have a pay-per-usage model by default. Before running queries on this platform, understand the BigQuery pricing model and how your query will impact cost.
Sanity projects are metered on bandwidth usage. A large part of bandwidth usage can come from image and video downloads. Use this BigQuery query to understand which asset is using the most bandwidth.
/* Replace [PROJECT], [DATASET], and [TABLE_NAME] with your details */
SELECT
  body.url,
  SUM(body.responseSize) / 1000 / 1000 AS responseMBs
FROM
  `[PROJECT].[DATASET].[TABLE_NAME]`
WHERE
  attributes.sanity.domain = 'cdn'
  AND timestamp > TIMESTAMP_ADD(CURRENT_TIMESTAMP(), INTERVAL -1 DAY)
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10;
You can use this information to search your Sanity dataset for the documents using this asset and then optimize to reduce bandwidth.
/* Replace [PROJECT], [DATASET], and [TABLE_NAME] with your details */
SELECT
  DATE(timestamp) AS date,
  body.method,
  attributes.sanity.groqQueryIdentifier AS groq_query_identifier,
  COUNT(*) AS times_called,
  AVG(body.duration) / 1000 AS average_response_time_seconds
FROM
  `[PROJECT].[DATASET].[TABLE_NAME]`
WHERE
  body.duration IS NOT NULL
  AND attributes.sanity.groqQueryIdentifier IS NOT NULL
  AND attributes.sanity.groqQueryIdentifier != ""
  AND body.method = "GET"
  AND attributes.sanity.endpoint = "query"
GROUP BY
  1, 2, 3
ORDER BY
  1 DESC, 5 DESC, 4 DESC
Note that a GROQ query identifier cannot be created when the query is sent in a POST body.
/* Replace [PROJECT], [DATASET], and [TABLE_NAME] with your details */
WITH ErrorCount AS (
SELECT
DATE(timestamp) AS date,
COUNTIF(body.status >= 500) AS server_error_count,
COUNTIF(body.status >= 400 AND body.status < 500) AS user_error_count,
COUNT(*) AS total_requests
FROM
`[PROJECT].[DATASET].[TABLE_NAME]`
WHERE
body.status IS NOT NULL
GROUP BY
date
)
SELECT
date,
server_error_count,
user_error_count,
total_requests,
ROUND((server_error_count + user_error_count) / total_requests * 100, 2) AS error_percentage
FROM
ErrorCount
ORDER BY
  date;

/* Replace [PROJECT], [DATASET], and [TABLE_NAME] with your details */
SELECT
  attributes.sanity.dataset AS dataset_name,
  COUNT(DISTINCT attributes.sanity.groqQueryIdentifier) AS unique_get_queries,
  COUNT(*) AS total_requests,
  SUM(body.responseSize) AS total_response_size
FROM
  `[PROJECT].[DATASET].[TABLE_NAME]`
WHERE
  attributes.sanity.dataset IS NOT NULL
GROUP BY
  dataset_name
ORDER BY
  total_requests DESC;
Customers retain full control of their data and the security of their systems, and the solution has multiple levels of security. You may need to add Sanity's DIRECTORY_CUSTOMER_ID as an allowed gcloud organization; Sanity's customer ID can be found in the project management area during the setup process.