Excluding noindex pages from next-sitemap in a custom implementation
You're right to question this: it's a common need, and there are good solutions available.
The difficulty is that next-sitemap typically runs as a post-build script, outside the Next.js runtime where you'd normally have your Sanity client available. There are a few ways around that:
Option 1: Use Next.js App Router's Native Sitemap (Recommended)
If you're on Next.js 13.3 or later with the App Router, the simplest approach is to drop next-sitemap entirely and use the built-in app/sitemap.ts file convention. It runs server-side, where you have full access to Sanity:
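
    // app/sitemap.ts
    import type { MetadataRoute } from 'next'
    import { client } from '@/sanity/lib/client'

    type SitemapPage = { slug: string; _updatedAt: string }

    export default async function sitemap(): Promise<MetadataRoute.Sitemap> {
      // Published pages only, skipping anything flagged noindex or missing a slug
      const pages = await client.fetch<SitemapPage[]>(`
        *[_type == "page" && !(_id in path("drafts.**")) && seo.noIndex != true && defined(slug.current)] {
          "slug": slug.current,
          _updatedAt
        }
      `)

      return pages.map((page) => ({
        url: `https://yourdomain.com/${page.slug}`,
        lastModified: page._updatedAt,
      }))
    }

This approach gives you complete control and direct Sanity access, so excluding noindex pages takes a single filter in the query.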
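The mapping step is where edge cases tend to show up: a document with no slug, a base URL with a trailing slash, or a home page that should map to the site root. Here is a minimal sketch of that step as a pure function. The names `PageRow`, `SitemapEntry`, `HOME_SLUG`, and `toSitemapEntries` are hypothetical, and the idea that the home page uses the slug "home" is an assumption about your content model, not something Sanity or Next.js enforces.

```typescript
// Hypothetical shape of one row returned by the GROQ query.
interface PageRow {
  slug: string | null;
  _updatedAt: string;
}

interface SitemapEntry {
  url: string;
  lastModified: string;
}

// Assumption: the home page document uses this slug.
const HOME_SLUG = "home";

function toSitemapEntries(pages: PageRow[], baseUrl: string): SitemapEntry[] {
  // Drop trailing slashes so joining never yields "//".
  const origin = baseUrl.replace(/\/+$/, "");
  const entries: SitemapEntry[] = [];
  for (const page of pages) {
    if (!page.slug) continue; // skip documents without a slug
    // Strip leading and trailing slashes from the stored slug.
    const slug = page.slug.replace(/^\/+|\/+$/g, "");
    // An empty slug or the home slug maps to the site root.
    const path = slug === "" || slug === HOME_SLUG ? "" : `/${slug}`;
    entries.push({ url: `${origin}${path}`, lastModified: page._updatedAt });
  }
  return entries;
}
```

Inside sitemap() you would return `toSitemapEntries(pages, 'https://yourdomain.com')` in place of the inline map.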
Option 2: Fetch Data in next-sitemap Config
You can make next-sitemap work with Sanity by fetching data directly in the config file:

    // next-sitemap.config.js
    const { createClient } = require('@sanity/client')

    const client = createClient({
      projectId: 'your-project-id',
      dataset: 'production',
      useCdn: false,
      apiVersion: '2024-01-01',
    })

    module.exports = async () => {
      // Slugs of every page flagged noindex in Sanity
      const noIndexSlugs = await client.fetch(`
        *[_type == "page" && seo.noIndex == true && defined(slug.current)].slug.current
      `)
      // Exact-match lookup, so "/about" does not also exclude "/about-us"
      const noIndexPaths = new Set(noIndexSlugs.map((slug) => `/${slug}`))

      return {
        siteUrl: 'https://yourdomain.com',
        generateRobotsTxt: true,
        transform: async (config, path) => {
          // Returning null excludes the path from the sitemap
          if (noIndexPaths.has(path)) {
            return null
          }
          return {
            loc: path,
            changefreq: config.changefreq,
            priority: config.priority,
            lastmod: config.autoLastmod ? new Date().toISOString() : undefined,
          }
        },
      }
    }

Returning null from transform is the key: it tells next-sitemap to leave that URL out of the sitemap. The lookup compares whole paths rather than substrings, so a noindex slug such as about does not also remove /about-us.
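How the path is compared against the noindex slugs matters: a substring test also matches longer paths that merely contain the slug. The following is a minimal sketch of whole-path matching, assuming pages live at /slug. The names `normalizePath` and `createNoIndexMatcher` are hypothetical helpers, not part of next-sitemap.

```typescript
// Normalize a path or slug: one leading slash, no trailing slash.
function normalizePath(value: string): string {
  const trimmed = value.replace(/^\/+|\/+$/g, "");
  return `/${trimmed}`;
}

// Build a predicate that reports whether a sitemap path is marked noindex.
function createNoIndexMatcher(
  slugs: Array<string | null>
): (path: string) => boolean {
  const blocked: Record<string, true> = {};
  for (const slug of slugs) {
    if (slug) blocked[normalizePath(slug)] = true; // ignore missing slugs
  }
  return (path) => blocked[normalizePath(path)] === true;
}

const isNoIndex = createNoIndexMatcher(["about", "legal/terms", null]);

console.log(isNoIndex("/about"));           // true
console.log(isNoIndex("/about-us"));        // false
console.log("/about-us".includes("about")); // true: the substring test over-matches
```

In the config, the transform callback would return null whenever `isNoIndex(path)` is true.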
Option 3: Use additionalPaths with Sanity Data
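Another next-sitemap approach is to supply the indexable paths yourself. Note that additionalPaths adds entries to whatever next-sitemap discovers on its own rather than replacing them, so it is most useful for routes it cannot find, such as pages rendered on demand: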

    // next-sitemap.config.js (reuses the client created in Option 2)
    module.exports = async () => {
      const indexablePages = await client.fetch(`
        *[_type == "page" && seo.noIndex != true] {
          "slug": slug.current,
          _updatedAt
        }
      `)

      return {
        siteUrl: 'https://yourdomain.com',
        additionalPaths: async (config) =>
          indexablePages.map((page) => ({
            loc: `/${page.slug}`,
            lastmod: page._updatedAt,
          })),
      }
    }

Bottom line: the native Next.js sitemap (Option 1) is cleaner and easier to maintain if you're on the App Router. If you're staying with next-sitemap, Options 2 and 3 both rely on the config file being an ordinary Node.js module, so it can create a Sanity client and run queries. If your next-sitemap version doesn't accept a function as the config export, run the same query inside transform or additionalPaths, which are both async. The person who told you it's tricky may not have realized the config can do async work.
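If the Sanity query ends up inside a callback that next-sitemap calls once per URL, such as transform, it is worth making sure the fetch happens only once per build. Here is a minimal sketch of one way to do that. The names `once`, `fetchNoIndexSlugs`, and `getNoIndexSlugs` are hypothetical, and the stand-in loader takes the place of a real client.fetch call.

```typescript
// Wrap an async loader so it runs at most once; later calls reuse the same promise.
function once<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) cached = load();
    return cached;
  };
}

// Stand-in for client.fetch(...); counts how often it is invoked.
let fetchCalls = 0;
const fetchNoIndexSlugs = async (): Promise<string[]> => {
  fetchCalls += 1;
  return ["private-page"];
};

// Every caller shares one in-flight (or already resolved) request.
const getNoIndexSlugs = once(fetchNoIndexSlugs);
```

Each transform call can then `await getNoIndexSlugs()` without triggering another request. A rejected promise stays cached too, which is usually acceptable for a one-shot build script.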