Is there a way to query documents that aren’t referenced by another type of document?

5 replies
Last updated: Nov 23, 2020
Is there a way to query documents that aren’t referenced by another type of document? I’m trying to delete documents that aren’t being referenced.
AI Update

Looking at your question, you want to find documents that aren't referenced by any other documents and potentially delete them. This is definitely possible with GROQ queries!

Finding Unreferenced Documents

To find documents that aren't referenced by any other document, you can use the references() function in a clever way. The references() function checks if a document contains references to a given document ID. To find unreferenced documents, you need to check if any other documents reference each one:

*[_type == "yourDocumentType" && count(*[references(^._id)]) == 0]

Here's what's happening:

  • The outer query *[_type == "yourDocumentType"] gets all documents of your target type
  • Inside the filter, ^._id refers to the current document's ID from the outer scope
  • *[references(^._id)] finds all documents that reference that ID
  • count(...) counts how many documents reference it
  • == 0 filters for documents with zero references

You can make this more comprehensive to check across all document types:

*[count(*[references(^._id)]) == 0]{
  _id,
  _type,
  title
}

Deleting Unreferenced Documents

Once you've identified unreferenced documents, you have a couple of options for deletion:

Option 1: Using the Sanity Client

You can use the Sanity Client to programmatically delete documents:

import {createClient} from '@sanity/client'

const client = createClient({
  projectId: 'your-project-id',
  dataset: 'your-dataset',
  token: 'your-token-with-write-access',
  apiVersion: '2024-01-01',
  useCdn: false
})

// First, fetch unreferenced documents
const unreferencedDocs = await client.fetch(
  `*[_type == "yourType" && count(*[references(^._id)]) == 0]._id`
)

// Then delete them
for (const docId of unreferencedDocs) {
  await client.delete(docId)
  console.log(`Deleted ${docId}`)
}

Option 2: Using a Migration Script

For larger operations, you can create a migration script that runs via the Sanity CLI. This is useful for more complex deletion logic or when you need to process many documents.

Important Considerations

Performance: The count(*[references(^._id)]) pattern can be slow on large datasets since it performs a subquery for each document. Consider running this during off-peak hours or on a smaller subset first.

Safety First: Always test on a development dataset first! You might want to export your data before running bulk deletions:

sanity dataset export production backup.tar.gz

Draft vs Published: Remember that Sanity stores drafts with the prefix drafts.. You might want to check both published and draft documents, or filter them appropriately:

// Only check published documents (exclude drafts)
*[
  !(_id in path("drafts.**")) && 
  count(*[references(^._id)]) == 0
]{_id, _type}

Cross-dataset references: If you're using cross-dataset references, note that the references() function only checks within the current dataset, not across datasets.

Alternative approach for better performance: If you're dealing with a very large dataset, you might want to build a list of all referenced document IDs first, then find documents not in that list:

// Get all referenced IDs
const referencedIds = await client.fetch(
  `array::unique(*[]._ref)`
)

// Find documents not in that list
const unreferenced = await client.fetch(
  `*[!(_id in $referencedIds)]`,
  { referencedIds }
)

Hope this helps you clean up those orphaned documents! Let me know if you need clarification on any part of this approach.

Show original thread
5 replies
The closest I’ve got to working out which documents don’t have a reference is this
Ideally I’d want a list of
_ids
of the documents that need removing
Found the solution from an old script I think
user Y
had sent me to remove assets that were not being used. No idea how this query works or what it is doing but it brings me back the IDs I want to delete.
ah that's handy thanks, a quick explication how it works
1. fetch all documents of type offer

*[_type in ["offer"]]
2. include _id and and refs (refs containing count of references)

{
_id,
"refs": count(*[ references(^._id) ])
}
3. filter results to only show data where refs count == 0 and then return only the id

[ refs == 0 ]._id


*[ _type in ["offer"]]{
  _id,
  "refs": count(*[references(^._id)])
}[refs == 0]._id
Thank you for taking the time to explain
user S
🙂

Sanity – Build the way you think, not the way your CMS thinks

Sanity is the developer-first content operating system that gives you complete control. Schema-as-code, GROQ queries, and real-time APIs mean no more workarounds or waiting for deployments. Free to start, scale as you grow.

Was this answer helpful?