GROQ query to find duplicate data
GROQ query that finds duplicate data in a given field.
By Geoff Ball
GROQ query in Vision or Sanity CLI
// GROQ query
*[_type == $type] {
"duplicates": *[_type == $type && @[$findDups] == ^[$findDups] && _id != ^._id && !(_id in path('drafts.**'))] {_id, $findDups: @[$findDups]}
}[count(duplicates) > 0]
// Params
{
"type": "post",
"findDups": "title"
}This GROQ query takes advantage of Equijoins to search all documents of a particular type and return the _id and duplicated field so that action can be taken. One might use this query in Vision to see duplicates at a glance (or make modifications in the Studio), or from the Sanity CLI (or a program like Insomnia or Postman) where mutations could be made directly.
It is recommended to use a newer version of the GROQ API (e.g., v2021-03-25) due to speed enhancements made since v1. In the event that the query times out when used with a large dataset, you may want to add a slice to the end of the query (e.g., [0..50]).
This query depends on two params:
: The document type to search. Change
"type": "post"post to your desired document type.: The field to search for duplicates. Change
"findDups": "title"title to the name (in the schema) of the field in which to search for duplicates.
The user can quickly change the document type (type) and field to scan for duplicates (findDups).
The query comes in three parts:
*[_type == $type]: Filter for all documents that have a_typeof$type(pulled in fromparams).{ "duplicates": *[_type == $type && @[$findDups] == ^[$findDups] && _id != ^._id && !(_id in path('drafts.**'))] {_id, $findDups: @[$findDups]} }: Run another query that returns documents of that same_typethat have duplicated data in the field named infindDupsbut whose_ids don't match. From that, return only_idand the field named inparamsasfindDups(for brevity) and assign it all to the keyduplicates.[count(duplicates) > 0]: From all the data assigned toduplicates, return only those that actually have duplicates. There's no need to see the rest.
Contributor

Geoff Ball
Support Engineer at Sanity.io
Canada