How to search nested string fields in complex Portable Text with custom blocks?
Is there a way to search every nested string field with `match`, or do I have to write every possible path?
An example of my output block looks like this:
"body": [
// CLASSICAL BLOCK TYPE
// `pt::text(body)` makes it easy to check matches
{
"_type": "block",
"children": [
{
"_type": "span",
"marks": [],
"text": "nisi aliquam sequi voluptas quia ut rem esse quae qui voluptatem officia consectetur incididunt Neque et iure voluptatem ipsum ab amet ipsa occaecat ullam cupidatat ut velit cupidatat sequi nisi nostrum irure consequatur Quis aliquip commodi suscipit iste consectetur sequi velit ipsa enim dolor Neque voluptatem sit ipsam enim eiusmod ipsam doloremque aperiam"
}
],
"markDefs": [],
"style": "pd-s pm-s"
},
// DIPTYCH TYPE
{
"_type": "diptych",
"left": [
{
"_type": "asideText",
"excerpt": "Quis minim nisi ad quia irure voluptatem veniam id nulla magna fugiat quasi ut voluptatem est laboris sequi nulla ea numquam commodi magnam qui ut dolore dicta est magna adipisci numquam velit ut labore qui perspiciatis velit minim et aute quia nulla incidunt Neque Sed molestiae explicabo voluptatem",
"surtitle": "sit qui",
"title": "laborum unde Ut mollit et"
}
],
"right": [
{
"_type": "imagesCompo",
"mainImage": {
"_type": "image",
"asset": {
"_ref": "image-b2b2275f06bd2728f18eed194b5f734d244e593a-240x314-jpg",
"_type": "reference"
}
}
}
],
},
// PROCESS TYPE
{
"_type": "process",
"description": "corporis adipisci molestiae totam est ab sit error vel vel Sed odit ut mollit reprehenderit eiusmod eu dolorem voluptatem dicta explicabo exercitation nostrud cupidatat ut porro minim iste pariatur anim commodi architecto irure porro ad fugit incididunt ad",
"surtitle": "ad culpa architecto",
"title": "eum beatae Ut elit fugiat Nemo"
},
// INTRO TYPE
{
"_type": "intro",
"chapters": [
"qui non incididunt eiusmod cupidatat",
"doloremque corporis quia",
"quasi aute",
"voluptatem fugiat dolor adipisci"
],
"description": "vel aliquid nostrud labore ex eiusmod numquam molestiae mollit enim autem vel dolore voluptas velit quaerat pariatur ut adipisci nulla non sit doloremque totam in Ut ad numquam consequatur cillum Duis quae Lorem sed consequat consequatur commodi eius enim veniam ad unde incididunt exercitationem ad inventore velit nostrum fugit",
"title": "aliqua"
},
]
For example, for the `diptych` type, I need to walk through all the text children, but they can be in either the `left` or the `right` field.
So the generated GROQ request would look something like
score(
left[].title match $value,
left[].surtitle match $value,
left[].excerpt match $value,
right[].title match $value,
right[].surtitle match $value,
right[].excerpt match $value,
)
I'm also worried about the performance of running a score function with a lot of match operations.
I'm thinking about two possible solutions:
1. A function that walks through a path, something like
walk(['left', 'right'], ['title', 'surtitle', 'excerpt'])
2. A way to write custom GROQ functions under a custom namespace. That would allow me to implement a custom `ept::text(body)` for extended-portable-text and handle my fields more precisely.
Again, I may be going too far with what GROQ can do; let me know if this is relevant.
Maybe I should open a GitHub discussion, because these could be improvements to the language. What do you think?
I feel your pain! Searching through complex nested structures with lots of block types is definitely challenging. Unfortunately, GROQ doesn't currently have a built-in "walk" function that can automatically traverse arbitrary paths, but there are several strategies to make this more manageable.
Current Approaches:
- Use pt::text() for standard Portable Text blocks: As you've discovered, pt::text() is great for extracting plain text from standard Portable Text structures. This handles the traversal automatically for those classic block types:
*[_type == "post"] | score(pt::text(body) match $value)
- Flatten your searchable text at write time (Recommended): Instead of trying to walk through all possible paths at query time, consider creating a computed field that aggregates all searchable text when documents are saved. You could use a Sanity Function with a document mutation listener to automatically update a searchableText field whenever content changes:
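// In your sanity.blueprint.ts
// NOTE: illustrative shape only; check the current Sanity Functions
// documentation for the exact blueprint and handler API before using it.
export default {
  name: 'search-indexer',
  functions: [
    {
      name: 'index-searchable-text',
      trigger: 'document.publish',
      handler: async (event, context) => {
        const doc = event.document;
        // extractAllText is your own helper that collects every string in body
        const searchableText = extractAllText(doc.body);
        // Store the flattened text on the document so queries can match against it
        await context.client.patch(doc._id)
          .set({ searchableText })
          .commit();
      }
    }
  ]
}
Then your GROQ query becomes simply: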
*[_type == "yourType"] | score(searchableText match $value) | order(_score desc)
This approach with Sanity Functions is the modern, recommended way to handle this kind of automation: it runs on Sanity's infrastructure, scales automatically, and doesn't require external hosting like webhooks would.
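The "custom extraction logic" for a flattened search field could be a small recursive walker. The following is only a sketch under assumptions: the name extractAllText, the SKIPPED_KEYS list, and the rule that keys starting with an underscore hold metadata are illustrative choices based on the sample body in the question, not part of any Sanity API.

```typescript
// Keys assumed to carry styling or annotation metadata rather than searchable text.
const SKIPPED_KEYS: readonly string[] = ["style", "marks", "markDefs", "listItem"];

// Recursively collects every non-empty string found under `node`.
function collectStrings(node: unknown, out: string[]): void {
  if (typeof node === "string") {
    const trimmed = node.trim();
    if (trimmed !== "") out.push(trimmed);
    return;
  }
  if (Array.isArray(node)) {
    for (const item of node) collectStrings(item, out);
    return;
  }
  if (node !== null && typeof node === "object") {
    const record = node as Record<string, unknown>;
    for (const key of Object.keys(record)) {
      // Skip system fields (_type, _key, _ref, ...) and the metadata keys above.
      if (key.charAt(0) === "_" || SKIPPED_KEYS.indexOf(key) !== -1) continue;
      collectStrings(record[key], out);
    }
  }
}

// Hypothetical helper: flattens a Portable Text body (including custom
// blocks) into one space-separated string.
function extractAllText(body: unknown): string {
  const parts: string[] = [];
  collectStrings(body, parts);
  return parts.join(" ");
}

// Usage with a shortened version of the body from the question.
const sample = [
  {
    _type: "block",
    style: "normal",
    markDefs: [],
    children: [{ _type: "span", marks: [], text: "Hello world" }],
  },
  {
    _type: "diptych",
    left: [{ _type: "asideText", title: "Left title", surtitle: "sit qui" }],
    right: [
      {
        _type: "imagesCompo",
        mainImage: { _type: "image", asset: { _ref: "image-abc", _type: "reference" } },
      },
    ],
  },
  { _type: "intro", chapters: ["quasi aute"], title: "aliqua" },
];

console.log(extractAllText(sample));
// prints "Hello world Left title sit qui quasi aute aliqua"
```

Because the walker does not need to know the block types, new custom blocks are picked up without changes. The trade-off is that any non-text string field you add later is indexed too unless its key is added to the skip list.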
- Array flattening with the [] operator: For your specific structure, you can use GROQ's array traversal. The match operator works with tokenized text:
*[_type == "yourType"] | score(
body[_type == "diptych"].left[].title match $value,
body[_type == "diptych"].left[].surtitle match $value,
body[_type == "diptych"].left[].excerpt match $value,
body[_type == "diptych"].right[].title match $value,
body[_type == "process"].description match $value,
body[_type == "intro"].description match $value
) | order(_score desc)
Regarding Your Proposed Solutions:
Custom walk function: This doesn't exist in GROQ currently, and you're right that it would be a language-level enhancement. This would be worth proposing in the Sanity GROQ GitHub discussions.
Custom GROQ functions under namespaces: Unfortunately, GROQ doesn't support user-defined functions or custom namespaces. The namespace functions like pt::text() are built into the GROQ engine itself.
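Since neither a walk function nor custom functions are available inside GROQ, the path expansion can be done in application code that generates the query string. The sketch below assumes a hand-maintained map from block type to text paths; expandPaths, buildScoreQuery, and the searchPaths configuration are hypothetical names, and the paths mirror the sample body from the question.

```typescript
// Block type -> GROQ paths (relative to the block) that hold searchable strings.
type SearchPaths = Record<string, string[]>;

// Rough equivalent of the proposed walk(): every container combined with every field.
function expandPaths(containers: string[], fields: string[]): string[] {
  const paths: string[] = [];
  for (const container of containers) {
    for (const field of fields) {
      paths.push(`${container}[].${field}`);
    }
  }
  return paths;
}

// Assumed configuration, derived from the block types shown in the question.
const searchPaths: SearchPaths = {
  diptych: expandPaths(["left", "right"], ["title", "surtitle", "excerpt"]),
  process: ["title", "surtitle", "description"],
  intro: ["title", "description"],
};

// Builds a score() query with one match clause per path. The search term
// stays a $value parameter; only trusted configuration is interpolated.
function buildScoreQuery(docType: string, field: string, paths: SearchPaths): string {
  const clauses: string[] = [`pt::text(${field}) match $value`];
  for (const blockType of Object.keys(paths)) {
    for (const path of paths[blockType]) {
      clauses.push(`${field}[_type == "${blockType}"].${path} match $value`);
    }
  }
  return `*[_type == "${docType}"] | score(\n  ${clauses.join(",\n  ")}\n) | order(_score desc)`;
}

const query = buildScoreQuery("yourType", "body", searchPaths);
console.log(query);
```

The generated string can be passed to the client together with the value parameter, so supporting a new block type means adding one entry to the configuration rather than editing the query by hand.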
Performance Considerations:
Multiple match operations in a score() function shouldn't cause major performance issues for moderately-sized datasets, but if you're worried about scaling, the "flatten at write time" approach with Sanity Functions is definitely more performant since you're doing the heavy lifting once when content changes rather than on every search query.
My Recommendation:
Use Sanity Functions to create a preprocessing step that flattens all searchable text into a dedicated field when documents are created or updated. This gives you:
- Best query performance
- Simplified GROQ queries
- Full control over what gets indexed
- Ability to add field weighting by concatenating important fields multiple times
- No external infrastructure to maintain
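The field-weighting idea from the list above can be sketched as a small helper that repeats a field's text according to an integer weight before everything is joined into the search field. The WeightedField shape and buildWeightedText are hypothetical names, and how much repetition actually raises _score is an assumption worth verifying against your own data.

```typescript
// A piece of text plus how many times it should appear in the search field.
interface WeightedField {
  text: string;
  weight: number;
}

// Repeats each non-empty field `weight` times (at least once) and joins the result.
function buildWeightedText(fields: WeightedField[]): string {
  const parts: string[] = [];
  for (const { text, weight } of fields) {
    const trimmed = text.trim();
    if (trimmed === "") continue;
    const repeats = Math.max(1, Math.floor(weight));
    for (let i = 0; i < repeats; i++) {
      parts.push(trimmed);
    }
  }
  return parts.join(" ");
}

// Usage: the title counts three times, the description once, empty fields are dropped.
const weighted = buildWeightedText([
  { text: "aliqua", weight: 3 },
  { text: " vel aliquid nostrud ", weight: 1 },
  { text: "", weight: 2 },
]);

console.log(weighted);
// prints "aliqua aliqua aliqua vel aliquid nostrud"
```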
You should definitely open a GitHub discussion about the walk function idea though - it's a legitimate use case that others would benefit from!