GROQ Search - using Score() to walk through all Descending String Fields and make a Match

8 replies
Last updated: Dec 10, 2021
Hello, I'm building a search engine over a complex text structure (lots of blocks inside the text). Is there an easy way to walk through all descendant string fields and make a `match`, or do I have to write every possible path? An example of my output block looks like this:

"body": [
  // CLASSICAL BLOCK TYPE
  // `pt::text(body)` makes it easy to check matches
  {
    "_type": "block",
    "children": [
      {
        "_type": "span",
        "marks": [],
        "text": "nisi aliquam sequi voluptas quia ut rem esse quae qui voluptatem officia consectetur incididunt Neque et iure voluptatem ipsum ab amet ipsa occaecat ullam cupidatat ut velit cupidatat sequi nisi nostrum irure consequatur Quis aliquip commodi suscipit iste consectetur sequi velit ipsa enim dolor Neque voluptatem sit ipsam enim eiusmod ipsam doloremque aperiam"
      }
    ],
    "markDefs": [],
    "style": "pd-s pm-s"
  },

  // DIPTYCH TYPE
  {
    "_type": "diptych",
    "left": [
      {
        "_type": "asideText",
        "excerpt": "Quis minim nisi ad quia irure voluptatem veniam id nulla magna fugiat quasi ut voluptatem est laboris sequi nulla ea numquam commodi magnam qui ut dolore dicta est magna adipisci numquam velit ut labore qui perspiciatis velit minim et aute quia nulla incidunt Neque Sed molestiae explicabo voluptatem",
        "surtitle": "sit  qui",
        "title": "laborum unde Ut mollit et"
      }
    ],
    "right": [
      {
        "_type": "imagesCompo",
        "mainImage": {
          "_type": "image",
          "asset": {
            "_ref": "image-b2b2275f06bd2728f18eed194b5f734d244e593a-240x314-jpg",
            "_type": "reference"
          }
        }
      }
    ]
  },

  // PROCESS TYPE
  {
    "_type": "process",
    "description": "corporis adipisci molestiae totam est ab sit error vel vel Sed odit ut mollit reprehenderit eiusmod eu dolorem voluptatem dicta explicabo exercitation nostrud cupidatat ut porro minim iste pariatur anim commodi architecto irure porro ad fugit incididunt ad",
    "surtitle": "ad culpa architecto",
    "title": "eum beatae Ut elit fugiat Nemo"
  },

  // INTRO TYPE
  {
    "_type": "intro",
    "chapters": [
      "qui non incididunt eiusmod cupidatat",
      "doloremque corporis quia",
      "quasi aute",
      "voluptatem fugiat dolor adipisci"
    ],
    "description": "vel aliquid nostrud labore ex eiusmod numquam molestiae mollit enim autem vel dolore voluptas velit quaerat pariatur ut adipisci nulla non sit doloremque totam in Ut ad numquam consequatur cillum Duis quae Lorem sed consequat consequatur commodi eius enim veniam ad unde incididunt exercitationem ad inventore velit nostrum fugit",
    "title": "aliqua"
  }
]
I removed some fields that are not relevant here, but note that my structure also contains string fields used only for display/config.

For example, for the `diptych` type, I need to walk through all the text children, but they can be in the `left` or `right` field. So the generated GROQ request would look something like:

score(
  left[].title match $value,
  left[].surtitle match $value,
  left[].excerpt match $value,
  right[].title match $value,
  right[].surtitle match $value,
  right[].excerpt match $value,
)
It grows really fast, and I have a lot of block types with a lot of different fields.
I'm also worried about the performance of running a `score()` function with a lot of `match` expressions.

Two possible solutions that I'm thinking about:

1. A function that walks through a path, something like `walk(['left', 'right'], ['title', 'surtitle', 'excerpt'])`. According to the docs, this doesn't exist right now, and maybe it's not relevant to have something like that in GROQ (let me know if I'm going too far).

2. A way to write custom GROQ functions under a custom namespace. That would allow me to implement a custom `ept::text(body)` for `extended-portable-text` and handle my fields more precisely. Again, I'm maybe going too far with what GROQ can do; let me know if this is relevant.

Maybe I should open a GitHub discussion, because these could be improvements for the language. What do you think?
Dec 10, 2021, 7:49 AM
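Since GROQ has no `walk()` function, one workaround is to expand the paths on the client while generating the query. The sketch below is only an illustration of that idea: `buildScoreClauses` and `buildScore` are hypothetical helper names (not part of GROQ or any Sanity library), and it assumes every container field is an array whose items hold the text fields directly.

```typescript
// Hypothetical helper: expands every container x field combination
// into one "<container>[].<field> match $value" clause.
function buildScoreClauses(
  containers: string[],
  fields: string[],
  param: string = "$value",
): string[] {
  const clauses: string[] = [];
  for (const container of containers) {
    for (const field of fields) {
      clauses.push(`${container}[].${field} match ${param}`);
    }
  }
  return clauses;
}

// Hypothetical helper: wraps the clauses in a score(...) call,
// one clause per line, without a trailing comma.
function buildScore(containers: string[], fields: string[]): string {
  const clauses = buildScoreClauses(containers, fields);
  return `score(\n  ${clauses.join(",\n  ")}\n)`;
}

// Same inputs as the walk(...) call proposed in the question.
const diptychScore = buildScore(
  ["left", "right"],
  ["title", "surtitle", "excerpt"],
);
console.log(diptychScore);
```

This produces the same six `match` clauses as the handwritten `score()` call, so it only reduces the amount of GROQ written by hand; the query the server runs is unchanged.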
Hi! This is exactly how `score()` is intended to be used. I don’t know exactly how well it scales with the number of fields, but 10-20 fields should probably not be a problem. Have you tried it out yet?
In theory you could rewrite it as:

score(
  [...left[].title, ...left[].surtitle, ...left[].excerpt] match $value,
  [...right[].title, ...right[].surtitle, ...right[].excerpt] match $value
)
but we are a bit restrictive about which expressions we support inside `score()`, and this is rejected right now. It is something we could support, though. Of course, it would perform identically, since it’s just another way to write the original query.
I totally see that it becomes verbose to write out the full GROQ. We have been talking about possible ways to support reusable fragments or functions. But this stuff is still on the drawing board. You should definitely feel free to open a GitHub issue about this.
Dec 10, 2021, 8:52 AM
Thanks for the answer. I hadn't thought about the `[] match $value` syntax; this would already help.
There is just one thing I don't get in your snippet: what is the purpose of the `...` syntax on only the `title` and `excerpt` fields?
[...left[].title, left[].surtitle, ...left[].excerpt] match $value
Dec 10, 2021, 9:22 AM
That’s a typo; it's supposed to be `...left[].surtitle` there.
Dec 10, 2021, 9:29 AM
(Fixed it)
Dec 10, 2021, 9:29 AM
And about performance, which is faster?
1. Building a big mapping with nested `select()` calls and then using `score()` to match on a single field:
*[_type in ['page', 'article']]{
  _id,
  title,
  'body': select(
    _type == 'page' => content.body,
    _type == 'article' => select(
      content.articleContent[]._type == 'diptych' => content.articleContent[].left[].title
      ...
    ),
  )
}
|score(body match $value)
|order(_score desc)
[_score > 0]
{_id, title}
2. Making multiple `match` expressions:

*[_type in ['page', 'article']]
|score(content.body match $value, content.articleContent[].left[].title match $value)
|order(_score desc)
[_score > 0]
{_id, title}
Dec 10, 2021, 9:43 AM
You can’t `score()` on a projection, actually. The `{ ... }` part can only follow the `| score()` call.
Dec 10, 2021, 9:51 AM
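To make the ordering constraint concrete, here is a minimal sketch that assembles the second variant from the question with the pipeline steps in the order described above: filter, then `| score()`, then `| order()`, then the `_score` filter, and the projection last. `buildSearchQuery` is a hypothetical helper name, and the document types and paths are simply the ones used in the question.

```typescript
// Hypothetical helper: builds a search query in which the projection
// comes after the score() call, never before it.
function buildSearchQuery(types: string[], paths: string[]): string {
  const typeList = types.map((t) => `'${t}'`).join(", ");
  // One "<path> match $value" clause per searched path.
  const clauses = paths.map((p) => `${p} match $value`).join(", ");
  return [
    `*[_type in [${typeList}]]`, // 1. filter the documents
    `| score(${clauses})`,       // 2. score on the raw document fields
    `| order(_score desc)`,      // 3. best matches first
    `[_score > 0]`,              // 4. drop documents with no match
    `{_id, title}`,              // 5. project last
  ].join("\n");
}

const searchQuery = buildSearchQuery(
  ["page", "article"],
  ["content.body", "content.articleContent[].left[].title"],
);
console.log(searchQuery);
```

The resulting string still expects a `$value` parameter to be supplied when the query is executed.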
Oh yeah, I forgot that. That settles the question then 😅
Dec 10, 2021, 9:53 AM
Thanks for the answers 🙂
Dec 10, 2021, 9:53 AM
