GraphQL Query: Filtering not Working Properly

2 replies
Last updated: Feb 1, 2022
Hello, I'm looking for some insight into a GraphQL query I am building:

{
  allSeriesType(
    where: {
      _: { is_draft: false }
      isPublished: { eq: true }
      title: { matches: "master the basics" }
    }
    sort: [{ publishDate: DESC }]
  ) {
    title
  }
}
It is returning:


{
  "data": {
    "allSeriesType": [
      {
        "title": "Learn the Routine: Dance Choreography"
      },
      {
        "title": "Master the Basics: HIIT"
      },
      {
        "title": "Master the Basics: Seated Dance Cardio"
      },
      {
        "title": "Master the Basics: Yoga"
      },
      {
        "title": "Master the Basics: Seated Strength Training "
      },
      {
        "title": "Master the Basics: Dance Cardio"
      },
      {
        "title": "Master the Basics: Strength Training"
      },
      {
        "title": "Master the Basics: Barre"
      },
      {
        "title": "Master the Basics: Cardio Boxing"
      },
      {
        "title": "The Gratitude Challenge: Yoga and Meditation"
      },
      {
        "title": "Master the Basics: Seated Yoga"
      }
    ]
  }
}
I am wondering why it is returning "The Gratitude Challenge: Yoga and Meditation" and "Learn the Routine: Dance Choreography".
AI Update

The matches operator in Sanity's GraphQL API uses tokenization: it splits both your search pattern and the document titles into individual words (tokens) and compares tokens rather than whole phrases. This explains the unexpected results.

When you query with title:{matches:"master the basics"}, the pattern is tokenized into three separate words: "master", "the", and "basics". According to the GraphQL changelog, as of API version v2023-08-01: "Match patterns are now tokenized into terms, and all terms must be present for a match." The results you got, however, are what you would see if a title matched when it contains any one of the tokens, which suggests your query ran against an API version with the older behavior.

So your query returned every document whose title shares at least one word with the pattern, regardless of order or position. Looking at your unexpected results:

  • "The Gratitude Challenge: Yoga and Meditation" - contains "the", but neither "master" nor "basics"
  • "Learn the Routine: Dance Choreography" - contains "the", but neither "master" nor "basics"

The issue is that "the" is a common word. Tokenization treats each word equally, so the "The" in "The Gratitude Challenge" and the "the" in "Learn the Routine" each satisfied one of the three pattern tokens, and that alone was enough for a match.
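
To make the tokenized comparison concrete, here is a toy Python model. It is a sketch, not Sanity's implementation: the tokenize and matches helpers and the require_all flag are made up for illustration, and the tokenizer simply splits on anything that is not a letter or digit. With require_all=False (any shared token is enough) the model reproduces the results in the question; with require_all=True (every pattern token must be present) only the "Master the Basics" title matches.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase the text and split it into word tokens (letters and digits only)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def matches(title: str, pattern: str, require_all: bool) -> bool:
    """Toy tokenized match (hypothetical helper, not a Sanity API)."""
    title_tokens = tokenize(title)
    pattern_tokens = tokenize(pattern)
    if not title_tokens or not pattern_tokens:
        return False
    if require_all:
        # Every pattern token must appear somewhere in the title.
        return pattern_tokens <= title_tokens
    # At least one pattern token must appear in the title.
    return bool(pattern_tokens & title_tokens)


# A few titles taken from the response in the question.
titles = [
    "Learn the Routine: Dance Choreography",
    "Master the Basics: HIIT",
    "The Gratitude Challenge: Yoga and Meditation",
]

any_hits = [t for t in titles if matches(t, "master the basics", require_all=False)]
all_hits = [t for t in titles if matches(t, "master the basics", require_all=True)]

print(any_hits)  # all three titles, because each one contains "the"
print(all_hits)  # only "Master the Basics: HIIT"
```

Under the any-token model, dropping "the" from the pattern is enough to exclude the two unexpected titles, since neither contains "master" or "basics".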

Solutions:

  1. Remove common words from your search pattern:

    title:{matches:"master basics"}

    This removes "the" from the pattern, so titles whose only overlap with the pattern is the word "the" no longer match.

  2. Use GROQ instead of GraphQL for more flexible string matching. GROQ's match operator also tokenizes, but it supports wildcards, and you can require each word explicitly by combining conditions:

    *[_type == "seriesType" && !(_id in path("drafts.**")) &&
      isPublished == true && title match "master" && title match "basics"]

  3. Add a slug or normalized field specifically for filtering. Create a field without common words like "the", "a", "an" for more precise searches.

  4. Narrow the results by adding further constraints to the where clause of your GraphQL query, alongside the title filter:

    where: {
      title: { matches: "master basics" }
      # Add other filters to narrow results
    }
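
For option 3, the normalized field could be derived from the title along these lines. This is a minimal sketch: the STOP_WORDS list and the normalize_for_search helper are hypothetical names chosen for illustration, and where the value gets stored (for example a separate string field on the document) is up to your schema.

```python
import re

# Hypothetical stop-word list; extend it to suit your content.
STOP_WORDS = {"a", "an", "and", "of", "the"}


def normalize_for_search(title: str) -> str:
    """Lowercase the title, split it into words, and drop the stop words."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return " ".join(w for w in words if w not in STOP_WORDS)


print(normalize_for_search("Master the Basics: Seated Dance Cardio"))
# master basics seated dance cardio
```

Running the same helper over the search string before sending the query keeps the pattern and the stored field consistent.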

The tokenization behavior is working as designed - it's meant to provide flexible text search across word boundaries rather than exact phrase matching. This is the same behavior you'll find in GROQ's match operator, which also tokenizes "content as words (i.e. split on whitespace and punctuation)."
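
If you do need exact phrase matching, one option is to post-filter the response on the client. The sketch below assumes the response shape shown in the question; words and contains_phrase are hypothetical helpers, and the check is case-insensitive and ignores punctuation.

```python
import re


def words(text: str) -> list:
    """Lowercase the text and return its words in order."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(title: str, phrase: str) -> bool:
    """True if the words of `phrase` appear consecutively, in order, in `title`."""
    t, p = words(title), words(phrase)
    if not p:
        return False
    return any(t[i:i + len(p)] == p for i in range(len(t) - len(p) + 1))


# A trimmed copy of the response from the question.
response = {
    "data": {
        "allSeriesType": [
            {"title": "Learn the Routine: Dance Choreography"},
            {"title": "Master the Basics: HIIT"},
            {"title": "The Gratitude Challenge: Yoga and Meditation"},
            {"title": "Master the Basics: Seated Yoga"},
        ]
    }
}

exact = [
    doc["title"]
    for doc in response["data"]["allSeriesType"]
    if contains_phrase(doc["title"], "master the basics")
]
print(exact)  # ['Master the Basics: HIIT', 'Master the Basics: Seated Yoga']
```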

Those don't appear to be part of the filter type.
I see what the problem might be: it's because matches looks at the individual words. I changed it to "master basics" and it worked.
