Related Resources by Taxonomy Tag

By Andy Fitzgerald

Three simple approaches to generate ranked lists of related content using taxonomy tags managed in Sanity Studio

Related Resources Ranked by Taxonomy Tag

// GROQ Query
// Ranked related resources (in 3 projections)
*[_type in $types]
// Projection 1: find candidates that share at least one tag and count the shared tags
{
  title,
  _createdAt,
  "tagCount": length(topic[]),
  "relatedResources": *[
    _type in $types
    && array::intersects(topic[]._ref, ^.topic[]._ref)
    && _id != ^._id]
    {
      title,
      topic,
      "insightType": insightType->prefLabel,
      "sharedTags": 
        length(topic[] + ^.topic[]) 
        - count(array::unique(topic[]._ref + ^.topic[]._ref)),
    } 
    // If you only need to order by sharedTags, 
    // you can uncomment and stop here:
    // | order(sharedTags desc) [0..3]
}
// Projection 2: score relatedness as (2 * shared tags) / (candidate tags + parent tags)
{
  ...,
  "relatedResources": relatedResources[]
    {
      ...,
      "relatedness": round((sharedTags * 2) / (length(topic[]) + ^.tagCount), 2)
    } 
    // | order(relatedness desc) [0..3]
}
// Projection 3: adjust relatedness with a boost per insightType, then rank
{
  ...,
  "relatedResources": relatedResources[]
    {
      ...,
      "relatednessAdj": select(
        insightType == "Case Study" => round(relatedness + $boosts.caseStudy, 2),
        insightType == "Interview" => round(relatedness + $boosts.interview, 2),
        relatedness
      )
    } | order(relatednessAdj desc) [0..3] // inclusive range: keeps the top four
} 
| order(_createdAt desc)

// GROQ Params
{
  "types": [
    "article",
    "caseStudy"
  ],
  "boosts": {
    "caseStudy": 0.2,
    "interview": -0.1
  }
}

GROQ provides score and boost functions that can be used to sort and filter items in an array, but these do not yet support dereferencing or subqueries, both of which can be useful if you're using expressive, centrally managed taxonomies in Sanity Studio.

This GROQ query provides three increasingly nuanced approaches for sorting related resources based on taxonomy terms, each expressed in a chained projection that builds on the previous one:

  1. Tag Count: Finds the number of tags shared between a candidate related resource and its parent resource by subtracting the number of unique tags across the two resources from their combined total number of tags. The difference is the count of duplicated (that is, shared) tags, stored as sharedTags.
  2. Relatedness: Divides twice the number of shared tags (sharedTags multiplied by 2, since each shared tag appears on both resources) by the combined number of tags on the two resources. This gives the proportion of tags the two resources have in common (a number between zero and one, rounded to two decimal places).
  3. Curation: Applies a relatedness "boost" based on the value of the insightType taxonomy term. In this example, insightType captures the semantic differences between article document types that are otherwise structurally identical.
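
To make the arithmetic in these three steps concrete, here is a minimal TypeScript sketch that mirrors the query's calculations outside of GROQ. It is an illustration rather than part of the recipe: the helper names, the sample tag IDs, and the divide-by-zero guard are assumptions, and the boost values are copied from the example params. Like the query, it assumes a resource never lists the same tag twice.

```typescript
// Illustrative only: plain string arrays stand in for the `topic[]._ref` values in the query.
// Helper names (round2, sharedTags, relatedness, relatednessAdj) are hypothetical.

// Round to two decimal places, approximating GROQ's round(value, 2).
function round2(value: number): number {
  return Math.round(value * 100) / 100;
}

// Step 1: total tags across both resources minus the unique tags leaves the shared ones.
// Assumes neither array contains the same tag twice.
function sharedTags(candidate: string[], parent: string[]): number {
  const combined = [...candidate, ...parent];
  return combined.length - new Set(combined).size;
}

// Step 2: each shared tag appears on both resources, so it is counted twice
// against the combined tag total. The result falls between 0 and 1.
function relatedness(candidate: string[], parent: string[]): number {
  const totalTags = candidate.length + parent.length;
  if (totalTags === 0) return 0; // guard added for this sketch; not in the query
  return round2((sharedTags(candidate, parent) * 2) / totalTags);
}

// Step 3: nudge the score up or down based on the insightType label.
const boosts = { caseStudy: 0.2, interview: -0.1 }; // values from the example params
function relatednessAdj(score: number, insightType?: string): number {
  if (insightType === "Case Study") return round2(score + boosts.caseStudy);
  if (insightType === "Interview") return round2(score + boosts.interview);
  return score;
}

// Made-up tag IDs: two of the tags ("ux" and "ia") appear on both resources.
const parentTags = ["ux", "ia", "search"];
const candidateTags = ["ia", "ux", "taxonomy", "metadata"];

const shared = sharedTags(candidateTags, parentTags); // 7 total - 5 unique = 2
const score = relatedness(candidateTags, parentTags); // 4 / 7, rounded to 0.57
console.log(shared, score, relatednessAdj(score, "Case Study")); // 2 0.57 0.77
```

In this example a candidate sharing two of its four tags with a three-tag parent scores 0.57, and being a "Case Study" lifts it to 0.77, which may be enough to move it ahead of a slightly closer match of another type.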

By managing content structure and content semantics separately, each aspect can evolve as needed to meet changing content and business needs while minimizing the impact on the other elements of your content operations. For instance, you could add a new "video" insightType that shares the same structure as "interviews" without needing to change the structure (or related queries) of your "article" document _type.
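
One way to picture this separation is a boost table keyed by the insightType label, so that supporting a new term is a data change rather than a schema change. The sketch below is a hypothetical alternative to the select() branches in the query, not the recipe's own approach: the boostByInsightType table, the applyBoost helper, and the "Video" entry with its value are all assumptions made for illustration.

```typescript
// Hypothetical boost table keyed by the insightType prefLabel.
// "Case Study" and "Interview" mirror the example params; "Video" is invented here.
const boostByInsightType: Record<string, number> = {
  "Case Study": 0.2,
  "Interview": -0.1,
  "Video": 0.1, // assumed value for a newly added taxonomy term
};

// Add the boost for a label, if one is defined, and round to two decimal places.
// Labels without an entry (or a missing label) leave the score unchanged.
function applyBoost(score: number, insightType?: string): number {
  const boost = insightType ? (boostByInsightType[insightType] ?? 0) : 0;
  return Math.round((score + boost) * 100) / 100;
}

console.log(applyBoost(0.57, "Video")); // 0.67
console.log(applyBoost(0.57, "Opinion")); // 0.57 (no entry, so no boost)
```

Nothing about the "article" document shape appears in this sketch: the only thing that changes when a new insightType is introduced is one entry in the table.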

For more information on managing standards-compliant taxonomy terms in Sanity Studio, check out the Taxonomy Manager plugin here on Sanity Exchange.
