Taxonomy Term Auto-Tag

By Andy Fitzgerald

Use the Sanity Embeddings Index to auto-tag resources from a pre-defined list of taxonomy terms managed in Sanity Studio.

schemaTypes/method.ts

defineField({
  name: 'topics',
  title: 'Topics',
  description: 'Topics discussed in this method. If left empty, topics will be automatically applied when you publish this resource.',
  type: 'array',
  of: [
    {
      type: 'reference',
      to: [{type: 'skosConcept'}],
      options: {
        embeddingsIndex: {
          indexName: 'io-taxonomy',
          maxResults: 10,
          searchMode: 'embeddings'
        }
      },
    },
  ],
}),

functions/taxonomy-term-auto-tag/index.ts

import {createClient} from "@sanity/client";
import {documentEventHandler} from "@sanity/functions";

export const handler = documentEventHandler(async ({context, event}) => {
  const client = createClient({
    ...context.clientOptions,
    apiVersion: "vX",
    useCdn: false,
  });
  const {data} = event;
  const {local} = context; // local is true when running locally

  const dataset = "production"; // your dataset
  const indexName = "io-taxonomy"; // the name of your embeddings index

  try {
    // Query the embeddings index
    const result = await client.request({
      url: `/embeddings-index/query/${dataset}/${indexName}`,
      method: "POST",
      body: {
        query: `Based on the following text segment, suggest three relevant topic tags that succinctly and clearly describe the text: ${data.sourceText}.`,
        maxResults: 3,
      },
    });
    // Convert embeddings results to tags refs
    const tags = result.map(
      ({value}: {value: {documentId: string; type: string}}) => ({
        _ref: value.documentId,
        _type: "reference",
      })
    );
    // Patch using schema-aware agent action
    await client.agent.action.patch({
      noWrite: Boolean(local), // when running locally, don't write to the document; just return the result for logging
      schemaId: "_.schemas.production", // the ID of your deployed Studio schema
      documentId: data._id,
      target: {
        path: ["topics"],
        operation: "set",
        value: tags,
      },
    });
    console.log(
      local
        ? "Referenced tags (LOCAL TEST MODE - Content Lake not updated):"
        : "Referenced tags:",
      result
    );
  } catch (error) {
    console.error("Error occurred during tag retrieval: ", error);
  }
});

sanity.blueprint.ts

import {defineBlueprint, defineDocumentFunction} from '@sanity/blueprints'

export default defineBlueprint({
  resources: [
    defineDocumentFunction({
      type: 'sanity.function.document',
      name: 'taxonomy-term-auto-tag',
      src: './functions/taxonomy-term-auto-tag',
      memory: 2,
      timeout: 30,
      event: {
        on: ['publish'],
        filter: "_type == 'method' && !defined(topics)", // specify an appropriate type in your schema
        projection: '{_id, "sourceText": pt::text(overview)}', // specify the text field you'll use to inform tag choice. In many starter templates, this is `body`.
      },
    }),
  ],
})

Embeddings index projection

// if using the Sanity Taxonomy Manager plugin, project the `prefLabel` and `definition` of your concepts into the embeddings index

{_type,
  _id,
  prefLabel,
  definition
}

Inspired by Sanity's official Auto-Tag recipe, this function auto-tags your resources from a predefined set of managed taxonomy terms. While "free-tagging" can be an effective approach for small collections, controlling your vocabulary terms as a managed taxonomy provides key functionality for generating content recommendations, query expansion, and content intelligence across composable collections.

  • This example manages taxonomy terms with the Sanity Taxonomy Manager plugin, but the plugin is not required: any set of defined terms managed in Sanity Studio will work.
  • Since we're populating a reference array (as opposed to an array of strings), we need to find and reference terms using the Embeddings Index, which is only available on paid Sanity plans.
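
To make the difference between the two array shapes concrete, here is a minimal sketch. The term strings and the document ID are hypothetical, and isTermReference is an illustrative helper rather than part of the recipe.

```typescript
// Free-tagging: plain strings, with nothing tying them to a managed term.
const freeTags: string[] = ["card sorting", "tree testing"];

// Managed taxonomy: each entry points at a term document by its ID.
interface TermReference {
  _type: "reference";
  _ref: string;
}

const managedTags: TermReference[] = [
  {_type: "reference", _ref: "concept-card-sorting"}, // hypothetical document ID
];

// Illustrative type guard: true only for reference-shaped values.
function isTermReference(value: unknown): value is TermReference {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as {_type?: unknown; _ref?: unknown};
  return candidate._type === "reference" && typeof candidate._ref === "string";
}
```

A string tag carries no document ID, so it cannot populate the reference array directly. The embeddings index query returns document IDs, which is what closes that gap.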

Getting Started

  • Initialize blueprints if you haven't already: npx sanity blueprints init
  • Create an Embeddings Index to use for tag retrieval
  • Add the function and blueprint definition
  • Deploy: npx sanity blueprints deploy

How it Works

When a content editor publishes a new "Method" document without tags, this function:

  1. Queries the embeddings index (of your taxonomy terms) to find matching concepts for the document text you've provided
  2. Converts the returned results to an array of references
  3. Patches the document's topics field with those references, using the schema-aware agent.action.patch method
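
Steps 1 and 2 can be pulled out as pure helpers, which makes them easy to test without calling the Content Lake. The following is a minimal sketch, not part of the recipe: buildTagQuery and toTagReferences are hypothetical names, the score field and minScore threshold are assumptions about the query response, and the empty-text guard and de-duplication are additions.

```typescript
// One result from the embeddings index query. The recipe only relies on
// `value.documentId`; the optional `score` field is an assumption.
interface EmbeddingsResult {
  score?: number;
  value: {documentId: string; type: string};
}

interface TagReference {
  _type: "reference";
  _ref: string;
}

// Hypothetical helper: builds the POST body for the query, or returns null
// when there is no source text to match against.
function buildTagQuery(
  sourceText: string | null | undefined,
  maxResults = 3
): {query: string; maxResults: number} | null {
  const text = (sourceText ?? "").trim();
  if (text.length === 0) return null;
  return {
    query: `Based on the following text segment, suggest three relevant topic tags that succinctly and clearly describe the text: ${text}.`,
    maxResults,
  };
}

// Hypothetical helper: converts query results to references, skipping
// duplicate document IDs and any result scored below `minScore`.
function toTagReferences(
  results: EmbeddingsResult[],
  minScore = 0
): TagReference[] {
  const seen = new Set<string>();
  const refs: TagReference[] = [];
  for (const {score, value} of results) {
    if (score !== undefined && score < minScore) continue;
    if (seen.has(value.documentId)) continue;
    seen.add(value.documentId);
    refs.push({_type: "reference", _ref: value.documentId});
  }
  return refs;
}
```

In the handler, a null return from buildTagQuery could be used to exit early instead of sending a query built from empty text. The output of toTagReferences would take the place of the inline result.map call.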

For more information on managing standards-compliant taxonomy terms in Sanity Studio, see the Taxonomy Manager plugin on Sanity Exchange.
