
Cross Dataset References

All you need to know about creating references across projects and datasets.

Getting started

Enterprise Feature

Cross-dataset references are an enterprise feature. Use our contact form to start a conversation with our sales team about enabling this feature for your project.

A fundamental requirement for enabling a content-driven workflow is having access to the proper tools to help you compartmentalize and then connect your content. A way of composing sets of fields to create documents, and of connecting documents to create relationships. Boxes and arrows, if you will.

The premier tool for connecting content in Sanity is the reference schema type, for creating binding relationships between content types. The reference type only allows references within a single dataset. This covers most use cases, but sometimes more complex architectures and content needs present a legitimate case for a way of communicating across datasets and projects.

You might wish to keep separate datasets for your product database and your marketing content to make sure your editors can focus on the appropriate content for their department, but still have the need to reference your products on your landing site. Or perhaps you need to keep several localized versions of your content, each in their dedicated dataset, but still be able to connect content across locales.

For these scenarios, there is the crossDatasetReference schema type! With it comes the ability to make references between documents in different datasets, and indeed, different projects, accounts, and organizations.

While closely related to the reference type, the crossDatasetReference type has some unique capabilities and some different limitations that you should be aware of.

The anatomy of a cross-dataset reference

A cross-dataset reference is, as its name suggests, a reference in one dataset to a document in another dataset in the same or another project. For this to be possible, a few requirements must be met.

For the remainder of this article, we’ll use the term source dataset when we talk about the dataset that is doing the referring, and target dataset when we talk about the dataset that is being referred to.

  • Both datasets must belong to projects that are on an enterprise plan.
  • If using Sanity Studio, the source and target studios must both be updated to version 2.27.3 or later.
  • The source dataset must have stored a token with reading permission to the target dataset.
  • If the studios used to interact with the respective datasets are hosted on different domains, an entry must be made for the URL of the studio for your source dataset in the list of allowed CORS origins for the studio for your target dataset.
  • The project ID and dataset name of the target dataset must be known at the time of creating the reference field in the source dataset.
  • Similarly, the type of document you wish to refer to in the target dataset, and one or more of its fields must be known in order to set up search and preview in the source dataset.

Cross-project tokens

Before we delve into the crossDatasetReference schema, let’s take a brief moment to look at cross-project tokens.

Protip

Tokens are normally not needed when connecting content within the same project!

To search for documents across projects, you need to configure a cross-project token. A cross-project token is a regular API token with read permission that is stored in the source dataset and grants access to view documents in the target dataset.

The most convenient way to handle tokens in your source dataset is by installing the Cross Project Tokens plugin. The easiest way to create and manage tokens in your target dataset is by navigating to the API tab in sanity.io/manage.

For a step-by-step guide on how to create and manage cross-project tokens, refer to the Shared Content Quickstart article.

Exploring the crossDatasetReference schema

To read all the nitty-gritty details about the crossDatasetReference schema type, visit the schema type reference documentation.

The crossDatasetReference type is, as mentioned, closely related to the reference type. It supports most of the same properties and options, in addition to some specific ones. Let’s have a look at a minimal example of a crossDatasetReference schema, and then go a bit further once we’ve established the basics.

{
  title: 'Reference to a document in another dataset',
  name: 'myCoolReferenceAcrossDatasets',
  type: 'crossDatasetReference',
  dataset: 'name-of-other-dataset',
  projectId: 'xyzabc',
  to: [
    {
      type: 'article',
      __experimental_search: [{ path: ['title'] }],
      preview: {
        select: {
          title: 'title',
        },
      },
    },
  ],
}

  • All fields in the above example, except the title, are required.
  • The type must be set to crossDatasetReference.
  • The dataset and projectId must have the appropriate values.
  • The to field accepts an array of entries to different document types in the target dataset. You may define as many types here as you please, but each crossDatasetReference field is limited to connecting to a single external dataset.
  • Because the entire schema of all document types in the target dataset is not known to the source dataset, the following is true for each entry in the to array:
    • In addition to type, each entry must specify one or more fields to use when searching for content in the target dataset. This is done using the experimental search API which you can read more about here.
    • For much the same reason, you must define a preview for the document type. To learn more about previews and list views, please refer to this article.
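
The requirements in the list above can be sketched as a small check. This is only an illustration: findCrossDatasetReferenceProblems is a hypothetical helper and not part of Sanity, and it only mirrors the rules as described in this article.

```javascript
// Hypothetical helper, not part of Sanity: checks a field definition against
// the requirements listed above and returns a list of problems found.
function findCrossDatasetReferenceProblems(field) {
  const problems = [];
  if (field.type !== 'crossDatasetReference') {
    problems.push('type must be "crossDatasetReference"');
  }
  for (const key of ['name', 'dataset', 'projectId']) {
    if (typeof field[key] !== 'string' || field[key].length === 0) {
      problems.push(`${key} is required`);
    }
  }
  if (!Array.isArray(field.to) || field.to.length === 0) {
    problems.push('to must list at least one document type');
    return problems;
  }
  field.to.forEach((entry, index) => {
    const search = entry.__experimental_search;
    if (!entry.type) problems.push(`to[${index}].type is required`);
    if (!Array.isArray(search) || search.length === 0) {
      problems.push(`to[${index}] needs at least one search path`);
    }
    if (!entry.preview) problems.push(`to[${index}].preview is required`);
  });
  return problems;
}

// The minimal field from the example above.
const minimalField = {
  title: 'Reference to a document in another dataset',
  name: 'myCoolReferenceAcrossDatasets',
  type: 'crossDatasetReference',
  dataset: 'name-of-other-dataset',
  projectId: 'xyzabc',
  to: [
    {
      type: 'article',
      __experimental_search: [{ path: ['title'] }],
      preview: { select: { title: 'title' } },
    },
  ],
};

// A deliberately incomplete variant: no projectId, no search paths, no preview.
const incompleteField = {
  ...minimalField,
  projectId: undefined,
  to: [{ type: 'article' }],
};

console.log(findCrossDatasetReferenceProblems(minimalField));
console.log(findCrossDatasetReferenceProblems(incompleteField));
```

Running a check like this before registering the schema can surface a missing search path or preview early, rather than when an editor first opens the field.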

Let’s add a few more fields and a little more complexity to our schema:

{
  title: 'Reference to a document in another dataset',
  name: 'myCoolReferenceAcrossDatasets',
  type: 'crossDatasetReference',
  dataset: 'name-of-other-dataset',
  projectId: 'xyzabc',
  tokenId: 'myTokenForDatasetA',
  studioUrl: ({ type, id }) => `https://target.studio/desk/${type};${id}`,
  to: [
    {
      type: 'article',
      __experimental_search: [
        { path: ['title'], weight: 1.5 },
        { path: ['excerpt'], weight: 0.8 },
      ],
      preview: {
        select: {
          title: 'title',
          media: 'heroImage',
        },
      },
    },
    {
      type: 'person',
      __experimental_search: [{ path: ['name'] }],
      preview: {
        select: {
          name: 'name',
          picture: 'portrait',
          honorific: 'jobTitle',
        },
        prepare({ name, picture, honorific }) {
          return {
            title: name,
            media: picture,
            subtitle: honorific,
          };
        },
      },
    },
  ],
}

Let’s look at what we’ve added.

  • The tokenId field on line 7 is an optional identifier you can assign to your cross-dataset token in the event that you have more than one token for the same target dataset. As long as you only store a single token for each target dataset the studio can handle the logistics for you, so for most common scenarios, this field can be omitted.
  • The studioUrl field on the next line accepts a function, which is invoked with the type and id of your referenced document, and which should return a string containing the URL of the document's editing pane in the target dataset studio. This field is used to create a direct link from the reference preview to its editing environment (provided your editors have access to it, of course).
  • Moving along, we see that a second entry has been added to the __experimental_search field on lines 13 and 14 and that we’ve configured the entries with a weight to determine their relevance as you search.
  • Finally, we’ve added a second document type to our to array with an expanded preview configuration.
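
To make the studioUrl behavior concrete, here is a minimal sketch that calls the same function shape outside the studio. The target.studio host is the placeholder from the example above, and the document ID is made up for illustration.

```javascript
// Same function shape as the studioUrl option in the schema above.
// The target.studio host is the placeholder used in the example.
const studioUrl = ({ type, id }) => `https://target.studio/desk/${type};${id}`;

// Hypothetical referenced document; the ID is invented for illustration.
const link = studioUrl({ type: 'person', id: 'person-42' });

console.log(link); // https://target.studio/desk/person;person-42
```

The exact URL pattern depends on how the target studio is hosted and structured, so treat the path format here as an assumption to adapt.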

The Cross-Dataset Reference field in your studio

Having set up your token and configured your schema, you should see the crossDatasetReference field show up in your studio. While similar to reference inputs, cross-dataset reference inputs differ in some key aspects:

  • The “Create New” button and option to open the referenced document in a new pane to the right are not available across datasets. Instead, you will find an intent link that will open the referenced document in the target studio (if you have access to it, and have set the studioUrl property).
  • Linking to drafts is not available across datasets. Unless the document has been published at some point, it will not show up in search.
  • Depending on network conditions, searching and previewing cross-dataset reference fields might be less performant than doing the same operations on internal references.

Searching for documents in target studio.

If studioUrl is set, the referenced document will open in the target studio.

Referential integrity for cross-dataset references

As with the reference schema type, crossDatasetReference fields are by default assumed to have bi-directional integrity, which means that if you try to delete a document that is referred to by another document, the studio will alert you with a warning.

However, unlike references within a single dataset, the studio will allow you to proceed with deleting or unpublishing documents that have cross-dataset references.

If you delete the document despite the warning, it will show up as unavailable in any studio that references it. If you then make changes to the referring document, publishing will be blocked until the problem is fixed.

These measures are in place so that you can feel confident about connecting your content across datasets, and that you will be notified if a referenced document disappears.

Sometimes you don't need this guarantee but still want the convenience of references. Referential integrity can be turned off by adding the weak: true property to a reference field configuration.

You will still be notified that the document you are referring to has gone missing, but you will no longer be blocked from publishing.
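
As a minimal sketch, a weak cross-dataset reference could look like the field below. It reuses the placeholder dataset and project ID from the earlier examples. The field name is made up, and the placement of weak at the top level of the field is an assumption based on how the regular reference type is configured.

```javascript
// Sketch of a weak cross-dataset reference field.
// Dataset name and project ID are the placeholders used earlier in this article.
const weakCrossDatasetReference = {
  title: 'Weak reference to a document in another dataset',
  name: 'myWeakReferenceAcrossDatasets', // hypothetical field name
  type: 'crossDatasetReference',
  dataset: 'name-of-other-dataset',
  projectId: 'xyzabc',
  // Turns off referential integrity: publishing is no longer blocked
  // when the referenced document goes missing.
  weak: true,
  to: [
    {
      type: 'article',
      __experimental_search: [{ path: ['title'] }],
      preview: {
        select: {
          title: 'title',
        },
      },
    },
  ],
};
```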

Querying cross-dataset references

To GROQ, a crossDatasetReference behaves very much the same as an internal reference. Doing a surface-level query on a document with a cross-dataset reference will return a result that looks like this:

{
  _type: 'crossDatasetReference',
  _ref: 'id-of-reference-document',
  _dataset: 'name-of-dataset',
  _projectId: 'id-of-project',
}
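
When processing query results in application code, it can be handy to tell an unexpanded cross-dataset reference apart from other values. The sketch below checks for the shape shown above. isCrossDatasetReference is a hypothetical helper, not a Sanity API, and it assumes results carry exactly these four keys.

```javascript
// Hypothetical helper: returns true when a value has the shape of an
// unexpanded cross-dataset reference as shown in the result above.
function isCrossDatasetReference(value) {
  return (
    typeof value === 'object' &&
    value !== null &&
    value._type === 'crossDatasetReference' &&
    typeof value._ref === 'string' &&
    typeof value._dataset === 'string' &&
    typeof value._projectId === 'string'
  );
}

const crossDatasetValue = {
  _type: 'crossDatasetReference',
  _ref: 'id-of-reference-document',
  _dataset: 'name-of-dataset',
  _projectId: 'id-of-project',
};

// An internal reference lacks the dataset and project keys.
const internalValue = { _type: 'reference', _ref: 'id-of-local-document' };

console.log(isCrossDatasetReference(crossDatasetValue)); // true
console.log(isCrossDatasetReference(internalValue)); // false
```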

You can expand cross-dataset references the way you’d expect:

*[_type == 'crossDatasetReferringDocument']{
  // Expand entire referenced document
  personFromDatasetB->,
  // Expand projection of certain fields
  "personDetails": personFromDatasetB->{name, jobTitle}
}

In conclusion:

The cross-dataset reference schema type is a powerful tool for enabling Shared Content across datasets, projects, and even organizations. It allows you to keep your content connected beyond its original context by extending the reference field with methods for authenticating and querying across datasets.
