Cross Dataset References
All you need to know about creating references across projects and datasets.
Enterprise Feature
Cross dataset references is an enterprise feature. Use our contact form to start a conversation with our sales team to enable your project to use this feature.
A fundamental requirement for enabling a content-driven workflow is having access to the proper tools to help you compartmentalize and then connect your content. You need a way of composing sets of fields to create documents, and of connecting documents to create relationships. Boxes and arrows, if you will.
The premier tool for connecting content in Sanity is the reference schema type, for creating binding relationships between content types. The reference type only allows references within a single dataset. This covers most use cases, but sometimes more complex architectures and content needs present a legitimate case for a way of communicating across datasets and projects.
You might wish to keep separate datasets for your product database and your marketing content to make sure your editors can focus on the appropriate content for their department, but still need to reference your products on your landing site. Or perhaps you need to keep several localized versions of your content, each in its own dedicated dataset, but still be able to connect content across locales.
For these scenarios, there is the crossDatasetReference schema type! With it comes the ability to make references between documents in different datasets, and indeed, different projects, accounts, and organizations.
While closely related to the reference type, the crossDatasetReference type has some unique capabilities and some different limitations that you should be aware of.
A cross-dataset reference is, as its name suggests, a reference in one dataset to a document in another dataset in the same or another project. For this to be possible, some requirements must be met.
For the remainder of this article, we’ll use the term source dataset when we talk about the dataset that is doing the referring, and target dataset when we talk about the dataset that is being referred to.
- Both datasets must belong to projects that are on an enterprise plan.
- If using Sanity Studio, the source and target studios must both be updated to version 2.27.3 or later.
- The source dataset must have stored a token with reading permission to the target dataset.
- If the studios used to interact with the respective datasets are hosted on different domains, an entry must be made for the URL of the studio for your source dataset in the list of allowed CORS origins for the studio for your target dataset.
- The project ID and dataset name of the target dataset must be known at the time of creating the reference field in the source dataset.
- Similarly, the type of document you wish to refer to in the target dataset, and one or more of its fields must be known in order to set up search and preview in the source dataset.
Before we delve into the crossDatasetReference schema, let’s take a brief moment to look at cross-project tokens.
Protip
Tokens are normally not needed when connecting content within the same project!
To search for documents across projects, you will need to configure a cross-project token. A cross-project token is a regular API token with read permission that resides in the source dataset and grants access to view documents in the target dataset.
The most convenient way to handle tokens in your source dataset is by installing the Cross Project Tokens plugin. The easiest way to create and manage tokens in your target dataset is by navigating to the API tab in sanity.io/manage.
For a step-by-step guide on how to create and manage cross-project tokens, refer to the Shared Content Quickstart article.
To read all the nitty-gritty details about the crossDatasetReference schema type, visit the schema type reference documentation.
The crossDatasetReference type is, as mentioned, closely related to the reference type. It supports most of the same properties and options, in addition to some specific ones. Let’s have a look at a minimal example of a crossDatasetReference schema, and then go a bit further once we’ve established the basics.
{
  title: 'Reference to a document in another dataset',
  name: 'myCoolReferenceAcrossDatasets',
  type: 'crossDatasetReference',
  dataset: 'name-of-other-dataset',
  projectId: 'xyzabc',
  to: [
    {
      type: 'article',
      __experimental_search: [{ path: ['title'] }],
      preview: {
        select: {
          title: 'title',
        },
      },
    },
  ],
}
- All fields in the above example, except the title, are required.
- The type must be set to crossDatasetReference.
- The dataset and projectId must have the appropriate values.
- The to field accepts an array of entries to different document types in the target dataset. You may define as many types here as you please, but each crossDatasetReference field is limited to connecting to a single external dataset.
- Because the entire schema of all document types in the target dataset is not known to the source dataset, the following is true for each entry in the to array:
  - In addition to type, each entry must specify one or more fields to use when searching for content in the target dataset. This is done using the experimental search API which you can read more about here.
  - For much the same reason, you must define a preview for the document type. To learn more about previews and list views, please refer to this article.
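The requirements listed above can be sketched as a small check. This is a minimal illustration only: findMissingProperties is a hypothetical helper written for this article, not part of Sanity, and the studio performs its own schema validation.

```javascript
// Hypothetical helper (not a Sanity API): lists which of the required
// properties described above are missing from a field definition.
function findMissingProperties(field) {
  const missing = [];
  if (field.type !== 'crossDatasetReference') missing.push('type');
  for (const key of ['name', 'dataset', 'projectId']) {
    if (typeof field[key] !== 'string' || field[key].length === 0) {
      missing.push(key);
    }
  }
  if (!Array.isArray(field.to) || field.to.length === 0) {
    missing.push('to');
    return missing;
  }
  // Each entry in `to` needs a type, at least one search path, and a preview.
  field.to.forEach((entry, index) => {
    if (!entry.type) missing.push(`to[${index}].type`);
    const search = entry.__experimental_search;
    if (!Array.isArray(search) || search.length === 0) {
      missing.push(`to[${index}].__experimental_search`);
    }
    if (!entry.preview) missing.push(`to[${index}].preview`);
  });
  return missing;
}

// A field that forgets the search and preview configuration.
const incompleteField = {
  name: 'myCoolReferenceAcrossDatasets',
  type: 'crossDatasetReference',
  dataset: 'name-of-other-dataset',
  projectId: 'xyzabc',
  to: [{ type: 'article' }],
};

// Reports to[0].__experimental_search and to[0].preview as missing.
console.log(findMissingProperties(incompleteField));
```

Running a check like this against the minimal example above should report nothing missing, since that example defines every required property.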
Let’s add a few more fields and a little more complexity to our schema:
{
  title: 'Reference to a document in another dataset',
  name: 'myCoolReferenceAcrossDatasets',
  type: 'crossDatasetReference',
  dataset: 'name-of-other-dataset',
  projectId: 'xyzabc',
  tokenId: 'myTokenForDatasetA',
  studioUrl: ({ type, id }) => `https://target.studio/desk/${type};${id}`,
  to: [
    {
      type: 'article',
      __experimental_search: [
        { path: ['title'], weight: 1.5 },
        { path: ['excerpt'], weight: 0.8 },
      ],
      preview: {
        select: {
          title: 'title',
          media: 'heroImage',
        },
      },
    },
    {
      type: 'person',
      __experimental_search: [{ path: ['name'] }],
      preview: {
        select: {
          name: 'name',
          picture: 'portrait',
          honorific: 'jobTitle',
        },
        // Map the selected fields onto the preview's title, media, and subtitle
        prepare({ name, picture, honorific }) {
          return {
            title: name,
            media: picture,
            subtitle: honorific,
          };
        },
      },
    },
  ],
}
Let’s look at what we’ve added.
- The tokenId field on line 7 is an optional identifier you can assign to your cross-dataset token in the event that you have more than one token for the same target dataset. As long as you only store a single token for each target dataset the studio can handle the logistics for you, so for most common scenarios, this field can be omitted.
- The studioUrl field on the next line accepts a function, which is invoked with the type and id of your referenced document, and which should return a string in the shape of a URL to the document editing pane address in the target dataset studio. This field is used to create a direct link from the reference preview to its editing environment (providing your editors have access to it, of course).
- Moving along, we see that a second entry has been added to the __experimental_search field on lines 13 and 14 and that we’ve configured the entries with a weight to determine their relevance as you search.
- Finally, we’ve added a second document type to our to array with an expanded preview configuration.
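To make the studioUrl function more concrete, here is a small sketch of what it returns when called by hand. The URL pattern is the one from the schema above; the document type and id passed in are made-up values for illustration.

```javascript
// Same function as in the schema above.
const studioUrl = ({ type, id }) => `https://target.studio/desk/${type};${id}`;

// Hypothetical referenced document, for illustration only.
const link = studioUrl({ type: 'person', id: 'abc123' });

console.log(link); // → https://target.studio/desk/person;abc123
```

Your own function should return whatever address opens the document editing pane in your target studio, so the path segment will depend on how that studio is set up.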
Having set up your token and configured your schema, you should see the crossDatasetReference field show up in your studio. While similar to reference inputs, they differ in some key aspects:
- The “Create New” button and option to open the referenced document in a new pane to the right are not available across datasets. Instead, you will find an intent link that will open the referenced document in the target studio (if you have access to it, and have set the studioUrl property).
- Linking to drafts is not available across datasets. Unless the document has been published at some point, it will not show up in search.
- Depending on network conditions, searching and previewing cross-dataset reference fields might be less performant than doing the same operations on internal references.
As with the reference schema type, crossDatasetReference fields are by default assumed to have bi-directional integrity, which means that if you try to delete a document that is referred to by another document, the studio will alert you with a warning.
However, unlike references within a single dataset, the studio will allow you to proceed with deleting or unpublishing documents that have cross-dataset references.
If you go ahead and delete the document despite the warnings, it will show up as unavailable in any studio referencing it. If any changes are then made to the referring document, publishing will be blocked until the problem is fixed.
These measures are in place so that you can feel confident about connecting your content across datasets, and that you will be notified if a referenced document disappears.
Sometimes you don't need this guarantee but still want to keep the convenience of references. Referential integrity can be turned off by adding the weak: true property to a reference field configuration.
You will still be notified that the document you are referring to has gone missing, but you will no longer be blocked from publishing.
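As a minimal sketch, a weak version of the earlier field could look like the following. The field name is hypothetical, and the project ID and dataset name are the placeholders used throughout this article.

```javascript
// Sketch of a weak cross-dataset reference field.
// The name 'myWeakReferenceAcrossDatasets' is made up for this example.
const weakCrossDatasetReference = {
  title: 'Weak reference to a document in another dataset',
  name: 'myWeakReferenceAcrossDatasets',
  type: 'crossDatasetReference',
  weak: true, // turns off referential integrity for this field
  dataset: 'name-of-other-dataset',
  projectId: 'xyzabc',
  to: [
    {
      type: 'article',
      __experimental_search: [{ path: ['title'] }],
      preview: {
        select: {
          title: 'title',
        },
      },
    },
  ],
};
```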
To GROQ, a crossDatasetReference behaves very much the same as an internal reference. Doing a surface-level query on a document with a cross-dataset reference will return a result that looks like this:
{
  _type: 'crossDatasetReference',
  _ref: 'id-of-reference-document',
  _dataset: 'name-of-dataset',
  _projectId: 'id-of-project',
}
You can expand cross-dataset references the way you’d expect:
*[_type == 'crossDatasetReferringDocument']{
  // Expand entire referenced document
  personFromDatasetB->,
  // Expand projection of certain fields
  "personDetails": personFromDatasetB->{name, jobTitle}
}
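If you run this kind of query from JavaScript, one possible shape is sketched below. It assumes you already have a configured Sanity client instance for the source dataset and passes it in as a parameter; the function name fetchPersonDetails is hypothetical, and only the client's fetch method is used.

```javascript
// The projection from the query above, kept as a plain GROQ string.
const query = `*[_type == 'crossDatasetReferringDocument']{
  "personDetails": personFromDatasetB->{name, jobTitle}
}`;

// Hypothetical wrapper: `client` is assumed to be an already configured
// Sanity client for the source dataset. It is passed in, not created here.
function fetchPersonDetails(client) {
  // Resolves with the query result.
  return client.fetch(query);
}
```

Passing the client in keeps the sketch independent of how and where your client is configured.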
The cross-dataset reference schema type is a powerful tool for enabling Shared Content across datasets, projects, and even organizations. It allows you to keep your content connected beyond its original context by extending the reference field with methods for authenticating and querying across datasets.