Last updated March 27, 2021

Sanity Backup Function with GitHub Actions and Artifacts

By Jérôme Pott

When using Sanity, our content is stored safely and in multiple copies in Google Cloud. Thanks to the document history, we can restore our documents to a previous state. However, deleted documents and datasets cannot be recovered.

Even if those scenarios are unlikely to happen, it is worth creating a simple backup routine, just in case. And the method I'm going to show you here is easy to set up, won't cost you any money, and doesn't require you to register with a 3rd party service (I assume that all my readers have a GitHub account🙃).

TL;DR

https://github.com/sanity-io/github-action-sanity#backup-routine

Ways to back up Sanity datasets

There are three ways to back up datasets:

  1. cURL request to an export URL endpoint
  2. Using the Sanity CLI
  3. Using the @sanity/export npm package

In another blog post, I explained how to use the @sanity/export npm package inside a serverless function to back up content to Google Drive or Dropbox.

There is, however, an easier way: GitHub Actions (GA). Here are their advantages:

  • Backup files are stored alongside your studio code.
  • They only require a few lines of YAML config.
  • They support CRON jobs.
  • They are cheap (execution time + storage).
  • We can make use of the GitHub ecosystem (notifications for failed workflows, access management, etc.)

Going all in on GitHub Actions

There is a GitHub Action that wraps the Sanity CLI. Basically, it means that we can run sanity dataset export inside our GA workflow.
Before we can export the dataset, we need to generate a read token from the Sanity project dashboard and store it as a secret in the GitHub repository. The steps in this post assume the secret is named SANITY_AUTH_TOKEN.

This is what the first workflow step looks like:

- name: Export dataset
  uses: sanity-io/github-action-sanity@v0.1-alpha
  env:
    SANITY_AUTH_TOKEN: ${{ secrets.SANITY_AUTH_TOKEN }}
  with:
    args: dataset export production backups/backup.tar.gz

Then we need to upload the generated backup file so that it will be available for download as a workflow artifact. For this, we use the upload-artifact action and we specify the same path as above: backups/backup.tar.gz.

By default, this step passes even if GitHub cannot find our generated backup file. That is why I recommend setting the if-no-files-found option to error.

And here are the details of the step:

- name: Upload backup.tar.gz
  uses: actions/upload-artifact@v2
  with:
    name: backup-tarball
    path: backups/backup.tar.gz
    # Fails the workflow if no files are found; defaults to 'warn'
    if-no-files-found: error

In addition to running the backup routine on a schedule, you can also add an option to trigger the backup process manually from the GA dashboard. This can be useful in various situations, e.g. right after content editors have added a large amount of data, or right before manipulating datasets.

Here's an example of a workflow triggered manually or by a CRON job:

on:
  schedule:
    # Runs at 04:00 UTC on the 1st and 17th of every month
    - cron: '0 4 */16 * *'
  workflow_dispatch:
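
Put together, the triggers and the two steps above could sit in a single workflow file such as .github/workflows/main.yml. The following is a minimal sketch, not a definitive setup: the workflow name, the job ID and the checkout step are assumptions added for illustration, and production is the dataset name used earlier in this post.

```yaml
# Assumed workflow name, for illustration only
name: Backup Routine

on:
  schedule:
    # Runs at 04:00 UTC on the 1st and 17th of every month
    - cron: '0 4 */16 * *'
  # Allows starting the backup manually from the Actions tab
  workflow_dispatch:

jobs:
  # Assumed job ID, for illustration only
  backup-dataset:
    runs-on: ubuntu-latest
    steps:
      # Assumption: check out the studio code so the Sanity CLI runs inside the project
      - uses: actions/checkout@v2
      - name: Export dataset
        uses: sanity-io/github-action-sanity@v0.1-alpha
        env:
          SANITY_AUTH_TOKEN: ${{ secrets.SANITY_AUTH_TOKEN }}
        with:
          args: dataset export production backups/backup.tar.gz
      - name: Upload backup.tar.gz
        uses: actions/upload-artifact@v2
        with:
          name: backup-tarball
          path: backups/backup.tar.gz
          # Fails the workflow if no files are found; defaults to 'warn'
          if-no-files-found: error
```

Each run then produces its own backup-tarball artifact, which can be downloaded from that run's page in the Actions tab.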

Conclusion

We now have a solid backup routine in place. You can of course tune the frequency of the backups to your needs. Make sure to also read the latest information about pricing, size limits and file retention from GitHub. For example, as of writing this, backup files are automatically deleted after 90 days on public repos. I personally think that 90 days is long enough, maybe even too long. If you want to keep backup files for a shorter time, you can do so in the repository settings under Actions.
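
Retention can also be set per artifact rather than for the whole repository, since the upload-artifact action accepts a retention-days input. Here is a minimal sketch of the upload step with that input; the value of 30 days is an assumption chosen for illustration, and it cannot exceed the limit configured for the repository.

```yaml
- name: Upload backup.tar.gz
  uses: actions/upload-artifact@v2
  with:
    name: backup-tarball
    path: backups/backup.tar.gz
    if-no-files-found: error
    # Assumed value for illustration: delete this artifact after 30 days
    retention-days: 30
```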

Finally, if you would like to see the workflow described in this post along with the generated artifacts, you can visit this page: https://github.com/mornir/movies-studio/actions/workflows/main.yml
