What steps are you taking to backup your Sanity data?

16 replies
Last updated: May 27, 2020
Hey guys! Curious, what steps is everyone here taking to make sure their Sanity data is backed up, in case of some unfortunate accident?
AI Update

Great question! Dataset backups are definitely something you should have a strategy for. Here's what the community typically does:

Using the CLI Export Command

The most common approach is using the Sanity CLI's export command. It's straightforward and gives you complete control:

sanity dataset export production

This creates a production.tar.gz file containing all your documents and assets. You can specify a custom filename/location too:

sanity dataset export production ./backups/backup-2024-01-15.tar.gz

The export includes everything: all documents, assets (images and files), and the hidden configuration documents used by the Studio and plugins. It doesn't include your schema definition (that lives in your code) or system-level documents.

Options Worth Knowing

By default, sanity dataset export produces a compressed .tar.gz archive with assets included. If you only need documents, you can skip the assets and work with the NDJSON document stream on its own, though the standard tar.gz approach is recommended for complete backups since the archive contains both the NDJSON data stream and your asset files in separate folders.
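
A couple of variations worth knowing, assuming these flags are available in your CLI version (the paths are just examples, and the exact flag names are worth confirming with sanity dataset export --help):

# Documents only, skipping asset downloads (smaller, faster)
sanity dataset export production ./backups/production-docs.tar.gz --no-assets

# Only specific document types, comma-separated
sanity dataset export production ./backups/posts-only.tar.gz --types post,author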

Automated Backup Strategies

For production sites, manual exports aren't ideal. Here are common automation approaches:

GitHub Actions: You can set up automated backups using GitHub Actions that run on a schedule and store exports as artifacts or push them to cloud storage.

CI/CD Pipelines: Many folks integrate exports into their deployment pipelines, especially before running content migrations.

Scheduled Scripts: Write a simple script that runs sanity dataset export and uploads the result to S3, Google Cloud Storage, or your preferred backup service.
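
As a rough sketch of the scheduled-script idea (the bucket name is a placeholder, and it assumes the AWS CLI is installed and authenticated and that the script runs from your Studio project directory), something like this could run from cron or a scheduled GitHub Actions workflow:

#!/usr/bin/env bash
set -euo pipefail

# Placeholder backup destination - swap in your own bucket
BUCKET="s3://my-sanity-backups"
STAMP="$(date +%Y-%m-%d)"
FILE="production-${STAMP}.tar.gz"

# Export the production dataset to a dated archive
npx sanity dataset export production "./${FILE}"

# Copy the archive to external storage, then clean up locally
aws s3 cp "./${FILE}" "${BUCKET}/${FILE}"
rm "./${FILE}"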

Before Migrations

One critical practice: always export before running content migrations:

sanity dataset export production backup-before-migration.tar.gz
sanity migration run your-migration

If something goes wrong, you can restore with:

sanity dataset import backup-before-migration.tar.gz production
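
One thing to watch: if the target dataset already contains documents with the same IDs, the import will stop on conflicts by default. As far as I recall the CLI has flags to control this (check sanity dataset import --help), for example to overwrite existing documents:

sanity dataset import backup-before-migration.tar.gz production --replace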

Enterprise Options

If you're on an Enterprise plan, there are managed options:

  • Backups feature: Managed backups with retention policies handled by Sanity
  • Cloud Clone: Server-side dataset duplication without downloading data locally
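
If Cloud Clone is available on your plan, it's driven from the CLI with the dataset copy command; a minimal sketch (the target dataset name here is just an example):

sanity dataset copy production production-backup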

Storage Considerations

Important note: outside of the Enterprise Backups feature, Sanity doesn't store your exports for you, so you'll need to handle that yourself. Store your exports in cloud storage with retention policies that match your compliance needs.

The combination of regular automated exports to external storage plus pre-migration backups covers most disaster recovery scenarios pretty well!

I made two small projects with Sanity for clients and I didn't set up a backup routine. I only use one dataset and I don't see how it could happen that I delete the production dataset by mistake. But for big projects using several datasets, it could be good to have a backup routine in place to sleep better at night
😄 But then you should definitely charge the client for implementing that service. For a hobby project, I set up a CRON job that triggers a serverless function which backs up the dataset to Google Drive.

https://dev.to/mornir/create-a-service-account-to-authenticate-with-google-5b1k Dropbox is also an option.
For the CRON job I use GitHub Actions. For the serverless function I use Netlify Functions. But I think it should be possible to use GitHub Actions for the whole process! Even for storing the backups (as artifacts).
🤯 this looks fascinating, Jérôme Pott. It's a new world for me; I haven't dealt with cron jobs or cloud functions yet, but I expected this task would take me down this path.
Thanks!
So wait, you did this for a hobby project but not a client one? Is this because of an abundance of trust in nothing going wrong sanity-side with your data?
Sanity keeps daily backups of all your content for 30 days. If something goes wrong on their side, they can restore the data. On my end, as a developer, I don't see how I would type "sanity dataset delete production" by mistake.
It's true that documents deleted by the client are gone forever. But they can still undo the deletion right after, or copy the content back from the production website before triggering a rebuild (for Jamstack websites). For some projects I just disable the delete option for some document types. And again: we're talking about websites with fewer than 10 pages.
When I find the time and the motivation to look more into GitHub Actions, I'll try to set up a backup routine for every repo with a Sanity Studio. So the backup files would be stored right alongside the code
😀
Ahh, you're right, the daily backups for 30 days are probably enough to sleep safely at night. Still, I'm going to look into doing some backups, to learn how to do it if nothing else 🙂
Jérôme Pott, stored along with the code? You mean you'd back up the data to the repo? Isn't that going to bloat it too much?
yeah, especially if you back up several times a day and have a lot of assets
I also wonder what GitHub's pricing is for storing artifacts. Certainly not as cheap as Google Drive
Artifacts are a thing that becomes relevant once you're using Actions, right? Haven't dealt with them at all yet either
yeah, better to have your own backups. Sanity's internal backups are not meant for cases when you're the one who screwed up 😆 But maybe in the future we'll have a tab in the Sanity management dashboard to view and manage their backups
😀
Yes, that's right. From what I read, artifacts are the data/files generated by an Action.
oh, you can't manage Sanity's backups. That too is a good thing to keep in mind. Thanks again, Jérôme Pott
you're welcome. I was asking myself the same questions when first working with Sanity.
