How to use Cloud Clone for datasets
Date: 2020-12-09
Copy a dataset inside Sanity's infrastructure using either the CLI or HTTP API.
Enterprise Feature
This feature is part of our Advanced Dataset Management offering on the enterprise plan. Contact us if you need this feature and want to discuss our enterprise plan.
Cloud Clone provides a more efficient way of duplicating datasets and is ideal for situations when:
- you want to run tests against real production data in a CI flow
- you regularly copy datasets from production for developing new features
Instead of exporting and importing a dataset with the CLI, you can have the copy happen inside Sanity's infrastructure, which is more efficient and reliable.
There are two methods of initiating and monitoring the cloning of datasets in the cloud: through the Sanity CLI or with the HTTP API.
The quickest way to begin developing with a freshly copied dataset is to use the CLI.
Gotcha
As with other project-specific CLI commands, this command will only work from within a configured Sanity project.
By default, the CLI command runs the copy synchronously. If you don't want to wait for the process to complete, you can use the --detach flag to skip the progress output. The command then logs a job ID that you can use to watch the progress again with the --attach <jobId> flag.
# Syntax:
# sanity dataset copy
# sanity dataset copy <source-dataset>
# sanity dataset copy <source-dataset> <target-dataset>
# This command will ask for which dataset to copy and what to call the new dataset
sanity dataset copy
# This command will copy the production dataset and request a name for the new dataset
sanity dataset copy production
# This command will copy the production dataset into a new dataset named newFeature
sanity dataset copy production newFeature
# This command will initiate the copy between production and newFeature
# It will run in the background and not display progress while it works
sanity dataset copy production newFeature --detach
# This command will initiate the copy between production and newFeature
# It does not copy document history, speeding the copy action
# at the expense of the history retention
sanity dataset copy production newFeature --skip-history
Gotcha
This process creates a new dataset with the specified name. If a dataset with that name already exists, the command will throw an error.
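Because the command fails when the target name is already taken, a CI job that copies production on every run needs a fresh target name each time. The following is a minimal sketch of one way to do that from Python. The helper names `unique_target_name` and `build_copy_command` and the `ci` prefix are hypothetical; only the `sanity dataset copy` syntax and the `--skip-history` and `--detach` flags come from the text above. Check that any generated name is a valid dataset name for your project.

```python
from datetime import datetime, timezone


def unique_target_name(prefix, now=None):
    """Return a name such as 'ci-20201209083000' (hypothetical naming scheme)."""
    now = now or datetime.now(timezone.utc)
    return f"{prefix}-{now.strftime('%Y%m%d%H%M%S')}"


def build_copy_command(source, target, skip_history=False, detach=False):
    """Build the argument list for `sanity dataset copy <source> <target>`."""
    command = ["sanity", "dataset", "copy", source, target]
    if skip_history:
        # Skips document history to speed up the copy.
        command.append("--skip-history")
    if detach:
        # Starts the job without waiting for it to finish.
        command.append("--detach")
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)` from within a configured Sanity project, so that a non-zero exit code from the CLI fails the CI step.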
We encourage using this feature instead of exporting and importing your data into another dataset, since it is faster in most cases. Copying a large dataset, or one with many or very large assets, will still take some time to complete.
PUT /v1/projects/:projectId/datasets/:datasetName/copy
If you'd rather integrate with the HTTP API instead of going through the CLI, there's an API endpoint for the copy functionality, as well as an endpoint for monitoring copy completion.
To start a copy, send a PUT request to the source dataset's /copy endpoint.
https://api.sanity.io/v1/projects/<project-id>/datasets/<dataset-name>/copy
The request needs to be authorized via a Bearer token, which can be generated from a project's dashboard.
The body of the request should be an object containing the targetDataset property, a string used to name the new dataset.
{
"targetDataset": "production-copy"
}
curl --location --request PUT 'https://api.sanity.io/v1/projects/<project-id>/datasets/<dataset-name>/copy' \
  -H 'Authorization: Bearer <token-here>' \
  -H 'Content-Type: application/json' \
  --data-raw '{ "targetDataset": "production-copy" }'
{
"datasetName": "production",
"message": "Starting copying dataset production to production-copy...",
"aclMode": "public",
"jobId": "jobIdString"
}
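The same request can be issued from a script. The sketch below uses only the Python standard library and mirrors the curl call above: a PUT to the dataset's /copy endpoint with a Bearer token and a JSON body containing targetDataset. The function names `build_copy_request` and `start_copy` are illustrative, and the code assumes the response has the shape shown above, with the job ID in the `jobId` field.

```python
import json
import urllib.request

API_BASE = "https://api.sanity.io/v1"


def build_copy_request(project_id, source_dataset, target_dataset, token):
    """Build (but do not send) the PUT request that starts a dataset copy."""
    url = f"{API_BASE}/projects/{project_id}/datasets/{source_dataset}/copy"
    body = json.dumps({"targetDataset": target_dataset}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def start_copy(project_id, source_dataset, target_dataset, token):
    """Send the request and return the job ID from the JSON response."""
    request = build_copy_request(project_id, source_dataset, target_dataset, token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["jobId"]
```

Separating request construction from sending keeps the network call in one place, which makes the request easy to inspect or log before it is sent.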
GET /v1/jobs/:jobId
When you run a copy via the HTTP API, you'll receive a Job ID. This ID can be used to query the status of the clone job.
curl --location --request GET 'https://api.sanity.io/v1/jobs/<jobid>' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <token-here>'
// Running
{
"id": "jacsfsmnxp",
"state": "running",
"authors": [
"authorId"
],
"created_at": "2020-11-09T17:34:28.071123Z",
"updated_at": "2020-11-09T17:34:28.144826Z"
}
// Completed
{
"id": "jarrwsdptf",
"state": "completed",
"authors": [
"authorId"
],
"created_at": "2020-11-09T17:07:41.304227Z",
"updated_at": "2020-11-09T17:08:30.457692Z"
}
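A script that needs to block until the copy finishes can poll this endpoint. The following is a rough sketch, not an official client: `fetch_job` and `wait_for_job` are hypothetical names, the polling interval and poll limit are arbitrary defaults, and the loop assumes that any state other than "running" is final. Only "running" and "completed" appear in the responses above, so other states should be handled by the caller.

```python
import json
import time
import urllib.request


def fetch_job(job_id, token):
    """GET /v1/jobs/<jobId> and return the decoded JSON status object."""
    request = urllib.request.Request(
        f"https://api.sanity.io/v1/jobs/{job_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_job(get_job, interval_seconds=5.0, max_polls=120, sleep=time.sleep):
    """Call get_job() until the job leaves the 'running' state.

    Assumption: every state other than 'running' is terminal.
    """
    for _ in range(max_polls):
        job = get_job()
        if job["state"] != "running":
            return job
        sleep(interval_seconds)
    raise TimeoutError(f"job still running after {max_polls} polls")
```

A call such as `wait_for_job(lambda: fetch_job(job_id, token))` returns the final status object. The caller should still check that its `state` is "completed" before using the new dataset.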
GET /v1/jobs/:jobId/listen
Each job has a /listen endpoint that lets you monitor its status programmatically. Much like the static status endpoint, this endpoint accepts the Job ID that is returned by starting a copy action.
curl --location --request GET 'https://api.sanity.io/v1/jobs/<jobid>/listen' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <token-here>'
While listening, event data will be sent back at intervals providing updates on the status of your copy. The response contains the event name as well as a JSON object containing information about the current status of the copy.
event: welcome
data: {"listener_id": "ladaicdbdo"}
event: job
data: {"job_id":"jacsfsmnxp","state":"running","progress":60}
event: job
data: {"job_id":"jacsfsmnxp","state":"running","progress":80}
event: job
data: {"job_id":"jacsfsmnxp","state":"completed","progress":100}
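To act on these events in a script, the stream can be parsed line by line. The sketch below is a simplified parser, assuming each event arrives as an `event:` line followed by a single `data:` line containing JSON, as in the sample above. It is not a complete server-sent events implementation, and the function names `parse_events` and `copy_progress` are illustrative.

```python
import json


def parse_events(lines):
    """Yield (event_name, data) pairs from an iterable of text lines.

    Assumption: one 'data:' line per event, holding a JSON object.
    """
    event = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:") and event is not None:
            yield event, json.loads(line[len("data:"):].strip())
            event = None


def copy_progress(lines):
    """Yield (state, progress) for each 'job' event, skipping other events."""
    for event, data in parse_events(lines):
        if event == "job":
            yield data["state"], data["progress"]
```

When reading from a live connection, the lines could come from iterating over the HTTP response and decoding each chunk as UTF-8. The consumer can stop reading once a state other than "running" is reported.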