Understanding the behavior of uploading duplicate images in Sanity.io.

10 replies
Last updated: Jun 10, 2020
Is it correct that if I upload the exact same image, I won't get duplicate assets? Every upload of the same image returns a result with the exact same _id and url, and count(*[_type == "sanity.imageAsset"]) doesn't change. This is the behavior I want, but I would have assumed it uploaded a new asset for each attempt rather than doing a comparison. So I just want to confirm this behavior before I rely on it. 🙂
AI Update

Yes, this is correct and confirmed behavior that you can rely on! Sanity implements automatic asset deduplication based on the content of the file itself.

When you upload an image to Sanity, the platform generates a deterministic ID based on the actual image content (using a SHA-1 hash). This means that uploading the exact same image multiple times will:

  • Return the same _id every time (formatted as image-<sha1hash>-<dimensions>-<format>)
  • Point to the same url
  • Not create duplicate sanity.imageAsset documents in your Content Lake
  • Keep your count(*[_type == "sanity.imageAsset"]) unchanged
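
To make the idea of a content-derived ID concrete, here is a minimal sketch in TypeScript. The function name sketchImageAssetId is hypothetical, and hashing the raw file bytes is an assumption: the thread confirms that SHA-1 is used but not exactly what input is hashed. Writing the dimensions as <width>x<height> is likewise assumed for illustration.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: builds an ID in the image-<sha1hash>-<dimensions>-<format> shape.
// Assumption: the hash is taken over the raw file bytes.
function sketchImageAssetId(
  content: Uint8Array,
  width: number,
  height: number,
  format: string,
): string {
  const sha1 = createHash("sha1").update(content).digest("hex");
  return `image-${sha1}-${width}x${height}-${format}`;
}

// Stand-in byte arrays; real input would be the bytes of an image file.
const logoBytes = new TextEncoder().encode("pretend these are PNG bytes");
const sameLogoBytes = new TextEncoder().encode("pretend these are PNG bytes");
const otherBytes = new TextEncoder().encode("some different bytes");

const idA = sketchImageAssetId(logoBytes, 200, 100, "png");
const idB = sketchImageAssetId(sameLogoBytes, 200, 100, "png");
const idC = sketchImageAssetId(otherBytes, 200, 100, "png");

console.log(idA === idB); // true: identical content gives an identical ID
console.log(idA === idC); // false: different content gives a different ID
```

Because the ID depends only on the inputs, repeating the computation for the same file always lands on the same value, which is what makes repeated uploads resolve to one asset.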

As mentioned in the Sanity documentation on uploading assets efficiently, "Images uploaded to the Content Lake receive deterministic IDs based on the image content itself, ensuring that uploading the same image multiple times results in the same ID without creating duplicate documents."

This is a deliberate feature that helps prevent asset bloat and keeps your Content Lake clean. It's particularly useful when:

  • Running migration scripts that might process the same images multiple times
  • Multiple content editors upload the same brand assets or stock photos
  • Automated systems upload images programmatically

So you can absolutely rely on this behavior in your workflows! The deduplication happens automatically at the platform level based on the file's content hash, not just the filename, so even if you upload the same image with different filenames, it will be deduplicated.
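
The behavior described above can be modeled with a small in-memory store. This is only a sketch of the technique (deduplication keyed on a content hash), not Sanity's implementation: the class SketchAssetStore and its methods are hypothetical, and the _id is simplified to image-<sha1hash> without dimensions or format.

```typescript
import { createHash } from "node:crypto";

interface StoredAsset {
  _id: string;
  originalFilename: string;
  content: Uint8Array;
}

// Hypothetical in-memory stand-in for an asset store, keyed by content hash.
class SketchAssetStore {
  private assetsByHash = new Map<string, StoredAsset>();

  upload(content: Uint8Array, originalFilename: string): StoredAsset {
    const hash = createHash("sha1").update(content).digest("hex");
    const existing = this.assetsByHash.get(hash);
    if (existing) {
      // Same content hash: nothing new is stored, the existing asset is returned.
      return existing;
    }
    const asset: StoredAsset = { _id: `image-${hash}`, originalFilename, content };
    this.assetsByHash.set(hash, asset);
    return asset;
  }

  count(): number {
    return this.assetsByHash.size;
  }
}

const store = new SketchAssetStore();
const bytes = new TextEncoder().encode("pretend these are JPEG bytes");

const first = store.upload(bytes, "logo.jpg");
const second = store.upload(bytes, "logo-copy.jpg"); // same bytes, different filename

console.log(first._id === second._id); // true
console.log(store.count()); // 1
```

The filename never enters the hash, so renaming a file has no effect on whether it is treated as a duplicate. In this sketch the stored record keeps the filename from the first upload; that detail is a choice made for the example.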

Yes, it’s correct and you can expect it to be consistent, reliable behaviour. Just confirmed with user S:
If we already have an image with the same hash it’s not stored and we’ll return the asset id of the existing one
Thanks, user S! That's very cool! 🙂
user S, how does this know it is the same image? Does it look for the originalFilename?
It uses a cryptographic hash function (SHA-1) to calculate a hash value on upload - if hash values are the same, the image is considered identical and not stored but instead linked from the existing one.
user S, so does it still replace the image file with the current one being uploaded, or does it just ignore the request altogether and keep using the "old" file if it matches an existing asset?
It will keep using the existing (old) file, as it’s identical, so no need to replace 🙂
Gotcha. So does that mean there's no way to replace the file for an existing asset? It has to be deleted and then another one created?
And if that's the case, can a custom ID for the asset be specified on upload so that it can match the old asset that was deleted to keep references on documents intact?
user S, I was still wondering if you had an answer here. Is there no way to make an existing asset ID use a newly uploaded file?
