Does Uploading an Identical Image Result in Duplicate Assets?

9 replies
Last updated: Jun 10, 2020
Is it correct that if I upload the exact same image, I won't get duplicate assets? Every time I upload the same image, the result has the exact same _id and url, and count(*[_type == "sanity.imageAsset"]) doesn't change. This is the behavior I want, but I would have assumed that each attempt uploaded a new asset without doing any comparison. So I just want to confirm this behavior before I rely on it. 🙂
AI Update

Yes, you're absolutely correct! This is indeed the intended behavior, and you can safely rely on it. Sanity automatically deduplicates image assets based on their content.

When you upload an image to Sanity, the platform generates a deterministic ID based on the actual image content (using a SHA1 hash of the file). This means that uploading the exact same image multiple times will always return the same _id and url without creating duplicate asset documents in your Content Lake.

As mentioned in the Sanity documentation on uploading assets efficiently, "Images uploaded to the Content Lake receive deterministic IDs based on the image content itself, ensuring that uploading the same image multiple times results in the same ID without creating duplicate documents."

The asset ID follows the format image-<sha1hash>-<dimensions>-<format>, so if the content is identical, the hash will be identical, resulting in the same asset ID. This is why your count(*[_type == "sanity.imageAsset"]) query doesn't change when you re-upload the same image.
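
To make the ID format above concrete, here is a minimal sketch that splits an image asset ID into its parts. The helper names (parseImageAssetId, isSameImageContent) are hypothetical and not part of any Sanity package, and the sketch assumes the dimensions segment is written as <width>x<height> and the hash is lowercase hex; neither detail is spelled out in this thread.

```typescript
// Hypothetical helper types and functions for illustration only.
interface ParsedImageAssetId {
  hash: string;   // the <sha1hash> segment
  width: number;  // assumed: first number in <dimensions>
  height: number; // assumed: second number in <dimensions>
  format: string; // the <format> segment, e.g. "jpg"
}

// Splits "image-<sha1hash>-<dimensions>-<format>" into its parts.
// Returns null when the ID does not match the assumed pattern.
function parseImageAssetId(id: string): ParsedImageAssetId | null {
  const match = /^image-([a-f0-9]+)-(\d+)x(\d+)-([a-z0-9]+)$/.exec(id);
  if (match === null) {
    return null;
  }
  const [, hash, width, height, format] = match;
  return { hash, width: Number(width), height: Number(height), format };
}

// Two IDs refer to the same content when their hash segments match.
function isSameImageContent(idA: string, idB: string): boolean {
  const a = parseImageAssetId(idA);
  const b = parseImageAssetId(idB);
  return a !== null && b !== null && a.hash === b.hash;
}
```

Because the ID is derived from the content, comparing the _id returned by two uploads is enough to tell whether they resolved to the same asset.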

This deduplication behavior is actually a feature that helps:

  • Save storage space by avoiding duplicate files
  • Maintain referential integrity (all references point to the same asset)
  • Speed up migrations and bulk uploads (you can upload without worrying about creating duplicates)

So yes, you can confidently build your workflows around this behavior. It's a core part of how Sanity's asset pipeline works, designed to eliminate the burden of managing duplicate image versions.

Yes, it’s correct, and you can expect it to be consistent, reliable behaviour. Just confirmed with user A:
"If we already have an image with the same hash it’s not stored and we’ll return the asset id of the existing one"
Thanks, user M! That's very cool! 🙂
user M, how does it know it is the same image? Does it look at the originalFilename?
It uses a cryptographic hash function (SHA-1) to calculate a hash value on upload. If the hash values are the same, the image is considered identical: it is not stored again, and the existing asset is used instead.
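
As a rough illustration of why the filename does not matter, the sketch below hashes byte content with SHA-1 using Node's built-in crypto module. The helper name sha1Hex is made up for this example, and the sketch only shows the general principle of content hashing; it does not claim to reproduce exactly how Sanity derives its asset IDs.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: returns the SHA-1 digest of some bytes as lowercase hex.
function sha1Hex(content: Uint8Array): string {
  return createHash("sha1").update(content).digest("hex");
}

// Stand-ins for file contents; in practice these would be the image bytes.
const firstUpload = Buffer.from("pretend these are image bytes");
const secondUpload = Buffer.from("pretend these are image bytes");
const editedImage = Buffer.from("pretend these are image bytes!");

// Identical bytes give identical hashes, whatever the file is called.
const sameContent = sha1Hex(firstUpload) === sha1Hex(secondUpload);

// Any change to the bytes gives a different hash.
const changedContent = sha1Hex(firstUpload) !== sha1Hex(editedImage);

console.log({ sameContent, changedContent });
```

Only the bytes go into the hash, so renaming a file before uploading it again would not change the result, while re-exporting or editing the image would.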
user M, so does it still replace the image file with the current one being uploaded, or does it just ignore the request altogether and keep using the "old" file if it matches an existing asset?
It will keep using the existing (old) file, as it’s identical, so no need to replace 🙂
Gotcha. So does that mean there's no way to replace the file for an existing asset? It has to be deleted and then another one created?
And if that's the case, can a custom ID for the asset be specified on upload so that it can match the old asset that was deleted to keep references on documents intact?
user M, I was still wondering if you had an answer here. Is there no way to make an existing asset ID use a newly uploaded file?
