Is it okay to save a complex object without following a schema in Sanity.io?

Last updated: Jan 24, 2024
Hey hey! Random question.
Would it be bad to have a complex object that I save into an entry, which is basically not following any schema? Think of it as a blob of data.

I'd like to do this so I can query the data inside of it with GROQ.

Just curious if there's any downside in terms of performance?
AI Update

You can totally store a blob of unstructured data in Sanity documents and query it with GROQ! But there are some performance considerations you should be aware of.

The Good News

Sanity's Content Lake is schemaless at its core – documents are just JSON objects. So you can absolutely store arbitrary object data in a field without defining every nested property in your schema. GROQ can traverse and query this data just fine.
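To make that concrete, here is a minimal sketch of what such a document could look like, along with a GROQ query that reads into the blob using plain dot access. Only `_id` and `_type` are standard Sanity fields here; `myBlob` is borrowed from the example filter later in this answer, and the nested keys (`meta`, `source`, `tags`) plus the `getPath` helper are made up for illustration. The helper is just a local stand-in for dot-path traversal, not part of any Sanity API.

```typescript
// Hypothetical document shape: `myBlob` and everything inside it are
// illustrative names, not something Sanity requires or defines.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

type EntryDoc = {
  _id: string;
  _type: "entry";
  myBlob: { [key: string]: Json };
};

const exampleDoc: EntryDoc = {
  _id: "entry-1",
  _type: "entry",
  myBlob: { someField: 42, meta: { source: "import", tags: ["a", "b"] } },
};

// GROQ that projects a value from inside the blob via dot access.
const exampleQuery = '*[_type == "entry"]{ _id, "source": myBlob.meta.source }';

// Local stand-in for dot-path traversal (assumption: object keys only,
// no array indexing). Returns undefined when a path segment is missing.
function getPath(value: Json | undefined, path: string): Json | undefined {
  let current: Json | undefined = value;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object" || Array.isArray(current)) {
      return undefined;
    }
    current = current[key];
  }
  return current;
}

console.log(exampleQuery);
console.log(getPath(exampleDoc.myBlob, "meta.source"));
```

The point is only that nothing in the blob needs a schema definition for the query to reach it.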

The Performance Gotchas

The main performance concern comes down to how GROQ can optimize your queries. According to the high performance GROQ guide, the query engine uses special index structures to speed up queries, but this only works for certain types of filter expressions.

Optimized filters (fast):

  • Simple attribute comparisons with literals: *[_type == "product"] or *[price > 100]
  • Defined checks: *[defined(myField)]
  • Reference checks: *[references(someId)]

Non-optimized filters (slow on large datasets):

  • Comparing two non-literal fields: *[salePrice < displayPrice]
  • Using functions or concatenation in filters: *[firstName + " " + lastName == "John Doe"]
  • Deeply nested property comparisons inside your blob

So if you're querying deep into your unstructured data blob with complex expressions, the query engine may not be able to narrow the set of candidate documents first, and it has to load every document into memory before filtering. This is fine with hundreds of documents but can get slow with thousands.
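As a rough mental model of why narrowing matters, the toy sketch below counts how many documents a blob predicate has to inspect with and without a cheap lookup on `_type` first. To be clear, this is not how Sanity's query engine is implemented; the `Map`-based "index", the `ToyDoc` type, and all function names are assumptions made up for the illustration.

```typescript
// Toy model only: NOT Sanity's actual query engine.
type ToyDoc = { _type: string; myBlob?: { someField?: number } };

// A tiny "index" from _type to documents, standing in for an optimized lookup.
function buildTypeIndex(docs: ToyDoc[]): Map<string, ToyDoc[]> {
  const index = new Map<string, ToyDoc[]>();
  for (const d of docs) {
    const bucket = index.get(d._type) ?? [];
    bucket.push(d);
    index.set(d._type, bucket);
  }
  return index;
}

// Run a predicate over candidates and count how many documents it inspected.
function scanDocs(candidates: ToyDoc[], predicate: (d: ToyDoc) => boolean) {
  let inspected = 0;
  const matches: ToyDoc[] = [];
  for (const d of candidates) {
    inspected++;
    if (predicate(d)) matches.push(d);
  }
  return { inspected, matches };
}

// 10,000 synthetic documents, one in ten of type "entry".
const syntheticDocs: ToyDoc[] = Array.from({ length: 10000 }, (_, i) => ({
  _type: i % 10 === 0 ? "entry" : "other",
  myBlob: { someField: i % 50 },
}));

const blobFilter = (d: ToyDoc) => (d.myBlob?.someField ?? 0) > 10;

// No narrowing: the whole predicate runs against every document.
const fullScan = scanDocs(syntheticDocs, (d) => d._type === "entry" && blobFilter(d));

// Narrowing first: only the "entry" bucket reaches the blob predicate.
const narrowedScan = scanDocs(buildTypeIndex(syntheticDocs).get("entry") ?? [], blobFilter);

console.log(fullScan.inspected, narrowedScan.inspected);
```

Both scans return the same matches; the difference is how many documents the blob predicate had to touch to get there.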

Practical Tips

  1. Stack optimizable filters first: Even if you need to query inside your blob, reduce the search space with indexed fields first:

    *[_type == "entry" && defined(myBlob) && myBlob.someField > 10]

  2. Keep frequently-queried fields at the root level: If you find yourself constantly filtering by certain properties in your blob, consider pulling them out as top-level document fields. A little denormalization can significantly improve query performance.

  3. Watch document size and attribute count: Documents have a maximum size, and datasets have attribute limits that vary by plan tier. Huge JSON blobs, especially ones with many distinct keys, could hit these limits.

  4. Consider your dataset size: Performance issues often don't appear until you have thousands of documents. A query that's fast with 100 documents might slow down considerably at 10,000+.

  5. Test with realistic data: If this is a critical query path, test it with a dataset size similar to what you expect in production.
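For tip 2, one way to do the denormalization is to copy the hot fields out of the blob right before you write the document. The sketch below is one possible approach under assumptions, not a Sanity API: `promoteFields`, `EntryInput`, and the field names `someField`, `status`, and `notes` are all hypothetical.

```typescript
type EntryInput = {
  _type: string;
  myBlob: Record<string, unknown>;
  [key: string]: unknown;
};

// Copy selected blob keys to the document root before writing.
// The blob is left untouched, so the duplication is deliberate.
// Keys that already exist at the root (e.g. `_type`) are never overwritten.
function promoteFields(entry: EntryInput, keys: string[]): EntryInput {
  const promoted: EntryInput = { ...entry };
  for (const key of keys) {
    if (key in entry.myBlob && !(key in entry)) {
      promoted[key] = entry.myBlob[key];
    }
  }
  return promoted;
}

const rawEntry: EntryInput = {
  _type: "entry",
  myBlob: { someField: 42, status: "active", notes: "free-form" },
};

const preparedEntry = promoteFields(rawEntry, ["someField", "status"]);

// A filter such as *[_type == "entry" && someField > 10] can now
// target the root-level copy instead of reaching into the blob.
console.log(Object.keys(preparedEntry));
```

The trade-off is that the root copy and the blob can drift apart, so this only makes sense if every write goes through the same helper.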

Bottom Line

Storing unstructured data and querying it with GROQ is totally viable! Just be mindful that complex queries deep into that data won't benefit from Sanity's query optimization. If you keep your most-queried fields at the document root level and use simple, optimizable filters, you'll be in great shape.

The backend is schemaless, so you can plop data into your dataset without surfacing it in your Studio’s schema, with no ill effect.
Thanks!
