Can GROQ query JSON strings stored in document fields?

25 replies
Last updated: Aug 19, 2020
Hey team, no way to perform GROQ queries on a field that contains string of json within a doc right? So if we had a field
data
with a value
"[ { "id":"one" }, { "id":"two"}, ... ]
, we wouldn’t be to query that nested json data?
AI Update

You're correct – GROQ cannot parse JSON strings stored as plain text within a field. If you have a field containing a string like "[ { \"id\":\"one\" }, { \"id\":\"two\"} ]", GROQ treats it as just a string value and has no built-in JSON parsing function to query the nested data.

GROQ is designed to work with structured JSON documents that are already parsed and stored in the Content Lake, not with JSON-as-string data. The query language expects your content to be properly structured when it's stored in Sanity.

Your Options

1. Store it as proper structured data (recommended)

Instead of storing JSON as a string, define a proper schema that matches your data structure:

{
  name: 'myDocument',
  type: 'document',
  fields: [
    {
      name: 'data',
      type: 'array',
      of: [{
        type: 'object',
        fields: [
          {name: 'id', type: 'string'}
        ]
      }]
    }
  ]
}

Now you can query it with GROQ: *[_type == "myDocument" && "one" in data[].id]

2. Use a custom JSON field type

If you genuinely need to store arbitrary JSON, you could use a custom input component or plugin that accepts JSON input but stores it as parsed objects (not strings). The GROQ Arcade can help you test queries against JSON data structures.

3. Parse it in your application layer

If you must keep it as a string (maybe it's coming from an external source), fetch the document and parse the JSON string in your application code after retrieval, not in the GROQ query itself.

The key principle: GROQ works with Sanity's structured content model where data is stored as proper JSON objects and arrays, not as stringified JSON. Converting your string field to a properly structured field will give you the full power of GROQ queries.

Show original thread
25 replies
If you’re looking to run one GROQ query to grab a piece of data from that string, I believe you won’t be able to, since GROQ will just view that as a string and not as JSON. You could, in theory return the JSON string and then use GROQ in your application code to grab a field, but if the data is simple enough, that would probably be overkill.
How are you getting the JSON string into your data?
If you’re looking to run one GROQ query to grab a piece of data from that string, I believe you won’t be able to, since GROQ will just view that as a string and not as JSON. You could, in theory return the JSON string and then use GROQ in your application code to grab a field, but if the data is simple enough, that would probably be overkill.
How are you getting the JSON string into your data?
Thanks
user T
. So yea the second solution is what I’m trying to avoid. Basically I have a doc with
_id = apiData
in my schema that collates all of my site’s 3rd party data into a json string within a field for that doc labeled
data
. The goal here is to push all 3rd party data to sanity so I can run groq queries against everything.
For a couple of reasons it would be MUCH more convenient if the third party data and my sanity data were all part of the same dataset when running groq queries as they are tied into one another.
It might take a little more effort on the front, but instead of pushing a JSON string into a string field, you could structure that data into an object and push that object to a corresponding object (or array, i suppose) schema. Depending on what 3rd party data you’re working with you could do some really interesting things with it if structured
if you’ve got a small snippet of the JSON that you’d be comfortable sharing, i’d be curious to see what the overall shape of things is. I feel like housing third-party data is a really interesting use case
Thanks, yea I had discussed this in some DMs with
user A
. I was thinking of maybe setting things up within the schema, I guess a custom object/array is a good idea. The lazy me was hoping to push the data in there regardless of the shape.
Basically I’m trying to replicate how Gatsby collates everything into a single GraphQL schema, except I want it to do it with NextJS+ Sanity+GROQ
I have some other data, but one major piece is an array of objects with the following structure:

{
  Id: 'a004o000006S8QNAA0',
  Name: 'Part 107 Test Prep - In Person - Austin, TX 11/8/2020',
  Date_Offered__c: '2020-11-08',
  SKU__c: 'OP201107AUTX08NOVSundayGround2',
  Location_City__c: 'Austin',
  Location_State__c: 'TX',
  Product2__r: {
    Id: '01t46000001BbYzAAK',
    Name: 'Part 107 Test Prep - In Person',
    shortID: '01t46000001BbYz'
  },
  Facility__r: {
    Name: 'Hampton Inn and Suites - Austin South/Buda',
    Class_City__c: 'TX - Austin',
    BillingStreet: '1201 Cabelas Drive',
    BillingCity: 'Buda',
    BillingPostalCode: '78610',
    BillingState: 'TX'
  },
  Contact__c: 'xxx',
  Class_Status__c: 'Upcoming',
  Sold_Out__c: false,
  Seats_Available__c: 8,
  shortID: 'a004o000006S8QN',
  slug: 'tx-austin',
  instructor: {
    _createdAt: '2019-10-24T13:23:12Z',
    _id: 'o3sJ5YAauiy3u5Qul2kUcT',
    _rev: 'EPjlJ1Z1osBysrKoTHrQFl',
    _type: 'instructor',
    _updatedAt: '2020-06-11T13:35:42Z',
    bio: [Array],
    email: 'xxx',
    experience: '10 years',
    expertise: 'Aerial photography and videography, Video production, Broadcast media ',
    fullName: 'Jerry White',
    instructorHours: '696',
    name: 'Jerry White',
    photo: [Object],
    pilotRating: 'Airline Transit Pilot, Multi-engine, Small Unmanned Aerial Systems',
    sfID: '0034600000gm8kFAAQ',
    slug: [Object],
    state: [Object],
    title: 'Airline Pilot & Business Owner'
  }
},
The
instructor
property actually contains data from Sanity
You can push this to a document. if your schema isn’t set up, you’ll get a warning in Studio, but the data will exist on your document and can be queried (just fyi)
I can definitely understand not wanting to manually set up the schema for that json 😄
I can definitely understand not wanting to manually set up the schema for that json 😄
oooooh good call I forgot that documents don’t necessarily need predefined schemas in order to work on the backend
yup, as long as you don’t mind that warning, which i’d think this would be a singleton document that you’d not edit in Studio anyway…
hmm thanks ALOT I’m going to play around with it to see if it’s too good to be true
Glad to help!
YEEESSSS! I update the data and am successfully able to run this query:

*[ _id == "apiData" ] {
  "classes": @.sfClasses[ Location_State__c == "NY" ] {
    ...
  }
}
All your data in one place 🎉 😄
Thanks again
user T
Yes
And easy to update via webhook
Hmm revisiting this. One catch here is that while the structure is fine if we just need to query each third-party data in isolation, it becomes tricky when trying to include them in subqueries. For example, in my Sanity dataset I have some
inPersonClass
docs, each of which have a Salesforce product ID
sfID
. Now in my
apiData
doc holding all third-party data, I have a
sfClasses
array holding all of my Salesforce classes. I am first trying to query some
inPersonClass
, grab their
sfID
, then run a query on that
sfClasses
array within my
apiData
. Here is what I am trying, but getting stuck because we can’t reference grandparent scope:

{
  "course": *[ _type == "inPersonClass" && slug.current == $slug ][0] {
    _id,
    sfID,
    "api": *[ _id == "apiData" ][0] {
      "classes": sfClasses[ Product2__r.Id match ^.sfID + "*" ] {
        ...
      }
    }
  }
}
I might be missing something here so am reaching out. Would be awesome if we could set variables within projections and then use those variables further down in the query.
Ah I think I got it. Looks to be doable by filtering the original projection:

{
  "course": *[ _type == "inPersonClass" && slug.current == $slug ][0] {
    _id,
    sfID,
	"apiData": *[ _id == "apiData" ][0] { 
      sfClasses
    }
  } | [] {
    ...,
    "apiData": {
    	"sfClasses": ^.apiData.sfClasses[ Product2__r.Id match ^.sfID + "*" ] {
        ...
      }
  	}
  }
}
That makes sense to me! Also, thanks for responding to seb who was also having this issue. This was the suggestion I was going to give them 😄

Sanity – Build the way you think, not the way your CMS thinks

Sanity is the developer-first content operating system that gives you complete control. Schema-as-code, GROQ queries, and real-time APIs mean no more workarounds or waiting for deployments. Free to start, scale as you grow.

Was this answer helpful?