Last updated May 22, 2019

Introduction to Portable Text

Official (made by Sanity team)

By Knut Melvær

Learn how Portable Text works

Warning

This guide contains code examples for an older version of Sanity Studio (v2), which is deprecated.

Learn how to migrate to the new Studio v3 →

HTML is great, but not for storing rich text content

The text you are reading now is most probably rendered with HTML in a web browser. That's how we have dealt with text on the Web since its conception, and it works great for what it does. A lot of work has gone into making the HTML specification better and more semantic in order to make the web a more accessible place for everyone. We feel we need to make this clear from the get-go when introducing an alternative way of handling rich text digitally.

Portable Text is not intended as a contender to HTML. But there are plenty of use cases where storing your content formatted as HTML introduces a lot of friction and headache. Take the many cases where you have to output rich text content from a CMS through React’s dangerouslySetInnerHTML, Vue’s v-html directive, or Svelte’s {@html content}. There you lose most of the nice features that come with these frameworks, and you also have to think about the XSS vulnerabilities that all these frameworks warn against.

These are among the reasons we made the Portable Text specification. It will actually make it more convenient for you to output content as semantic HTML through a broad range of frontend frameworks, as well as on surfaces outside of the web browser, like native apps and voice assistants. We believe you will find it appealing once you learn how it works and what it lets you do. For Sanity specifically, having Portable Text as the way to deal with rich text content makes it easier to build the real-time collaborative editing environment.

What is Portable Text?

Portable Text is a JSON-based rich text specification for modern content editing platforms. In other words, it's not "an alternative to Markdown". Markdown was designed to be easy for humans to read and write, specifically for text that would end up as HTML. Portable Text is designed to be a format for text editing interfaces and for serializing into any human-readable format.

The specification is open source and available on GitHub. If you have peeked at the data structures produced by Sanity’s rich text editor, they may seem a bit daunting and inscrutable at first. So let's break it down to reveal that there's actually a simple structure to Portable Text.

Portable Text stores rich text content as an array of blocks and custom block types. Think of a "block" as a paragraph. Rich text blocks have a children array, which consists of spans of text, or custom span types called "inline objects".

Think of when you use a highlighter to mark important parts of a text. In Portable Text, spans can have such marks. A mark doesn't always need to mean "important", though. Spans can be decorated with any simple string, for example "this span of text should be emphasized", which can be translated to the <em> tag in HTML. Spans can also be annotated with a reference to a mark definition, for instance, "this span of text should reference a link object in this block". Annotations are powerful because they let you add data structures to rich text, which will be queryable if you use Sanity as your backend.

In addition, Portable Text has some specified attributes to express common features like list-levels, style, and more.
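
To make the difference between decorators and annotations concrete, here is a minimal sketch in JavaScript. The resolveMarks helper and the "link1" key are hypothetical names made up for this illustration and are not part of the specification. The sketch only assumes that a mark is an annotation when its value matches the _key of an entry in the block's markDefs, and a decorator otherwise.

```javascript
// One block with a span that is both decorated ("strong") and annotated (a link).
// The key "link1" is an invented value for this example.
const block = {
  _type: 'block',
  style: 'normal',
  children: [
    { _type: 'span', text: 'Say hi to ', marks: [] },
    { _type: 'span', text: 'Portable Text', marks: ['strong', 'link1'] }
  ],
  markDefs: [{ _type: 'link', _key: 'link1', href: 'https://www.portabletext.org' }]
}

// Hypothetical helper: split a span's marks into decorators (plain strings)
// and annotations (marks that point at a mark definition through its _key).
function resolveMarks(span, markDefs = []) {
  const decorators = []
  const annotations = []
  for (const mark of span.marks || []) {
    const definition = markDefs.find(def => def._key === mark)
    if (definition) {
      annotations.push(definition)
    } else {
      decorators.push(mark)
    }
  }
  return { decorators, annotations }
}

const resolved = resolveMarks(block.children[1], block.markDefs)
console.log(resolved.decorators)          // [ 'strong' ]
console.log(resolved.annotations[0].href) // https://www.portabletext.org
```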

Approaching text as data

Before we get down to the brass tacks of Portable Text, let's take a minute to consider the implications of dealing with rich text in this way.

Your content will now be highly structured and deeply typed. This will make it more sustainable and easier to take with you as your projects evolve and develop. It will be especially helpful whenever you plan to take your content to a new presentation layer, or do a redesign. This hinges, however, on whether you structure your text according to what it means rather than how it should be presented.

This is the same line of thought as you would have with semantic HTML. You should avoid embedding marks like green or largeText, because these are matters of presentation. Rather, ask the questions “why should the text be green”, or “what does largeText mean”. If the text should be green because there's something you want to call out, make a decorator called "highlight", or "important". If the text should be large because it is supposed to be a header, create a style called "heading".

There will be situations where the line between meaning and presentation isn't clear-cut. Take text alignment. In many cases this can be a concern of the stylesheet wherever the content is rendered. But there are cases where you would want control over text alignment to be able to reproduce certain representations of text, like poems. In this case we would suggest either adding a custom type, or creating a separate style that captures that use case more specifically.
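
As a rough sketch of what this could look like in a schema, the field below offers editors semantic options instead of presentational ones. The values "highlight", "heading", and "verse" are hypothetical names chosen for this example. The styles and marks.decorators options are the ones the block type uses to configure the editor, and in a studio you would export the object from its own file rather than keep it in a constant.

```javascript
// A rich text field whose options describe meaning, not looks.
// "highlight", "heading", and "verse" are invented values for this example.
const body = {
  name: 'body',
  type: 'array',
  title: 'Body',
  of: [
    {
      type: 'block',
      // Styles apply to a whole block
      styles: [
        { title: 'Normal', value: 'normal' },
        { title: 'Heading', value: 'heading' },
        { title: 'Verse', value: 'verse' }
      ],
      marks: {
        // Decorators apply to spans of text inside a block
        decorators: [
          { title: 'Emphasis', value: 'em' },
          { title: 'Highlight', value: 'highlight' }
        ]
      }
    }
  ]
}
```

Whether "highlight" ends up green, or "verse" ends up with preserved line breaks and centered text, is then decided by each frontend when it renders the content.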

Some simple examples

Let's take a closer look at some actual Portable Text content:

[{
  "_type": "block",
  "_key": "da5f884c9804",
  "style": "normal",
  "children": [{
      "_type": "span",
      "_key": "da5f884c98040",
      "text": "Say hi to ",
      "marks": []
    },
    {
      "_type": "span",
      "_key": "da5f884c98041",
      "text": "Portable Text",
      "marks": [
        "strong",
        "<markDefId>"
      ]
    },
    {
      "_type": "span",
      "_key": "da5f884c98042",
      "text": ".",
      "marks": []
    }
  ],
  "markDefs": [{
    "_type": "link",
    "_key": "<markDefId>",
    "href": "https://www.portabletext.org"
  }]
}]

Notice that this example includes a _key. This is included to make it easier to use Portable Text in real-time interfaces. However, we will omit it in the following code examples for readability.

Here we have one block that we can translate into HTML like this:

<p>
  Say hi to <a href="https://www.portabletext.org"><strong>Portable Text</strong></a>.
</p>
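
That translation can be sketched by hand in a few lines of JavaScript. This is only an illustration under simplifying assumptions: every block becomes a paragraph, only the "strong" and "em" decorators and link annotations are handled, and the names escapeHtml, spanToHtml, blockToHtml, and the "link1" key are made up for the example. The serialization libraries cover far more cases.

```javascript
// Escape text before placing it in markup, so content can never inject HTML.
const escapeHtml = value =>
  String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')

// Wrap the span text in one element per mark. The first mark ends up innermost.
function spanToHtml(span, markDefs) {
  return (span.marks || []).reduce((html, mark) => {
    const definition = markDefs.find(def => def._key === mark)
    if (definition && definition._type === 'link') {
      return `<a href="${escapeHtml(definition.href)}">${html}</a>`
    }
    if (mark === 'strong' || mark === 'em') {
      return `<${mark}>${html}</${mark}>`
    }
    return html // marks this sketch does not know about are ignored
  }, escapeHtml(span.text))
}

// Simplification: every block is rendered as a paragraph, whatever its style.
function blockToHtml(block) {
  const markDefs = block.markDefs || []
  return `<p>${block.children.map(span => spanToHtml(span, markDefs)).join('')}</p>`
}

const block = {
  _type: 'block',
  style: 'normal',
  children: [
    { _type: 'span', text: 'Say hi to ', marks: [] },
    { _type: 'span', text: 'Portable Text', marks: ['strong', 'link1'] },
    { _type: 'span', text: '.', marks: [] }
  ],
  markDefs: [{ _type: 'link', _key: 'link1', href: 'https://www.portabletext.org' }]
}

console.log(blockToHtml(block))
```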

Obviously, in this case the HTML looks much simpler, but let's say you wanted to add a map or some geolocation to your content. In HTML you would probably grab an embed from Google Maps or similar:

<p>You can visit us in our opening hours at this location:</p>
<iframe src="https://www.google.com/maps/embed?pb=!1m18!1m12!1m3!1d1999.4909029336122!2d10.756783516124631!3d59.92399607024174!2m3!1f0!2f0!3f0!3m2!1i1024!2i768!4f13.1!3m3!1m2!1s0x46416e6896bd529d%3A0x54376e7b89f2db2a!2sThorvald+Meyers+gate+49%2C+0555+Oslo!5e0!3m2!1sno!2sno!4v1558423749014!5m2!1sno!2sno" width="600" height="450" frameborder="0" style="border:0" allowfullscreen></iframe>

This works as long as Google doesn't deprecate this way of embedding maps, and you're fine with having your content locked to this specific service and to just the web browser. If you want to reuse this text with the location in another form of presentation, for example an app, you could tease out the address from the src attribute by preparsing the HTML, but you would probably rather spend that time and effort on what you are actually supposed to be doing.

The idea behind Portable Text is that even your rich text content should be typed and structured:

{
  "body": [
    {
      "_type": "block",
      "style": "normal",
      "children": [
        {
          "_type": "span",
          "marks": [],
          "text":  "You can visit us in our opening hours at this location:"
        }
      ],
      "markDefs": []
    },
    {
      "_type": "location",
      "coordinates": {
        "_type": "geopoint",
        "lat": 59.924010,
        "long": 10.758880
      }
    }
  ]
}

Not only is this approach more future-proof, it also doesn't lock you to a certain embed or mode of presentation. The Portable Text snippet is ready to be used in an app or a voice interface. With Sanity, you can also query your content for all geolocations using GROQ:

// select all documents (*), pick the `body` field (.body) and loop over all objects in it ([]),
// keep only the objects where the _type field value is "location" ([_type == "location"]),
// and pick the `coordinates` field:

*[].body[][_type == "location"].coordinates

// given the document above, this would output

[
  {
      "_type": "geopoint",
      "lat": 59.924010,
      "long": 10.758880
  }
]

The location block could also be extended to include address information, or any other relevant data in addition to the coordinates.
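
If you already have the document in your application, the same selection can be sketched in plain JavaScript. This assumes a document shaped like the example above, with the custom block type named "location". The doc and coordinates variables are names made up for this illustration.

```javascript
// A document shaped like the example above
const doc = {
  body: [
    {
      _type: 'block',
      style: 'normal',
      children: [
        { _type: 'span', marks: [], text: 'You can visit us in our opening hours at this location:' }
      ],
      markDefs: []
    },
    {
      _type: 'location',
      coordinates: { _type: 'geopoint', lat: 59.92401, long: 10.75888 }
    }
  ]
}

// Keep only the "location" objects in the body and pick their coordinates
const coordinates = doc.body
  .filter(node => node._type === 'location')
  .map(node => node.coordinates)

console.log(coordinates)
```

Because the location is data rather than an embed, no parsing of markup is needed to get at it.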

One thing to note is that Portable Text allows for nested structures. In other words, maybe you want to make it possible to include an image block, which has a rich text caption field:

[
  {
    "_type": "block",
    "style": "normal",
    "children": [{
      "_type": "span",
      "marks": [],
      "text": "This is one of her most famous portraits:"
    }],
    "markDefs": []
  },
  {
    "_type": "bodyImage",
    "asset": {
      "_type": "reference",
      "_ref": "<aRandomString>"
    },
    "description": [{
      "_type": "block",
      "style": "normal",
      "children": [{
          "_type": "span",
          "marks": [],
          "text": "Photo by: "
        },
        {
          "_type": "span",
          "marks": ["emphasis"],
          "text": "Annie Leibovitz"
        }
      ],
      "markDefs": []
    }]
  }
]
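
Nested structures like this are natural to process with recursion. The sketch below collects the text of every rich text block, including blocks nested inside custom types. The collectText helper is a hypothetical name, and the sketch assumes that nested rich text always lives in an array-valued field directly on the custom object.

```javascript
// Hypothetical helper: return one string per rich text block, however deeply nested.
function collectText(nodes) {
  const texts = []
  for (const node of nodes) {
    if (!node || typeof node !== 'object') continue
    if (node._type === 'block' && Array.isArray(node.children)) {
      // A rich text block: join the text of its spans
      texts.push(node.children.map(child => child.text || '').join(''))
    } else {
      // A custom type: look for array fields that may hold nested Portable Text
      for (const value of Object.values(node)) {
        if (Array.isArray(value)) texts.push(...collectText(value))
      }
    }
  }
  return texts
}

const content = [
  {
    _type: 'block',
    style: 'normal',
    children: [{ _type: 'span', marks: [], text: 'This is one of her most famous portraits:' }],
    markDefs: []
  },
  {
    _type: 'bodyImage',
    asset: { _type: 'reference', _ref: '<aRandomString>' },
    description: [
      {
        _type: 'block',
        style: 'normal',
        children: [
          { _type: 'span', marks: [], text: 'Photo by: ' },
          { _type: 'span', marks: ['emphasis'], text: 'Annie Leibovitz' }
        ],
        markDefs: []
      }
    ]
  }
]

console.log(collectText(content))
```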

How to get started with Portable Text with Sanity Studio

To get the default rich text input field for Portable Text in Sanity Studio, you need to define the following schema:

export default {
  name: 'body',
  type: 'array',
  title: 'Body',
  of: [
    {
      type: 'block'
    }
  ]
}

As previously mentioned, rich text is an array that contains blocks (and potentially other types). If you remove the block type from the array, the field will turn into a regular array input.

Let's say we want a custom image field with a rich text caption field in our rich text. If we want to be compatible with GraphQL, we have to follow the strict schema convention and “hoist” all object fields and refer to them by name. In other words, we have to define the rich text field for the caption, as well as the image field that we want to use for the body field, as separate named types, and import these into the types array passed to createSchema, usually found in the schema.js file in your studio.

The caption text can be defined like this:

// bodyImageCaption.js
export default {
  name: 'bodyImageCaption',
  type: 'array',
  of: [
    { type: 'block' }
  ]
}

And the image field like this:

// bodyImage.js
export default {
  name: 'bodyImage',
  type: 'image',
  title: 'Image',
  fields: [
    {
      name: 'caption',
      type: 'bodyImageCaption',
    },
    {
      name: 'alt',
      type: 'string'
    }
  ]
}

Notice that we refer to the name from bodyImageCaption.js as the type in the fields array for bodyImage.
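
With both types defined, the remaining step is to allow the image in the rich text field itself. A minimal sketch, assuming bodyImage and bodyImageCaption have been added to the studio's schema types, is to list bodyImage next to block in the of array. The object is held in a constant here so the snippet stands on its own. In a studio you would export it from its own file like the schemas above.

```javascript
// body.js (sketch): rich text that can contain paragraphs and body images
const body = {
  name: 'body',
  type: 'array',
  title: 'Body',
  of: [
    { type: 'block' },    // regular rich text blocks
    { type: 'bodyImage' } // the hoisted image type, referred to by name
  ]
}
```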

How do you render Portable Text?

Rendering Portable Text is really about looping over the blocks in the array and using the data to create the elements and attributes you need for whatever framework you are using. We have made libraries that take care of most of the hard parts and let you specify how your custom types should be rendered. Before taking a closer look at this, let's see how we can render the default array-of-blocks structure to plain text, just to get a sense of how it's done:

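function blocksToText(blocks) {
  return blocks
    // join the text of every span in a block into one string
    .map(block => block.children.map(child => child.text).join(''))
    // separate the blocks with a blank line so the result is a single string
    .join('\n\n')
}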

This script would break when you introduce custom block types that don't have children with text in them (Sanity's documentation has a more robust version that you should use if you need to convert to plain text), but it shows that serialization is mainly about looping through these structures.

The whole point of having your rich text content in Portable Text is to be able to have full control over presentation. This is done through what we call serialization, in other words, specifying what should happen to the data structures that appear in the objects you loop through. For example, how should the frontend render the data you have put in your custom image with caption?

There are libraries for HTML, Markdown, React, Vue, and Hyperscript that do most of the heavy lifting when it comes to serializing Portable Text. They deal with the default features of the rich text editor (lists, headings, emphasis, etc.) and require custom serializers for any custom types you have added.
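
A more defensive variant might look like the sketch below. The function name blocksToPlainText is made up for this example. The sketch simply skips anything that is not a rich text block with children, and treats inline objects without a text property as empty strings.

```javascript
// Convert an array of Portable Text nodes to plain text,
// ignoring custom block types instead of crashing on them.
function blocksToPlainText(blocks = []) {
  return blocks
    .map(block => {
      // Custom types (images, locations, ...) have no spans to read text from
      if (block._type !== 'block' || !Array.isArray(block.children)) {
        return ''
      }
      // Inline objects may lack a text property, so fall back to an empty string
      return block.children.map(child => child.text || '').join('')
    })
    .filter(text => text.length > 0)
    .join('\n\n')
}

const text = blocksToPlainText([
  { _type: 'block', children: [{ _type: 'span', text: 'Hello ' }, { _type: 'span', text: 'world' }] },
  { _type: 'location', coordinates: { _type: 'geopoint', lat: 59.92401, long: 10.75888 } },
  { _type: 'block', children: [{ _type: 'span', text: 'Bye' }] }
])

console.log(text)
```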
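
To illustrate the idea of custom serializers without tying it to any particular library, here is a hand-rolled sketch. It is not the API of the libraries mentioned above. The render function, the defaultSerializers object, and the choice to render a location as a geo: link are all assumptions made for this example.

```javascript
// Escape values before placing them in markup
const escapeHtml = value =>
  String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')

// What the "library" knows how to render out of the box (marks are ignored here)
const defaultSerializers = {
  block: node => `<p>${node.children.map(child => escapeHtml(child.text)).join('')}</p>`
}

// Look up a serializer by _type for every node; custom ones extend the defaults
function render(nodes, customSerializers = {}) {
  const serializers = { ...defaultSerializers, ...customSerializers }
  return nodes
    .map(node => {
      const serialize = serializers[node._type]
      return serialize ? serialize(node) : '' // skip types without a serializer
    })
    .join('\n')
}

const body = [
  {
    _type: 'block',
    style: 'normal',
    children: [{ _type: 'span', marks: [], text: 'You can visit us in our opening hours at this location:' }],
    markDefs: []
  },
  { _type: 'location', coordinates: { _type: 'geopoint', lat: 59.92401, long: 10.75888 } }
]

// The custom serializer decides how a location is presented on this frontend
const html = render(body, {
  location: node =>
    `<a href="geo:${escapeHtml(node.coordinates.lat)},${escapeHtml(node.coordinates.long)}">Find us on the map</a>`
})

console.log(html)
```

A native app or a voice interface could pass a different location serializer to the same content, which is the control over presentation described above.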
