Discussion about special characters in Portable Text to HTML conversion

15 replies
Last updated: Oct 5, 2022
Does the portable text to html accept special characters? We need to be able to put characters like Ă (we currently have the old blocks to html, but will be upgrading soon)
AI Update

Yes, @portabletext/to-html (and the older @sanity/block-content-to-html) fully support special characters like Ă, á, ñ, and other Unicode characters. The issue you're experiencing isn't with Portable Text itself, but rather with how your HTML is being rendered.

Based on a community discussion about this exact issue, the problem is typically a character encoding mismatch between how Sanity stores the content (UTF-8) and how your browser interprets it.

The Solution

The most common fix is to ensure your HTML page is explicitly set to use UTF-8 encoding. Add this meta tag to your HTML's <head> section:

<meta charset="UTF-8" />

If you're seeing characters like "Ã¡" instead of "á", or "Ă" displaying incorrectly, this is a classic sign that UTF-8 encoded content is being interpreted as Windows-1252 or another single-byte encoding.
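As a rough illustration of where the tag goes, here is a minimal TypeScript sketch that wraps an already-rendered HTML fragment in a full page. The helper name `wrapInDocument` and the page skeleton are assumptions made for this example; they are not part of @portabletext/to-html or of the original poster's setup.

```typescript
// Hypothetical helper (not part of any Sanity or Portable Text package):
// wraps an HTML fragment, e.g. the string produced by a Portable Text
// serializer, in a complete document that declares its encoding.
function wrapInDocument(bodyHtml: string): string {
  return [
    "<!DOCTYPE html>",
    '<html lang="en">',
    "<head>",
    // Declare the encoding early in <head>, before any text content,
    // so the browser does not have to guess.
    '  <meta charset="utf-8">',
    "  <title>Preview</title>",
    "</head>",
    "<body>",
    bodyHtml,
    "</body>",
    "</html>",
  ].join("\n");
}

// Example fragment containing the characters from the question.
const page = wrapInDocument("<p>\u0102, \u00E1, \u00F1</p>"); // Ă, á, ñ
console.log(page);
```

The tag only helps if the file or response really is UTF-8 encoded; it tells the browser how to read the bytes, it does not convert them.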

Why This Works

Sanity's API serves all content with UTF-8 encoding (Content-Type: application/json;charset=utf-8). UTF-8 can encode every character in Unicode, including all the special characters you need. When your HTML doesn't declare its encoding, browsers may guess incorrectly, causing the garbled text you're seeing.

Additional Options

If adding the meta tag doesn't work (though it should), you can also configure your web server to send the proper HTTP header:

Content-Type: text/html;charset=utf-8

The exact method depends on your hosting setup, but the meta tag solution typically works for most cases.
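For setups where the page is produced by code rather than served as a static file, the header can be set where the response is built. The sketch below assumes a runtime that provides the standard Fetch API `Response` class (for example Node 18 or later); the function name `htmlResponse` is hypothetical and chosen only for this example.

```typescript
// Hypothetical helper: builds an HTML response that states its encoding
// in the Content-Type header, so the browser does not need to guess.
function htmlResponse(html: string): Response {
  return new Response(html, {
    status: 200,
    headers: { "Content-Type": "text/html; charset=utf-8" },
  });
}

const res = htmlResponse("<p>\u0102</p>"); // <p>Ă</p>
console.log(res.headers.get("Content-Type"));
```

On a traditional web server the same header is usually set in the server configuration instead, and the details vary by host.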

This issue affects both the old @sanity/block-content-to-html package and the newer @portabletext/to-html, so the fix will work regardless of which version you're using. The encoding issue is on the rendering side, not with Portable Text's handling of the characters themselves - Portable Text stores and processes all Unicode characters correctly by default.

I believe it does! Is that not the behavior you're getting?
It works fine with emojis, Japanese characters, Cyrillic characters, etc. So it should be fine, there is no reason these characters won’t work. They’re not more special than any other. :)
I am able to get some special characters to work with our current blocks to html, but not all. Here is a screenshot from a coworker of what she is trying to insert vs what she gets
And here is what I type in vs what I get
Is it possible that those characters don't exist in the font you're using?
This looks like an encoding issue, not a font issue. For example, á (a with acute) is represented in UTF-8 encoding by the two bytes 0xC3 0xA1. Those same two bytes in Windows-1252 encoding represent Ã (A with tilde) followed by ¡ (inverted exclamation mark).
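The byte-level explanation above can be reproduced with a short TypeScript sketch. The function names are made up for this example. The single-byte decoder here is ISO-8859-1 (Latin-1) rather than a true Windows-1252 decoder; the two encodings agree for every byte outside 0x80-0x9F, which covers both bytes in this case.

```typescript
// Encode a string into its UTF-8 bytes.
function utf8Bytes(text: string): number[] {
  return Array.from(new TextEncoder().encode(text));
}

// Decode one byte per character, the way a single-byte encoding would.
// This is ISO-8859-1; it matches Windows-1252 except for bytes 0x80-0x9F.
function decodeAsLatin1(bytes: number[]): string {
  return bytes.map((b) => String.fromCharCode(b)).join("");
}

// Decode the same bytes correctly, as UTF-8.
function decodeAsUtf8(bytes: number[]): string {
  return new TextDecoder("utf-8").decode(new Uint8Array(bytes));
}

const bytes = utf8Bytes("\u00E1"); // á (a with acute)
const hex = bytes.map((b) => "0x" + b.toString(16).toUpperCase());
const misread = decodeAsLatin1(bytes); // wrong decoder: Ã followed by ¡
const correct = decodeAsUtf8(bytes);   // right decoder: á

console.log(hex, misread, correct);
```

The stored bytes are identical in both cases; only the decoder differs, which is why declaring the encoding on the page is enough and the content itself does not need to change.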
Is this something I can fix, or is it a Sanity thing or something different altogether?
ooh, maybe if I can change it to UTF-16?
UTF-8 is the most common and standard encoding in the web world, so I would use that if I had the choice. Sanity's API serves the results in that encoding as far as I can see (their Content-Type header says application/json;charset=utf-8 here), so you'd have to convert it if you need something else.
Most likely you can fix it by configuring the web site (wherever the right-hand-side parts of the screenshots come from) to serve the content as UTF-8. Exactly how that is done will depend on how the site is hosted, but the HTTP header should say Content-Type: text/html;charset=utf-8.
Failing that, putting <meta charset="utf-8"> in the actual HTML's head element is an option.
oh ok, I misunderstood and thought some of the characters I wanted weren't in UTF-8. Just looked at the character list and see I was incorrect. I will look at what you suggested
Many thanks for stepping in here, user Q!
Ah, no worries. UTF-8 can encode everything in Unicode, just like UTF-16 🙂
Added <meta charset="UTF-8" /> in the preview where it was missing and it now displays as expected. So happy it was an easy fix. Thank you for the guidance!!!
