How to handle <img /> tags with htmlToBlocks from @sanity/block-tools
Based on the Sanity documentation, htmlToBlocks from @sanity/block-tools (now @portabletext/block-tools) doesn't handle <img /> tags by default. You need to add custom deserialization rules to handle them, and no funky workarounds are needed!
Adding Support for <img> Tags
You can pass custom rules to htmlToBlocks to intercept specific HTML elements. Here's how to handle images:
For Images Inside <figure> Tags
import {htmlToBlocks} from '@portabletext/block-tools'
import {JSDOM} from 'jsdom'

const blocks = htmlToBlocks(html, blockContentSchema, {
  parseHtml: (html) => new JSDOM(html).window.document,
  rules: [
    {
      deserialize(node, next, block) {
        const el = node as HTMLElement
        if (node.nodeName.toLowerCase() === 'figure') {
          const url = el.querySelector('img')?.getAttribute('src')
          if (!url) {
            return undefined
          }
          return block({
            _type: 'externalImage', // Temporary type
            url,
          })
        }
        return undefined
      },
    },
  ],
})

For Standalone <img> Tags
If your images aren't wrapped in <figure> tags, check for img directly:
rules: [
  {
    deserialize(node, next, block) {
      if (node.nodeName.toLowerCase() === 'img') {
        const el = node as HTMLElement
        const url = el.getAttribute('src')
        if (!url) return undefined
        return block({
          _type: 'externalImage',
          url,
          alt: el.getAttribute('alt') || '',
        })
      }
      return undefined
    },
  },
]

Post-Processing: Uploading Images
Since the deserialize method is synchronous, you need to post-process blocks to upload images and create proper Sanity asset references:
// Step 1: Extract URLs with htmlToBlocks
let blocks = htmlToBlocks(html, blockContentSchema, {
  parseHtml: (html) => new JSDOM(html).window.document,
  rules: [/* your rules */],
})

// Step 2: Upload images and create references
const blocksWithUploads = blocks.map((block) => async () => {
  if (block._type !== 'externalImage' || !('url' in block)) {
    return block
  }
  // Download the image first, then upload its bytes to Sanity
  // (upload needs the file data, not a pending fetch promise)
  const response = await fetch(block.url as string)
  const imageAsset = await client.assets.upload(
    'image',
    Buffer.from(await response.arrayBuffer()),
  )
  // Return proper image block with reference
  return {
    _key: block._key,
    _type: 'image',
    asset: {
      _ref: imageAsset._id,
      _type: 'reference',
    },
  }
})
blocks = await Promise.all(blocksWithUploads.map((fn) => fn()))

Complete Example from the Migration Guide
The WordPress to Sanity migration course shows a full implementation with rate limiting and caching:
import {htmlToBlocks} from '@portabletext/block-tools'
import type {SanityClient} from '@sanity/client'
import {JSDOM} from 'jsdom'
import pLimit from 'p-limit'
// blockContentSchema, Post, and sanityUploadFromUrl are defined elsewhere in the migration project

export async function htmlToBlockContent(
  html: string,
  client: SanityClient,
  imageCache: Record<string, string>, // image URL -> uploaded asset ID
): Promise<Post['content']> {
  // Convert HTML to Portable Text
  let blocks = htmlToBlocks(html, blockContentSchema, {
    parseHtml: (html) => new JSDOM(html).window.document,
    rules: [
      {
        deserialize(node, next, block) {
          const el = node as HTMLElement
          if (node.nodeName.toLowerCase() === 'figure') {
            const url = el.querySelector('img')?.getAttribute('src')
            if (!url) return undefined
            return block({
              _type: 'externalImage',
              url,
            })
          }
          return undefined
        },
      },
    ],
  })

  // Upload images with rate limiting (at most 2 uploads in flight)
  const limit = pLimit(2)
  const blocksWithUploads = blocks.map((block) =>
    limit(async () => {
      if (block._type !== 'externalImage' || !('url' in block)) {
        return block
      }
      // Check cache first
      if (imageCache[block.url]) {
        return {
          _key: block._key,
          _type: 'image',
          asset: {_ref: imageCache[block.url], _type: 'reference'},
        }
      }
      // Upload and cache
      const imageDocument = await sanityUploadFromUrl(block.url, client)
      if (imageDocument) {
        imageCache[block.url] = imageDocument._id
        return {
          _key: block._key,
          _type: 'image',
          asset: {_ref: imageDocument._id, _type: 'reference'},
        }
      }
      // Upload failed: keep the temporary block so the URL is not lost
      return block
    }),
  )
  return await Promise.all(blocksWithUploads)
}

Key Takeaways
- Custom rules let you deserialize <img> tags into temporary block types
- Post-processing is required to upload images (the deserialize method is synchronous)
- Store URLs temporarily, then map over blocks to upload and create Sanity asset references
- Consider using rate limiting (like p-limit) and caching for performance
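
To make the "store URLs, then swap in references" step concrete, here is a minimal, dependency-free sketch that keeps the pure parts (collecting URLs and replacing temporary blocks) separate from the upload itself. The names AnyBlock, ImageBlock, collectImageUrls, and replaceExternalImages are hypothetical helpers for illustration and are not part of @portabletext/block-tools; the asset ID in the usage line is made up.

```typescript
// Simplified block shapes for illustration; real Portable Text blocks carry more fields.
type AnyBlock = {_key: string; _type: string; url?: unknown; [field: string]: unknown}

type ImageBlock = {
  _key: string
  _type: 'image'
  asset: {_type: 'reference'; _ref: string}
}

// Collect each distinct image URL once, so a repeated image is only uploaded a single time.
function collectImageUrls(blocks: AnyBlock[]): string[] {
  const urls = new Set<string>()
  for (const block of blocks) {
    if (block._type === 'externalImage' && typeof block.url === 'string') {
      urls.add(block.url)
    }
  }
  return Array.from(urls)
}

// Swap temporary 'externalImage' blocks for image blocks once asset IDs are known.
// Blocks whose URL has no asset ID (for example, a failed upload) are returned unchanged.
function replaceExternalImages(
  blocks: AnyBlock[],
  assetIdsByUrl: Record<string, string>,
): Array<AnyBlock | ImageBlock> {
  return blocks.map((block) => {
    if (block._type !== 'externalImage' || typeof block.url !== 'string') {
      return block
    }
    const assetId = assetIdsByUrl[block.url]
    if (!assetId) {
      return block
    }
    return {
      _key: block._key,
      _type: 'image',
      asset: {_type: 'reference', _ref: assetId},
    }
  })
}

// Usage with a made-up asset ID: the temporary block becomes an image block.
const converted = replaceExternalImages(
  [{_key: 'k1', _type: 'externalImage', url: 'https://example.com/a.jpg'}],
  {'https://example.com/a.jpg': 'image-abc123'},
)
```

In a real migration, you would upload each URL returned by collectImageUrls (rate-limited, as in the complete example above), build the URL-to-asset-ID map from the results, and then call replaceExternalImages.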
Check out the full Converting HTML to Portable Text lesson and the @sanity/block-tools documentation for more details!