How to save id-attributes in blocks when importing HTML with block-tools
19 replies
Last updated: Feb 14, 2022
K
Hello. I'm importing old HTML into block text with block-tools. I need to preserve id attributes in Sanity (the HTML uses in-page links), but I don't understand how to do it properly. I understand that I have to create a mark and a markDef somehow. I have read this example: https://github.com/sanity-io/sanity/tree/next/packages/@sanity/block-tools#rules
Feb 9, 2022, 10:24 AM
K
Here is my code
const result = blockTools.htmlToBlocks(content, schema, {
  parseHtml: html => new JSDOM(html).window.document,
  rules: [
    {
      deserialize(el, next, block) {
        let idVal = '';
        if (el.tagName.toLowerCase() === 'a') {
          return;
        }
        _forEach(el.attributes, (attr) => {
          if (attr.name.toLowerCase() === 'id') {
            idVal = attr.value;
          }
        });
        // ??? What should I do now?
      }
    }
  ]
});
Feb 9, 2022, 10:28 AM
K
Can somebody help me? This is a major blocker for me.
Feb 9, 2022, 2:16 PM
A
Hey Kai. I'm taking a look at this. Would you be able to provide an example of the HTML you want to import?
Feb 9, 2022, 2:51 PM
D
I have an example of creating markdefs with blocktools here: https://gist.github.com/d4rekanguok/8a6c698d16ef6666196ae028c04066bc
The trick seems to be returning a node of type __annotation.
Feb 9, 2022, 4:54 PM
K
user E
It is just very basic HTML with nothing special, but it contains standard in-page links. Example:
...<a href="#toc2">How we handle your personal information</a>...
...<h2 id="toc2">How we handle your personal information</h2>...
The block-tools parser keeps href="#toc2", but id="toc2" is not saved, so I cannot recreate the same page.
Feb 10, 2022, 7:35 AM
K
user G
Thank you for your answer, but block-tools saves href just fine by default. It loses the id attributes, so in-page links do not work.
Feb 10, 2022, 7:37 AM
D
I've just given it a quick shot and it looks like el.id would correctly log the id attribute. If that doesn't work, could you share the relevant excerpt of your HTML so we can give it a try?
Feb 10, 2022, 8:02 AM
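A minimal sketch of the el.id suggestion above, assuming the node passed to deserialize is a standard DOM element such as the ones jsdom produces. The helper name getHtmlId is hypothetical and not part of block-tools; it only replaces the manual loop over el.attributes.

```javascript
// Hypothetical helper (not a block-tools API): read the id of a DOM element.
// It accepts any object exposing the standard `id` property or `getAttribute`
// method, and returns an empty string when no id is present.
function getHtmlId(el) {
  if (!el) return '';
  if (typeof el.id === 'string' && el.id !== '') return el.id;
  if (typeof el.getAttribute === 'function') {
    return el.getAttribute('id') || '';
  }
  return '';
}
```

Inside a rule this could be used as `const idVal = getHtmlId(el);` before deciding whether to return anything.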
K
const htmlToBlocks = (content) => {
  const result = blockTools.htmlToBlocks(content, schema, {
    parseHtml: html => new JSDOM(html).window.document,
    rules: [
      {
        deserialize(el, next, block) {
          let idVal = '';
          if (el.tagName.toLowerCase() !== 'h2') {
            return;
          }
          _forEach(el.attributes, (attr) => {
            if (attr.name.toLowerCase() === 'id') {
              idVal = attr.value;
            }
          });
          if (!idVal) {
            return;
          }
          const result = {
            _type: 'block',
            children: [
              {
                _type: 'span',
                text: el.textContent,
                htmlId: idVal,
              }
            ],
            style: 'h2'
          };
          return block(result);
        }
      }
    ]
  });
  return result;
}
Feb 11, 2022, 6:49 AM
K
Hi again. I found that I can just put the values in the JSON. But because that does not follow any schema, I don't think it is a legitimate way.
Is there a legitimate way to save ids in blocks?
const htmlToBlocks = (content) => {
  return blockTools.htmlToBlocks(content, schema, {
    parseHtml: html => new JSDOM(html).window.document,
    rules: [
      {
        deserialize(el, next, block) {
          let idVal = '';
          if (el.tagName.toLowerCase() !== 'h2') {
            return;
          }
          _forEach(el.attributes, (attr) => {
            if (attr.name.toLowerCase() === 'id') {
              idVal = attr.value;
            }
          });
          if (!idVal) {
            return;
          }
          const result = {
            _type: 'block',
            children: [
              {
                _type: 'span',
                text: el.textContent,
                htmlId: idVal,
              }
            ],
            style: 'h2'
          };
          return block(result);
        }
      }
    ]
  });
}
Feb 11, 2022, 7:16 AM
K
The previous code outputs the following:
{
  "_key": "30f3d9e74d36",
  "_type": "block",
  "children": [
    {
      "_key": "30f3d9e74d360",
      "_type": "span",
      "htmlId": "toc1",
      "marks": [],
      "text": "Using Polar services in short"
    }
  ],
  "markDefs": [],
  "style": "h2"
},
Feb 11, 2022, 7:17 AM
K
Of course, I still have to find a way to parse that kind of JSON somehow.
Feb 11, 2022, 7:19 AM
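As a rough sketch of what parsing that JSON could look like, the function below walks an array of blocks and collects every htmlId stored directly on a span. The function name collectHtmlIds and the shape of its return value are assumptions made for illustration; only the block structure is taken from the output shown in the thread.

```javascript
// Hypothetical helper: collect every custom `htmlId` stored on a span,
// together with the key of the block that owns it.
function collectHtmlIds(blocks) {
  const found = [];
  for (const block of blocks) {
    if (block._type !== 'block' || !Array.isArray(block.children)) continue;
    for (const child of block.children) {
      if (child._type === 'span' && child.htmlId) {
        found.push({ blockKey: block._key, htmlId: child.htmlId, text: child.text });
      }
    }
  }
  return found;
}

// Sample input copied from the output posted in the thread.
const importedBlocks = [
  {
    _key: '30f3d9e74d36',
    _type: 'block',
    children: [
      {
        _key: '30f3d9e74d360',
        _type: 'span',
        htmlId: 'toc1',
        marks: [],
        text: 'Using Polar services in short',
      },
    ],
    markDefs: [],
    style: 'h2',
  },
];

console.log(collectHtmlIds(importedBlocks));
```

Because htmlId is not declared in any schema here, this only works on the raw JSON; the Studio would not know about the field.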
D
If you want to keep it as a regular block, you'd be better off creating an annotation. Instead of returning block, you can return an object of type '__annotation' with your custom markDef:
return {
  _type: '__annotation',
  markDef: {
    _type: 'htmlId',
    _key: randomKey(12),
    htmlId,
  },
  children: next(el.childNodes)
}
You'd have to define this annotation in your block schema:
marks: {
  annotations: [
    {
      name: 'htmlId',
      type: 'object',
      fields: [
        { name: 'htmlId', type: 'string' }
      ]
    }
  ]
}
Feb 11, 2022, 7:25 AM
D
Alternatively, you can create a custom heading block with that property defined:
return block({ _type: 'customHeading', htmlId, /* etc */ })
and define it in your schema:
{ type: 'array', of: [ { type: 'block' }, { type: 'customHeading', /* your custom props */ } ] }
Feb 11, 2022, 7:25 AM
K
Finally I found a working solution:
const htmlToBlocks = (content) => {
  return blockTools.htmlToBlocks(content, schema, {
    parseHtml: html => new JSDOM(html).window.document,
    rules: [
      {
        deserialize(el, next, block) {
          let idVal = '';
          // Only h2 elements are handled by this rule.
          if (el.tagName.toLowerCase() !== 'h2') {
            return;
          }
          _forEach(el.attributes, (attr) => {
            if (attr.name.toLowerCase() === 'id') {
              idVal = attr.value;
            }
          });
          if (!idVal) {
            return;
          }
          // The span references the markDef through this shared key.
          const markKey = blockTools.randomKey(12);
          const result = {
            _type: 'block',
            style: 'h2',
            children: [
              {
                _type: 'span',
                text: el.textContent,
                marks: [
                  markKey
                ]
              }
            ],
            markDefs: [
              {
                _key: markKey,
                _type: "html_id",
                html_id: idVal
              }
            ]
          };
          return block(result);
        }
      }
    ]
  });
}
Feb 14, 2022, 9:54 AM
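To show how the stored annotation could be turned back into an in-page link target, here is a minimal sketch that renders one h2 block as HTML and restores the id attribute. It is plain string building, not a Portable Text library; the function names escapeHtml, findHtmlId and renderHeading and the sample key 'abc123' are hypothetical, and only the block shape with the html_id markDef comes from the solution above.

```javascript
// Escape text for safe inclusion in HTML content and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Find the first html_id annotation referenced by any span of the block.
function findHtmlId(block) {
  const markDefs = block.markDefs || [];
  for (const child of block.children || []) {
    for (const mark of child.marks || []) {
      const def = markDefs.find((d) => d._key === mark && d._type === 'html_id');
      if (def) return def.html_id;
    }
  }
  return '';
}

// Render a heading block as an h2, adding the id attribute when one was stored.
function renderHeading(block) {
  const text = (block.children || []).map((c) => escapeHtml(c.text || '')).join('');
  const id = findHtmlId(block);
  const idAttr = id ? ` id="${escapeHtml(id)}"` : '';
  return `<h2${idAttr}>${text}</h2>`;
}

const example = {
  _type: 'block',
  style: 'h2',
  children: [
    { _type: 'span', text: 'How we handle your personal information', marks: ['abc123'] },
  ],
  markDefs: [{ _key: 'abc123', _type: 'html_id', html_id: 'toc2' }],
};

console.log(renderHeading(example));
// <h2 id="toc2">How we handle your personal information</h2>
```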
K
Schema:
title: "Legal texts",
name: "legal_text",
type: "document",
icon: GiScales,
i18n: true,
fields: [
  {
    title: "Title",
    name: "title",
    type: "string",
  },
  {
    title: "Content",
    name: "content",
    type: "array",
    of: [
      {
        type: "block",
        marks: {
          annotations: [
            {
              name: "link",
              type: "object",
              title: "Link",
              fields: [
                {
                  title: "href",
                  name: "href",
                  type: "string"
                }
              ]
            },
            {
              name: "html_id",
              type: "object",
              title: "html Id",
              icon: BsHash,
              fields: [
                {
                  title: "#",
                  name: "html_id",
                  type: "string"
                }
              ]
            }
          ]
        }
      }
    ]
  },
  {
    title: "Key",
    name: "key",
    type: "string"
  }
]
Feb 14, 2022, 9:55 AM
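With both a link annotation and an html_id annotation in the schema, an imported document can be checked for in-page links that no longer have a target. The sketch below assumes links are stored as markDefs of type link with an href field and ids as markDefs of type html_id, as in the schema above; the function name findBrokenInnerLinks and the sample keys are hypothetical.

```javascript
// Hypothetical check: return the fragment names of in-page links ("#name")
// that have no matching html_id annotation anywhere in the block array.
function findBrokenInnerLinks(blocks) {
  const ids = new Set();
  const targets = [];
  for (const block of blocks) {
    for (const def of block.markDefs || []) {
      if (def._type === 'html_id' && def.html_id) {
        ids.add(def.html_id);
      }
      if (def._type === 'link' && typeof def.href === 'string' && def.href.startsWith('#')) {
        targets.push(def.href.slice(1));
      }
    }
  }
  return targets.filter((target) => !ids.has(target));
}

const sampleBlocks = [
  { _type: 'block', markDefs: [{ _key: 'k1', _type: 'link', href: '#toc2' }] },
  { _type: 'block', markDefs: [{ _key: 'k2', _type: 'link', href: '#toc9' }] },
  { _type: 'block', markDefs: [{ _key: 'k3', _type: 'html_id', html_id: 'toc2' }] },
];

console.log(findBrokenInnerLinks(sampleBlocks));
// [ 'toc9' ]
```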
K
Thank you for your help!
Feb 14, 2022, 9:56 AM
A
That's great! I'm glad you got this working.
user G
thank you for helping!
Feb 14, 2022, 2:52 PM
D
user U
figured it out! Nice work, I didn't think of just returning the whole block with the markDef defined.
Feb 14, 2022, 2:59 PM