How to save id-attributes in blocks when importing HTML with block-tools
19 replies
Last updated: Feb 14, 2022
K
Hello. I'm importing old HTML into block text with block-tools. I need to preserve id attributes in Sanity (because the HTML contains in-page links), but I don't understand how to do it properly. I understand I have to create a mark and a markDef somehow. I have read this example: https://github.com/sanity-io/sanity/tree/next/packages/@sanity/block-tools#rules .
Feb 9, 2022, 10:24 AM
K
Here is my code
const result = blockTools.htmlToBlocks(content, schema, {
  parseHtml: html => new JSDOM(html).window.document,
  rules: [
    {
      deserialize(el, next, block) {
        let idVal = '';
        if (el.tagName.toLowerCase() === 'a') {
          return;
        }
        _forEach(el.attributes, (attr) => {
          if (attr.name.toLowerCase() === 'id') {
            idVal = attr.value;
          }
        });
        // ??? What I should do now?
      }
    }
  ]
});
Feb 9, 2022, 10:28 AM
K
Can somebody help me? This is a major blocker for me.
Feb 9, 2022, 2:16 PM
A
Hey Kai. I'm taking a look at this. Would you be able to provide an example of the HTML you want to import?
Feb 9, 2022, 2:51 PM
D
I have an example of creating markdefs with blocktools here: https://gist.github.com/d4rekanguok/8a6c698d16ef6666196ae028c04066bc
The trick seems to be returning a node of type '__annotation'.
Feb 9, 2022, 4:54 PM
K
user E
It is just very basic HTML with nothing really special, but it contains standard in-page links. Example:
...<a href="#toc2">How we handle your personal information</a>...
...<h2 id="toc2">How we handle your personal information</h2>...
The block-tools parser saves href="#toc2", but id="toc2" is not saved, so I cannot recreate the same page.
Feb 10, 2022, 7:35 AM
K
user G
Thank you for your answer, but block-tools saves href just fine by default. It loses the id attributes, so in-page links do not work.
Feb 10, 2022, 7:37 AM
D
I've just given it a quick shot, and it looks like el.id would correctly log the id attribute. If that doesn't work, could you share the relevant excerpt of your HTML so we can give it a try?
Feb 10, 2022, 8:02 AM
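As a rough sketch of the suggestion above: a DOM element exposes its id as a property, so the loop over el.attributes is not strictly needed. The helper name getHtmlId is hypothetical, and the plain object below only stands in for a JSDOM element.

```javascript
// Hypothetical helper: read an element's id without looping over its attributes.
// Prefers the `id` property and falls back to getAttribute('id').
function getHtmlId(el) {
  if (!el) return '';
  if (typeof el.id === 'string' && el.id !== '') return el.id;
  if (typeof el.getAttribute === 'function') return el.getAttribute('id') || '';
  return '';
}

// Stub standing in for <h2 id="toc2">...</h2> (an assumption for illustration only).
const heading = {
  tagName: 'H2',
  id: 'toc2',
  textContent: 'How we handle your personal information',
};

console.log(getHtmlId(heading)); // toc2
```

Inside a deserialize rule this would replace the _forEach loop, for example: const idVal = getHtmlId(el).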
K
const htmlToBlocks = (content) => {
  const result = blockTools.htmlToBlocks(content, schema, {
    parseHtml: html => new JSDOM(html).window.document,
    rules: [
      {
        deserialize(el, next, block) {
          let idVal = '';
          if (el.tagName.toLowerCase() !== 'h2') {
            return;
          }
          _forEach(el.attributes, (attr) => {
            if (attr.name.toLowerCase() === 'id') {
              idVal = attr.value;
            }
          });
          if (!idVal) {
            return;
          }
          const result = {
            _type: 'block',
            children: [
              {
                _type: 'span',
                text: el.textContent,
                htmlId: idVal,
              }
            ],
            style: 'h2'
          };
          return block(result);
        }
      }
    ]
  });
  return result;
}
Feb 11, 2022, 6:49 AM
K
Hi again. I found that I can just put extra values in the JSON. But because that does not follow any schema, I don't think it is a legitimate way.
Is there a legitimate way to save ids in blocks?
const htmlToBlocks = (content) => {
  return blockTools.htmlToBlocks(content, schema, {
    parseHtml: html => new JSDOM(html).window.document,
    rules: [
      {
        deserialize(el, next, block) {
          let idVal = '';
          if (el.tagName.toLowerCase() !== 'h2') {
            return;
          }
          _forEach(el.attributes, (attr) => {
            if (attr.name.toLowerCase() === 'id') {
              idVal = attr.value;
            }
          });
          if (!idVal) {
            return;
          }
          const result = {
            _type: 'block',
            children: [
              {
                _type: 'span',
                text: el.textContent,
                htmlId: idVal,
              }
            ],
            style: 'h2'
          };
          return block(result);
        }
      }
    ]
  });
}
Feb 11, 2022, 7:16 AM
K
The previous code outputs the following:
{
  "_key": "30f3d9e74d36",
  "_type": "block",
  "children": [
    {
      "_key": "30f3d9e74d360",
      "_type": "span",
      "htmlId": "toc1",
      "marks": [],
      "text": "Using Polar services in short"
    }
  ],
  "markDefs": [],
  "style": "h2"
}
Feb 11, 2022, 7:17 AM
K
Of course, I still have to find a way to parse that kind of JSON somehow.
Feb 11, 2022, 7:19 AM
D
If you want to keep it as a regular block, you'd be better off creating an annotation. Instead of returning block, you can return an object of type '__annotation' with your custom markDef:
return {
  _type: '__annotation',
  markDef: {
    _type: 'htmlId',
    _key: randomKey(12),
    htmlId,
  },
  children: next(el.childNodes)
}
You'd have to define this annotation in your block schema:
marks: {
  annotations: [
    {
      name: 'htmlId',
      type: 'object',
      fields: [
        { name: 'htmlId', type: 'string' }
      ]
    }
  ]
}
Feb 11, 2022, 7:25 AM
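The annotation approach above could be sketched as a complete rule. This has not been checked against block-tools itself: the '__annotation' shape is taken from the message above, randomKey here is a local stand-in for blockTools.randomKey, and the rule name htmlIdRule is hypothetical. Whether a heading keeps its h2 style when the annotated element is the heading itself is not confirmed in this thread.

```javascript
// Local stand-in for a key generator (the thread uses blockTools.randomKey(12)).
const randomKey = (length) =>
  Array.from({ length }, () => Math.floor(Math.random() * 16).toString(16)).join('');

// Hypothetical rule object, meant to be passed in the `rules` array of htmlToBlocks.
const htmlIdRule = {
  deserialize(el, next) {
    // Text nodes and elements without an id fall through to the default rules.
    if (!el.id) {
      return undefined;
    }
    return {
      _type: '__annotation',
      markDef: {
        _type: 'htmlId',
        _key: randomKey(12),
        htmlId: el.id,
      },
      // Let block-tools deserialize the element's children as usual.
      children: next(el.childNodes),
    };
  },
};

// Exercising the rule with stubs instead of a real DOM and a real `next`.
const stubEl = { tagName: 'SPAN', id: 'toc2', childNodes: [] };
const annotation = htmlIdRule.deserialize(stubEl, () => []);
console.log(annotation.markDef.htmlId); // toc2
```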
D
Alternatively, you can create a custom heading block with that property defined:
return block({ _type: 'customHeading', htmlId, /* etc */ })
and define it in your schema:
{ type: 'array', of: [ { type: 'block' }, { type: 'customHeading', /* your custom props */ } ] }
Feb 11, 2022, 7:25 AM
K
Finally I found a working solution:
const htmlToBlocks = (content) => {
  return blockTools.htmlToBlocks(content, schema, {
    parseHtml: html => new JSDOM(html).window.document,
    rules: [
      {
        deserialize(el, next, block) {
          let idVal = '';
          if (el.tagName.toLowerCase() !== 'h2') {
            return;
          }
          _forEach(el.attributes, (attr) => {
            if (attr.name.toLowerCase() === 'id') {
              idVal = attr.value;
            }
          });
          if (!idVal) {
            return;
          }
          const markKey = blockTools.randomKey(12);
          const result = {
            _type: 'block',
            style: 'h2',
            children: [
              {
                _type: 'span',
                text: el.textContent,
                marks: [
                  markKey
                ]
              }
            ],
            markDefs: [
              {
                _key: markKey,
                _type: "html_id",
                html_id: idVal
              }
            ]
          };
          return block(result);
        }
      }
    ]
  });
}
Feb 14, 2022, 9:54 AM
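To recreate the original page, the stored mark has to be turned back into an id attribute when rendering. The following is only a minimal sketch of that step: blockToHtml and escapeHtml are hypothetical helpers, the mark key 'abc123' is made up, and all other marks (links included) are ignored for brevity.

```javascript
// Escape text so it is safe to place inside HTML content or attribute values.
const escapeHtml = (value) =>
  String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');

// Hypothetical renderer for a single block produced by the rule above.
function blockToHtml(block) {
  // Heading styles map to their tag; everything else becomes a paragraph.
  const tag = /^h[1-6]$/.test(block.style) ? block.style : 'p';
  const children = block.children || [];
  // Collect every mark key referenced by the block's spans.
  const usedMarks = new Set(children.flatMap((child) => child.marks || []));
  // Find an html_id markDef that is actually used by one of the spans.
  const idDef = (block.markDefs || []).find(
    (def) => def._type === 'html_id' && usedMarks.has(def._key)
  );
  const idAttr = idDef ? ` id="${escapeHtml(idDef.html_id)}"` : '';
  const text = children.map((child) => escapeHtml(child.text || '')).join('');
  return `<${tag}${idAttr}>${text}</${tag}>`;
}

const block = {
  _type: 'block',
  style: 'h2',
  children: [{ _type: 'span', text: 'Using Polar services in short', marks: ['abc123'] }],
  markDefs: [{ _key: 'abc123', _type: 'html_id', html_id: 'toc1' }],
};

console.log(blockToHtml(block));
// <h2 id="toc1">Using Polar services in short</h2>
```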
K
Schema:
title: "Legal texts",
name: "legal_text",
type: "document",
icon: GiScales,
i18n: true,
fields: [
  {
    title: "Title",
    name: "title",
    type: "string",
  },
  {
    title: "Content",
    name: "content",
    type: "array",
    of: [
      {
        type: "block",
        marks: {
          annotations: [
            {
              name: "link",
              type: "object",
              title: "Link",
              fields: [
                {
                  title: "href",
                  name: "href",
                  type: "string"
                }
              ]
            },
            {
              name: "html_id",
              type: "object",
              title: "html Id",
              icon: BsHash,
              fields: [
                {
                  title: "#",
                  name: "html_id",
                  type: "string"
                }
              ]
            }
          ]
        }
      }
    ]
  },
  {
    title: "Key",
    name: "key",
    type: "string"
  }
]
Feb 14, 2022, 9:55 AM
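With both annotations in the schema, an import can be sanity-checked by looking for in-page links whose target id was never saved. This is a sketch only: findDanglingAnchors is a hypothetical helper, the sample data is invented, and it assumes the link and html_id annotation shapes from the schema above.

```javascript
// Hypothetical check: list "#target" links that have no matching html_id mark.
function findDanglingAnchors(blocks) {
  const ids = new Set();
  const targets = new Set();
  for (const block of blocks) {
    for (const def of block.markDefs || []) {
      if (def._type === 'html_id' && def.html_id) {
        ids.add(def.html_id);
      }
      if (def._type === 'link' && typeof def.href === 'string' && def.href.startsWith('#')) {
        targets.add(def.href.slice(1));
      }
    }
  }
  // Keep only the targets that no block provides an id for.
  return [...targets].filter((target) => !ids.has(target));
}

// Invented sample: two in-page links, but only one heading kept its id.
const blocks = [
  {
    _type: 'block',
    style: 'normal',
    children: [],
    markDefs: [
      { _key: 'l1', _type: 'link', href: '#toc2' },
      { _key: 'l2', _type: 'link', href: '#toc3' },
      { _key: 'l3', _type: 'link', href: 'https://example.com' },
    ],
  },
  {
    _type: 'block',
    style: 'h2',
    children: [],
    markDefs: [{ _key: 'm1', _type: 'html_id', html_id: 'toc2' }],
  },
];

console.log(findDanglingAnchors(blocks)); // [ 'toc3' ]
```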
K
Thank you for your help!
Feb 14, 2022, 9:56 AM
A
That's great! I'm glad you got this working.
user G
Thank you for helping!
Feb 14, 2022, 2:52 PM
D
user U
figured it out! Nice work, I didn't think of just returning the whole block with the markDef defined.
Feb 14, 2022, 2:59 PM